Understanding the Research
Plain-language explanations of the core concepts behind the TNCT study. These are written for a general audience, including policymakers and officials who may not have a research background.
What is a Randomised Controlled Trial?
A randomised controlled trial (RCT) is a study design in which participants are assigned by chance to either receive an intervention or not. Random assignment is what sets an RCT apart from other study designs, and it is what makes causal inference possible.
In most observational studies, researchers compare people or groups who chose to participate in something with those who did not. The problem is that the groups may differ in important ways before the study even begins. People who enrol in an after-school tutoring programme, for example, may already be more motivated than those who do not, so any improvement in their performance cannot confidently be attributed to the programme itself.
An RCT solves this problem by removing choice from the equation. Participants are assigned to groups randomly, which means that on average the groups are statistically comparable at the start. Any differences in outcomes at the end of the study can then be attributed to the intervention with much greater confidence.
TNCT used a cluster RCT, meaning that whole schools, not individual students, were randomised. This is appropriate when an intervention is delivered at the group level and when individual randomisation would risk contamination (students in neighbouring classrooms talking to each other, for example). In a cluster RCT, statistical analysis must account for the fact that students within the same school are more similar to each other than to students in other schools.
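As a rough illustration of cluster randomisation, the Python sketch below assigns whole schools to study arms by chance. The school IDs, the seed, the function name `assign_schools`, and the near-equal split are assumptions made for this example; it is not a description of how TNCT's allocation was actually carried out.

```python
import random

def assign_schools(school_ids, seed):
    """Randomly split whole schools (clusters) into treatment and control arms."""
    rng = random.Random(seed)       # a fixed seed makes the allocation reproducible
    shuffled = list(school_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    # Every student inherits the arm of their school; nobody is assigned individually.
    return {
        "treatment": sorted(shuffled[:half]),
        "control": sorted(shuffled[half:]),
    }

# 41 hypothetical school IDs (illustrative only)
schools = [f"school_{i:02d}" for i in range(1, 42)]
arms = assign_schools(schools, seed=2024)
print(len(arms["treatment"]), len(arms["control"]))  # prints: 20 21
```

Because the unit of assignment is the school, the later analysis also has to treat the school, not the student, as the independent unit.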
RCTs are widely regarded as the most reliable method for establishing whether an intervention works. This is not because they are the only valid form of evidence, but because random assignment is the most credible way to rule out the alternative explanations that undermine observational studies. In development economics and education research, the past two decades have seen a major shift towards RCT-based evidence, in part through the work of J-PAL (the Abdul Latif Jameel Poverty Action Lab), which has co-ordinated hundreds of RCTs across low- and middle-income countries.
What is a Causal Claim?
A causal claim asserts that one thing caused another. This is a stronger and more policy-relevant statement than saying two things are associated or correlated. The difference matters enormously when deciding whether to fund, scale, or recommend a programme.
Two things are correlated when they tend to move together. Cities with more hospitals tend to have higher death rates. That is a correlation. But it would be wrong to conclude that hospitals cause death. The real explanation is that sick people go to hospitals: the correlation is driven by a third variable, not a causal link between the two.
In education research, similar confounders are common. Schools that adopt new teaching methods may be the same schools with more motivated teachers and better-resourced parents. Any improvement in student outcomes could reflect those pre-existing advantages, not the new method. Without carefully controlling for these confounders, correlational evidence cannot distinguish between the two explanations.
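The confounding problem can be made concrete with a small simulation. This is a sketch under invented assumptions: the "motivation" variable, the true effect of 5 points, and every other number are made up for illustration and have nothing to do with TNCT data. When motivated students choose to join a programme, a naive comparison greatly overstates its effect; when a coin flip decides, the comparison recovers roughly the true effect.

```python
import random
from statistics import mean

TRUE_EFFECT = 5.0  # hypothetical true gain from the programme, in score points

def estimated_effect(n, self_selected, seed):
    """Return the treated-minus-untreated difference in mean scores."""
    rng = random.Random(seed)
    treated, untreated = [], []
    for _ in range(n):
        motivation = rng.gauss(0, 1)        # confounder the researcher cannot see
        if self_selected:
            joins = motivation > 0          # motivated students opt in
        else:
            joins = rng.random() < 0.5      # chance alone decides
        score = 50 + 10 * motivation + (TRUE_EFFECT if joins else 0) + rng.gauss(0, 5)
        (treated if joins else untreated).append(score)
    return mean(treated) - mean(untreated)

observational = estimated_effect(20_000, self_selected=True, seed=1)
randomised = estimated_effect(20_000, self_selected=False, seed=1)
print(f"self-selected estimate: {observational:.1f}")
print(f"randomised estimate:    {randomised:.1f}")
```

In the self-selected case the estimate mixes the programme's effect with the pre-existing motivation gap, which is exactly the alternative explanation that random assignment removes.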
A credible causal claim rests on the following elements:

| Element | What it requires |
|---|---|
| A credible counterfactual | What would have happened without the intervention? The control group in an RCT provides a direct answer. |
| Ruling out confounders | Randomisation eliminates systematic differences between groups before the intervention begins. |
| Ruling out chance | Statistical analysis establishes whether the observed difference is larger than would be expected from random variation alone. |
| Ruling out bias in measurement | End-line assessments are ideally administered by people who do not know which group each student belongs to, and evaluated independently of the delivery team. |
| Accounting for attrition | If students who leave the study differ systematically between treatment and control groups, the estimate of the treatment effect may be biased. |
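One simple way to picture "ruling out chance" is a permutation test: shuffle the group labels many times and count how often a difference at least as large as the observed one arises by chance alone. The sketch below uses made-up school-level mean scores and a hypothetical function name; it is one possible approach, not necessarily the analysis used in TNCT.

```python
import random
from statistics import mean

def permutation_p_value(treatment, control, n_permutations=10_000, seed=0):
    """Two-sided permutation test on the difference in mean scores."""
    rng = random.Random(seed)
    observed = mean(treatment) - mean(control)
    pooled = list(treatment) + list(control)
    n_t = len(treatment)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)                  # re-deal the arm labels at random
        diff = mean(pooled[:n_t]) - mean(pooled[n_t:])
        if abs(diff) >= abs(observed):
            extreme += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return (extreme + 1) / (n_permutations + 1)

# Hypothetical school-level mean scores (one number per school, invented)
treatment_means = [58, 61, 55, 63, 60, 57, 62, 59]
control_means = [52, 54, 50, 56, 53, 51, 55, 49]
p_value = permutation_p_value(treatment_means, control_means)
print(f"p = {p_value:.4f}")
```

Shuffling at the school level respects the cluster design, because the school is the unit that was randomised.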
TNCT's end-line assessment was administered by trained volunteers from the Illam Thedi Kalvi programme who were not assigned to schools they were already familiar with. Invigilators worked in pairs and were not part of the curriculum delivery team. Each script was then evaluated independently by two trained CT teachers following a detailed rubric. Where the scores assigned by the two evaluators did not match, a third evaluator adjudicated. These steps were taken to reduce the risk of scorer bias, even though evaluators could see student names on the scripts.
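The double-marking rule described above can be sketched as a small function. How the final score was set after adjudication is not stated in the source, so treating the third evaluator's score as final is an assumption here, as is the function name `final_score`.

```python
from typing import Optional

def final_score(first: float, second: float, adjudicated: Optional[float] = None) -> float:
    """Resolve a double-marked script into a single score."""
    if first == second:
        return first                 # both evaluators agree
    if adjudicated is None:
        raise ValueError("Scores differ: a third evaluator's score is needed")
    # Assumption: the adjudicator's score is taken as final.
    return adjudicated

print(final_score(7, 7))                  # prints: 7
print(final_score(6, 8, adjudicated=7))   # prints: 7
```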
Even a well-conducted RCT establishes causation only within the context of the study. The finding that a CT curriculum improved reasoning scores among grade 8 and 9 students in one urban education block in Tamil Nadu does not automatically mean it will work in a rural block, in a different state, or with a different age group. Causal claims are context-specific, and the question of whether findings generalise (called external validity) is always worth asking.
This is one reason why the TNCT team proposes an expanded pilot across multiple education blocks as the next step, rather than recommending immediate state-wide rollout.
How to Recognise Good Research
Not all research is equally reliable. These are the markers that distinguish credible education research from studies that may be methodologically weak, selectively reported, or otherwise difficult to interpret.
- A clear research question: Good research begins with a well-defined question and, where relevant, a plausible theory of change. Vague or post-hoc research questions are a warning sign. In TNCT, the research question and both hypotheses were specified before baseline data collection began.
- Method appropriate to the question: Different questions require different methods. RCTs are the most reliable design for causal impact questions, but other rigorous approaches exist for questions about implementation, context, cost-effectiveness, or the experiences of participants. A study that uses a method poorly suited to its question produces unreliable answers regardless of how carefully it is executed.
- Pre-registration: A credible study registers its hypotheses, methods, and analysis plan with an independent registry before data collection begins. This prevents researchers from sifting through results after the fact and reporting only the findings that happened to be significant (a practice known as p-hacking). TNCT's hypotheses were specified in the original proposal before the baseline was conducted.
- A meaningful control group: Good research needs a genuine comparison. A control group that received nothing, or an active control that received an alternative intervention, makes it possible to isolate the effect of the specific programme being evaluated. Studies without a comparison group can only tell you that outcomes changed over time, not why.
- Adequate sample size: Small studies can produce misleading results by chance. A well-powered study is large enough to reliably detect an effect of a practically meaningful size. For cluster RCTs, the relevant unit of analysis is the cluster (the school), not the individual student. TNCT enrolled 41 schools across one education block. Where sample sizes are smaller than ideal, bootstrapping methods can be used: these repeatedly resample the available data to build an empirical distribution of the estimate, which can produce more reliable confidence intervals than standard parametric approaches when sample sizes are small.
- Effect size, not only statistical significance: Statistical significance tells you that an observed difference is unlikely to be due to chance. It says nothing about how large or practically important the difference is. A study with a very large sample can detect a statistically significant effect that is too small to matter in practice. Effect sizes, expressed in standardised units or as percentage-point differences, give a more complete picture.
- Honest reporting of limitations: Good research reports what did not work alongside what did. It acknowledges design constraints, attrition, deviations from the original plan, and hypotheses that could not be tested. In TNCT, Hypothesis 2 could not be formally tested because elections advanced the year-end exam schedule. This is reported directly rather than omitted.
- Independence from the programme provider: Research conducted or evaluated entirely by the organisation delivering the programme is more susceptible to motivated reasoning, even without deliberate bias. Independent data collection, evaluation, and analysis provide a stronger basis for claims. The TNCT end-line was administered and evaluated by personnel independent of the curriculum delivery team.
- Transparent funding and conflicts of interest: Knowing who funded a study, and whether the funder has a financial or reputational interest in the outcome, is relevant to interpreting the results. TNCT was funded by an independent individual philanthropist with no stake in the curriculum being commercialised or scaled.
- Replication: A single study, however well conducted, is one data point. Confidence in a finding grows when it has been replicated across different settings, implementers, and populations. TNCT's Phase 1 RCT is designed to provide proof-of-concept evidence. An expanded pilot across multiple blocks (Phase 2) would test whether the findings hold in a wider range of schools before state-wide policy decisions are made.
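Two of the markers above, effect size and bootstrapped confidence intervals, can be sketched in a few lines of Python. The school-level scores are invented, the function names are hypothetical, and the percentile bootstrap shown here is only one of several bootstrap variants; none of this is presented as TNCT's actual analysis.

```python
import random
from statistics import mean, variance

def standardised_difference(treatment, control):
    """Difference in means divided by the pooled standard deviation (Cohen's d)."""
    n_t, n_c = len(treatment), len(control)
    pooled_var = ((n_t - 1) * variance(treatment)
                  + (n_c - 1) * variance(control)) / (n_t + n_c - 2)
    return (mean(treatment) - mean(control)) / pooled_var ** 0.5

def bootstrap_ci(treatment, control, n_resamples=5_000, level=0.95, seed=0):
    """Percentile bootstrap interval for the difference in means."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_resamples):
        # Resample schools with replacement, separately within each arm.
        t = rng.choices(treatment, k=len(treatment))
        c = rng.choices(control, k=len(control))
        diffs.append(mean(t) - mean(c))
    diffs.sort()
    lower = diffs[int((1 - level) / 2 * n_resamples)]
    upper = diffs[int((1 + level) / 2 * n_resamples) - 1]
    return lower, upper

# Hypothetical school-level mean scores (one number per school, invented).
# An effect size computed on school means is not comparable to one computed
# on individual student scores, because school means vary less.
treatment_means = [58, 61, 55, 63, 60, 57, 62, 59]
control_means = [52, 54, 50, 56, 53, 51, 55, 49]

d = standardised_difference(treatment_means, control_means)
ci = bootstrap_ci(treatment_means, control_means)
print(f"standardised difference: {d:.2f}")
print(f"95% bootstrap interval for the raw difference: {ci[0]:.1f} to {ci[1]:.1f}")
```

Resampling whole schools, not individual students, keeps the bootstrap consistent with the cluster design, where the school is the unit of analysis.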
RCTs occupy the top of the evidence hierarchy for establishing causal effects, but they are not the only form of valuable evidence. Qualitative research, implementation studies, and teacher feedback all provide information that RCTs cannot. The TNCT project combines RCT-based impact evaluation with classroom observation, student focus groups, and a pre-baseline proficiency survey precisely because impact data alone does not explain how or why a programme works.