Knowledge Base - Tamil Nadu Critical Thinking Curriculum - India Institute

Understanding the Research

Plain-language explanations of the core concepts behind the TNCT study. These are written for a general audience, including policymakers and officials who may not have a research background.

01 | Study Design

What is a Randomised Controlled Trial?

A randomised controlled trial (RCT) is a study design in which participants are assigned by chance to either receive an intervention or not. Random assignment is what sets an RCT apart from other study designs, and it is what makes causal inference possible.

In most observational studies, researchers compare people or groups who chose to participate in something with those who did not. The problem is that the groups may differ in important ways before the study even begins. People who enrol in an after-school tutoring programme, for example, may already be more motivated than those who do not, so any improvement in their performance cannot confidently be attributed to the programme itself.

An RCT solves this problem by removing choice from the equation. Participants are assigned to groups randomly, which means that on average the groups are statistically comparable at the start. Any differences in outcomes at the end of the study can then be attributed to the intervention with much greater confidence.
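The contrast between self-selection and random assignment can be illustrated with a small simulation. This is only a sketch under invented assumptions, not TNCT data: the "motivation" variable, the enrolment probabilities, and the score formula are all hypothetical, and the programme is given a true effect of zero so that any gap between groups is spurious.

```python
import random
import statistics


def simulate(n=10_000, true_effect=0.0, seed=0):
    """Compare the outcome gap under self-selection and under random assignment."""
    rng = random.Random(seed)
    # Hypothetical pre-existing trait that also drives test scores.
    motivation = [rng.gauss(0, 1) for _ in range(n)]

    # Self-selection: motivated students are far more likely to enrol.
    enrolled = [rng.random() < (0.8 if m > 0 else 0.2) for m in motivation]
    # Random assignment: a coin flip that ignores motivation entirely.
    assigned = [rng.random() < 0.5 for _ in motivation]

    def score(m, treated):
        # Invented score model: motivation matters, the programme adds true_effect.
        return 50 + 10 * m + true_effect * treated + rng.gauss(0, 5)

    def gap(flags):
        scores = [score(m, t) for m, t in zip(motivation, flags)]
        treated = [s for s, t in zip(scores, flags) if t]
        untreated = [s for s, t in zip(scores, flags) if not t]
        return statistics.mean(treated) - statistics.mean(untreated)

    return gap(enrolled), gap(assigned)


observational_gap, randomised_gap = simulate()
print(f"Gap with self-selection:    {observational_gap:.2f}")
print(f"Gap with random assignment: {randomised_gap:.2f}")
```

Under these assumptions the self-selected comparison shows a large gap even though the programme does nothing, while the randomised comparison shows a gap close to zero.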

Key concepts

Randomisation

Assignment of participants to groups by chance, which removes selection bias and makes the groups comparable on average at baseline.

Treatment group

The group that receives the intervention being studied. In TNCT, these were the 19 schools whose students received CT lessons.

Control group

The comparison group that does not receive the intervention. Their outcomes provide the counterfactual: what would have happened without the programme. More complex RCTs include more than one control group to enable additional assessments, for example estimating whether the intervention produced spillover effects on students who did not directly receive it.

Baseline measurement

Data collected before the intervention begins. It establishes where each group stands at the start, allows pre-existing differences to be controlled for, and provides the primary test of whether the treatment and control groups are comparable. In TNCT, baseline data confirmed that students in the two groups were on average comparable in terms of socio-economic indicators.

End-line measurement

Data collected from both groups after the intervention ends. The core analysis compares end-line outcomes between the treatment and control groups: that difference is the estimated treatment effect. Where baseline data exists, it improves precision by controlling for pre-existing differences between groups.

Treatment effect

The difference in outcomes between the treatment and control groups at end-line, after accounting for baseline differences. This is the estimated impact of the intervention.
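As a rough illustration of the definition above, the sketch below computes a raw end-line difference and a baseline-adjusted difference. The scores are invented for illustration and are not TNCT data. The adjustment shown (a pooled within-group slope, as in a simple analysis of covariance) is one common choice and is an assumption here, not necessarily the method used in the TNCT analysis. It also ignores clustering by school.

```python
from statistics import mean


def pooled_within_slope(groups):
    """Slope of end-line score on baseline score, pooled within groups."""
    sxy = sxx = 0.0
    for base, end in groups:
        mean_base, mean_end = mean(base), mean(end)
        sxy += sum((b - mean_base) * (e - mean_end) for b, e in zip(base, end))
        sxx += sum((b - mean_base) ** 2 for b in base)
    return sxy / sxx


def treatment_effect(t_base, t_end, c_base, c_end):
    """Return (raw, baseline-adjusted) difference in end-line means."""
    raw = mean(t_end) - mean(c_end)
    slope = pooled_within_slope([(t_base, t_end), (c_base, c_end)])
    # Remove the part of the gap explained by the baseline difference.
    adjusted = raw - slope * (mean(t_base) - mean(c_base))
    return raw, adjusted


# Hypothetical scores, invented for illustration only.
t_base, t_end = [10, 12, 14, 16], [15, 17, 19, 21]
c_base, c_end = [9, 11, 13, 15], [11, 13, 15, 17]

raw, adjusted = treatment_effect(t_base, t_end, c_base, c_end)
print(raw, adjusted)  # raw gap is 4 points; adjusted gap is 3 points
```

In this toy example the treatment group started one point ahead, so one point of the four-point end-line gap is attributed to the baseline difference rather than to the intervention.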

How a cluster RCT works

Eligible pool: 47 schools, assessed for eligibility
→ Randomisation: 41 schools, assigned by chance to treatment or control
→ Treatment: 19 schools, received CT curriculum, 1 hr/week | Control: 22 schools, continued with regular schooling
→ End-line: both groups assessed, CT reasoning and motivation survey

TNCT used a cluster RCT, meaning that whole schools, not individual students, were randomised. This is appropriate when an intervention is delivered at the group level and when individual randomisation would risk contamination (students in neighbouring classrooms talking to each other, for example). In a cluster RCT, statistical analysis must account for the fact that students within the same school are more similar to each other than to students in other schools.
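The two ideas in this paragraph, assigning whole schools by chance and allowing for similarity within schools, can be sketched as follows. The school identifiers, the seed, the cluster size of 50 students, and the intraclass correlation (ICC) of 0.15 are all hypothetical values chosen for illustration. The design-effect formula is the standard approximation for equal-sized clusters, and the shuffle shown is not a description of how TNCT actually carried out its randomisation.

```python
import random


def randomise_schools(school_ids, n_treatment, seed):
    """Shuffle whole schools and split them into treatment and control."""
    rng = random.Random(seed)  # fixed seed so the assignment can be reproduced
    shuffled = list(school_ids)
    rng.shuffle(shuffled)
    return sorted(shuffled[:n_treatment]), sorted(shuffled[n_treatment:])


def design_effect(avg_cluster_size, icc):
    """Variance inflation from clustering: 1 + (m - 1) * ICC."""
    return 1 + (avg_cluster_size - 1) * icc


# 41 placeholder school IDs, split 19 / 22 as in the diagram above.
schools = [f"school_{i:02d}" for i in range(1, 42)]
treatment, control = randomise_schools(schools, n_treatment=19, seed=2024)

# Hypothetical inputs: 50 students per school, ICC of 0.15.
deff = design_effect(avg_cluster_size=50, icc=0.15)
effective_n = 41 * 50 / deff
print(len(treatment), len(control))
print(f"Design effect: {deff:.2f}, effective sample size: {effective_n:.0f}")
```

Under these invented inputs, 2,050 students carry roughly the statistical information of about 246 independent students, which is why the number of schools matters more than the number of students.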

Why RCTs are considered the gold standard

RCTs are widely regarded as the most reliable method for establishing whether an intervention works. This is not because they are the only valid form of evidence, but because random assignment is the most credible way to rule out the alternative explanations that undermine observational studies. In development economics and education research, the past two decades have seen a major shift towards RCT-based evidence, in part through the work of J-PAL (the Abdul Latif Jameel Poverty Action Lab), which has co-ordinated hundreds of RCTs across low- and middle-income countries.

02 | Inference

What is a Causal Claim?

A causal claim asserts that one thing caused another. This is a stronger and more policy-relevant statement than saying two things are associated or correlated. The difference matters enormously when deciding whether to fund, scale, or recommend a programme.

Correlation is not causation

Two things are correlated when they tend to move together. Cities with more hospitals tend to record more deaths. That is a correlation. But it would be wrong to conclude that hospitals cause death. The real explanation is that larger cities have both more hospitals and more deaths, and that seriously ill people go to where the hospitals are: the correlation is driven by third variables, not a causal link between the two.

In education research, similar confounders are common. Schools that adopt new teaching methods may be the same schools with more motivated teachers and better-resourced parents. Any improvement in student outcomes could reflect those pre-existing advantages, not the new method. Without carefully controlling for these confounders, correlational evidence cannot distinguish between the two explanations.

What makes a causal claim credible

Element	What it requires
A credible counterfactual	What would have happened without the intervention? The control group in an RCT provides a direct answer.
Ruling out confounders	Randomisation eliminates systematic differences between groups before the intervention begins.
Ruling out chance	Statistical analysis establishes whether the observed difference is larger than would be expected from random variation alone.
Ruling out bias in measurement	End-line assessments are ideally administered by people who do not know which group each student belongs to, and evaluated independently of the delivery team.
Accounting for attrition	If students who leave the study differ systematically between treatment and control groups, the estimate of the treatment effect may be biased.
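The "ruling out chance" row can be made concrete with a permutation test, sketched below. It asks how often a gap at least as large as the observed one would appear if the group labels were shuffled at random. The school-level mean scores are invented for illustration, and this is one simple way of testing against chance, not necessarily the test used in the TNCT analysis.

```python
import random
from statistics import mean


def permutation_p_value(treated, control, n_perm=5000, seed=0):
    """Two-sided permutation test for a difference in group means."""
    rng = random.Random(seed)
    observed = mean(treated) - mean(control)
    pooled = list(treated) + list(control)
    k = len(treated)
    extreme = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # reassign labels at random
        diff = mean(pooled[:k]) - mean(pooled[k:])
        if abs(diff) >= abs(observed):
            extreme += 1
    # Add one to numerator and denominator so the p-value is never exactly zero.
    return observed, (extreme + 1) / (n_perm + 1)


# Hypothetical school-level mean scores, invented for illustration only.
treated_schools = [60, 62, 65, 63, 61, 64]
control_schools = [50, 52, 55, 53, 51, 54]

observed, p_value = permutation_p_value(treated_schools, control_schools)
print(f"Observed gap: {observed}, p-value: {p_value:.4f}")
```

Because whole schools were randomised, the labels are shuffled across schools rather than across individual students.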

How TNCT addresses this

TNCT's end-line assessment was administered by trained volunteers from the Illam Thedi Kalvi programme who were not assigned to schools they were already familiar with. Invigilators worked in pairs and were not part of the curriculum delivery team. Each script was then evaluated independently by two trained CT teachers following a detailed rubric. Where the scores assigned by the two evaluators did not match, a third evaluator adjudicated. These steps were taken to reduce the risk of scorer bias, even though evaluators could see student names on the scripts.

The limits of a causal claim

Even a well-conducted RCT establishes causation only within the context of the study. The finding that a CT curriculum improved reasoning scores among grade 8 and 9 students in one urban education block in Tamil Nadu does not automatically mean it will work in a rural block, in a different state, or with a different age group. Causal claims are context-specific, and the question of whether findings generalise (called external validity) is always worth asking.

This is one reason why the TNCT team proposes an expanded pilot across multiple education blocks as the next step, rather than recommending immediate state-wide rollout.

03 | Research Quality

How to Recognise Good Research

Not all research is equally reliable. These are the markers that distinguish credible education research from studies that may be methodologically weak, selectively reported, or otherwise difficult to interpret.

A clear research question

Good research begins with a well-defined question and, where relevant, a plausible theory of change. Vague or post-hoc research questions are a warning sign. In TNCT, the research question and both hypotheses were specified before baseline data collection began.

Method appropriate to the question

Different questions require different methods. RCTs are the most reliable design for causal impact questions, but other rigorous approaches exist for questions about implementation, context, cost-effectiveness, or the experiences of participants. A study that uses a method poorly suited to its question produces unreliable answers regardless of how carefully it is executed.

Pre-registration

A credible study registers its hypotheses, methods, and analysis plan with an independent registry before data collection begins. This prevents researchers from sifting through results after the fact and reporting only the findings that happened to be significant (a practice often called p-hacking or selective reporting). TNCT's hypotheses were specified in the original proposal before the baseline was conducted.
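A short calculation shows why sifting through many results is risky. Assuming the tests are independent and each uses the conventional 5% significance threshold (both simplifying assumptions), the chance of at least one falsely "significant" finding grows quickly with the number of outcomes examined.

```python
def chance_of_false_positive(n_tests, alpha=0.05):
    """Probability of at least one false positive across independent tests."""
    return 1 - (1 - alpha) ** n_tests


for n_tests in (1, 5, 20):
    print(n_tests, round(chance_of_false_positive(n_tests), 3))
```

With 20 unrelated outcomes and no real effect on any of them, the chance of finding at least one "significant" result is roughly 64%.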

A meaningful control group

Good research needs a genuine comparison. A control group that received nothing, or an active control that received an alternative intervention, makes it possible to isolate the effect of the specific programme being evaluated. Studies without a comparison group can only tell you that outcomes changed over time, not why.

Adequate sample size

Small studies can produce misleading results by chance. A well-powered study is large enough to reliably detect an effect of a practically meaningful size. For cluster RCTs, the relevant unit when judging sample size is the cluster (the school), not the individual student. TNCT enrolled 41 schools across one education block. Where sample sizes are less than ideal for whatever reason, bootstrapping methods can be used: these repeatedly resample the available data to build an empirical distribution of the estimate, which can give more reliable confidence intervals than standard parametric approaches when the assumptions behind those approaches are doubtful.
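The bootstrapping idea can be sketched in a few lines. The version below resamples whole schools within each group, which is a common choice for cluster designs but is an assumption here and not a description of the TNCT analysis. The school-level mean scores are invented for illustration, and the interval shown is a simple percentile interval.

```python
import random
from statistics import mean


def cluster_bootstrap_ci(treated_means, control_means,
                         n_boot=5000, alpha=0.05, seed=0):
    """Percentile confidence interval for the gap in school-level means."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_boot):
        # Resample schools with replacement, separately within each group.
        t = rng.choices(treated_means, k=len(treated_means))
        c = rng.choices(control_means, k=len(control_means))
        diffs.append(mean(t) - mean(c))
    diffs.sort()
    lower = diffs[int((alpha / 2) * n_boot)]
    upper = diffs[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Hypothetical school-level mean scores, invented for illustration only.
treated_schools = [60, 62, 65, 63, 61, 64]
control_schools = [50, 52, 55, 53, 51, 54]

lower, upper = cluster_bootstrap_ci(treated_schools, control_schools)
print(f"95% interval for the gap: {lower:.2f} to {upper:.2f}")
```

With very few clusters, even a bootstrap interval should be read with caution, since the resampled data can only reflect the schools that happen to be in the sample.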

Effect size, not only statistical significance

Statistical significance tells you that an observed difference is unlikely to have arisen by chance alone. It says nothing about how large or practically important the difference is. A study with a very large sample can detect a statistically significant effect that is too small to matter in practice. Effect sizes, expressed in standardised units or as percentage-point differences, give a more complete picture.
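One widely used standardised effect size is Cohen's d: the gap between group means divided by the pooled standard deviation. The sketch below uses invented scores, not TNCT results, and ignores clustering for simplicity.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(treated, control):
    """Standardised mean difference using the pooled sample variance."""
    n_t, n_c = len(treated), len(control)
    pooled_var = (
        (n_t - 1) * variance(treated) + (n_c - 1) * variance(control)
    ) / (n_t + n_c - 2)
    return (mean(treated) - mean(control)) / sqrt(pooled_var)


# Hypothetical scores, invented for illustration only.
treated_scores = [12, 14, 16, 18, 20]
control_scores = [10, 12, 14, 16, 18]

d = cohens_d(treated_scores, control_scores)
print(f"Cohen's d: {d:.2f}")  # about 0.63 for these invented scores
```

A value of d expresses the gap in units of standard deviations, which makes results comparable across studies that use different tests or scoring scales.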

Honest reporting of limitations

Good research reports what did not work alongside what did. It acknowledges design constraints, attrition, deviations from the original plan, and hypotheses that could not be tested. In TNCT, Hypothesis 2 could not be formally tested because elections advanced the year-end exam schedule. This is reported directly rather than omitted.

Independence from the programme provider

Research conducted or evaluated entirely by the organisation delivering the programme is more susceptible to motivated reasoning, even without deliberate bias. Independent data collection, evaluation, and analysis provide a stronger basis for claims. The TNCT end-line was administered and evaluated by personnel independent of the curriculum delivery team.

Transparent funding and conflicts of interest

Knowing who funded a study, and whether the funder has a financial or reputational interest in the outcome, is relevant to interpreting the results. TNCT was funded by an independent individual philanthropist with no stake in the curriculum being commercialised or scaled.

Replication

A single study, however well conducted, is one data point. Confidence in a finding grows when it has been replicated across different settings, implementers, and populations. TNCT's Phase 1 RCT is designed to provide proof-of-concept evidence. An expanded pilot across multiple blocks (Phase 2) would test whether the findings hold in a wider range of schools before state-wide policy decisions are made.

A note on evidence hierarchies

RCTs occupy the top of the evidence hierarchy for establishing causal effects, but they are not the only form of valuable evidence. Qualitative research, implementation studies, and teacher feedback all provide information that RCTs cannot. The TNCT project combines RCT-based impact evaluation with classroom observation, student focus groups, and a pre-baseline proficiency survey precisely because impact data alone does not explain how or why a programme works.