There is a lot of bad research out there. Huge fractions of the published research literature do not replicate, and many studies aren’t even worth trying to replicate because they document uninteresting correlations that are not causal. This replication crisis is compounded by a “scaleup crisis”: even when results do replicate, they often do not hold at any appreciable scale. These problems are particularly bad in social science.
What can we do about the poor quality of social science research? There are a lot of top-down proposals. We should have pre-analysis plans and trial registries. We should subject our inferences to multiple-testing adjustments. It is very hard to come up with general rules that will fix these problems, however. Even in a world where every analysis is pre-specified, every hypothesis is adjusted for multiple testing, and every trial is registered and its results reported, people’s attention and time are finite. The exciting result is always going to garner more attention, more citations, and more likes and retweets. This “attention bias” problem is very difficult to fix.
When you are doing randomized program evaluations in developing countries, however, there is a bottom-up solution to this problem: getting the right answer really matters. Suppose you run an RCT that yields a sexy but incorrect result, whether due to deliberate fraud, a coding error, a fluke of sampling error, a pilot that won’t scale, or a finding that holds only in one specific context. Someone is very likely to take your false finding and actually try to implement it. Actual, scarce development resources will be thrown at your “solution”. Funding will go toward the wrong answers instead of the right ones. Finite inputs like labor and energy will be expended on the wrong thing.
And more than in any other domain of social science, doing the wrong thing makes a huge difference. The world’s poorest people live on incomes that are less than 1% of what we enjoy here in America. We could take the budget for any given program and just give it to them in cash, which would at a minimum reduce poverty temporarily; any program we fund instead has to clear that bar. The benefits of helping the global poor, in terms of their actual well-being, are drastically higher than those of helping any group in a rich country. $1000 is a decent chunk of change in America, but it could mean the difference between life and death for a subsistence farmer in sub-Saharan Africa. Thus, when you get an exciting result, you have an obligation to look at your tables and go “really?”
None of this means that development economics research is never wrong, or that nobody working in the field ever skews their results for career reasons. Career incentives can be powerful, even in fields with similar imperatives for honesty: witness the recent exposure of fraudulent Alzheimer’s research, which may have derailed drug development and harmed millions of people. What it does mean is that those career incentives are counterbalanced by a powerful moral imperative to tell the truth.
Truth-telling is important not just about our own work, but (maybe even more so) when we are called upon to summarize knowledge more broadly. Literature reviews in development economics aren’t just academically interesting; they have the potential to reshape where money gets spent and which programs get implemented. What I mean by honesty here is that when we talk to policymakers or journalists or lay people about which development programs work, we shouldn’t let our views be skewed by our own research agendas or trends in the field. For example, I have written several papers about a mother-tongue-first literacy program in Uganda, the NULP. The program works exceedingly well on average, although it is not a panacea for the learning crisis. People often ask me whether mother-tongue instruction is the best use of education funds, and I tell them no: I do not think it was the core driver of the NULP’s success, and studies that isolate changes in the language of instruction support that view. Note the countervailing incentives I face here: more spending on mother-tongue instruction might yield more citations for my work, and the approach is very popular, so I am often telling people what they don’t want to hear. But far outweighing those is the fact that what I say might really matter, and getting it wrong means that kids won’t learn to read. This is a powerful motive to do my best to get the right answer.
Honesty in assessing the overall evidence also mitigates the “attention bias” problem. Exciting results will still get bursts of attention, but when we are called upon to give our view of which programs work best, we can and should focus on the broader picture painted by the evidence. This is especially critical in development economics, where we aren’t just seeking scientific truths but trying to solve some of the world’s most pressing problems.