Ceteris Non Paribus

Ceteris Non Paribus is my personal blog, formerly hosted at nonparibus.wordpress.com and now found here. This blog is a place for me to put the ideas I have, and the stuff I come across, that I’ve managed to convince myself other people would be interested in seeing. See the About page for more on the reasons why I maintain a blog and the origin of the blog’s name.

My most recent posts can be found below, and a list of my most popular posts (based on recent views) is on the right.

You still have to test the results of your evolutionary learning strategy

Ricardo Hausmann argues against RCTs as a way to test development interventions, and advocates for the following approach instead of an RCT that distributes tablets to schools:

Consider the following thought experiment: We include some mechanism in the tablet to inform the teacher in real time about how well his or her pupils are absorbing the material being taught. We free all teachers to experiment with different software, different strategies, and different ways of using the new tool. The rapid feedback loop will make teachers adjust their strategies to maximize performance.

Over time, we will observe some teachers who have stumbled onto highly effective strategies. We then share what they have done with other teachers.

This approach has a big advantage over a randomized trial, because the design can adapt to circumstances that are specific to a given classroom. And, Hausmann asserts, it will yield approaches whose effectiveness is unconfounded by reverse causality or selection bias.

Clearly, teachers will be confusing correlation with causation when adjusting their strategies; but these errors will be revealed soon enough as their wrong assumptions do not yield better results.

Is this true? If so, we don’t need to run any more slow, unwieldy randomized trials. We can just ask participants themselves what works! Unfortunately, the idea that participants will automatically understand how well a program works is false. Using data from the Job Training Partnership Act (JTPA), Smith, Whalley and Wilcox show that JTPA participants’ perceived benefits from the program were unrelated to the actual effects measured in an RCT. Instead, they seem to reflect simple before-after comparisons of outcome variables, which is a common mistake in program evaluation. The problems with before-after comparisons are particularly bad in a school setting, because all student performance indicators naturally trend upward with age and grade level.

An adaptive, evolutionary learning process can generate a high-quality program – but it cannot substitute for rigorously evaluating that program. Responding to Hausmann, Chris Blattman says “In fact, most organizations I know have spent the majority of budgets on programs with no evidence whatsoever.” This is true even of organizations that do high-quality learning using the rapid feedback loops described by Hausmann: many of the ideas generated by those processes don’t have big enough effects to be worth their costs.

That said, organizations doing truly effective interventions do make use of these kinds of rapid feedback loops and nimble, adaptive learning processes. They begin by looking at what we already know from previous research, come up with ideas, get feedback from participants, and do some internal M&E to see how things are going, repeating that process to develop a great program. Then they start doing small-scale evaluations – picking a non-randomized comparison group to rule out simple time trends, for example. If the early results look bad, they go back to the drawing board and change things. If they look good, they move to a bigger sample and a more-rigorous identification strategy.

An example of how this works can be found in Mango Tree Educational Enterprises Uganda, which developed a literacy program over several years of careful testing and piloting. Before moving to an initial RCT, they collected internal data on both their pilot schools and untreated schools nearby, which showed highly encouraging results. The results from the first-stage RCT, available in this paper I wrote with Rebecca Thornton, are very impressive.

My impression is that most organizations wait far too long to even start collecting data on their programs, and when the results don’t look good they are too committed to their approach to really be adaptive. The fundamental issue here is that causal inference is hard. That’s why it took human beings hundreds of thousands of years to discover the existence of germs. Social programs face the same problems of noisy data, omitted variables, strong time trends, and selection bias – and arguably fare even worse on those dimensions. As a result, no matter how convincing a program is, and how excellent its development process, we still need randomized experiments to know how effective it is.


Liquidity constraints can be deadly

My friend and coauthor Lasse Brune sent me FSD Kenya’s report on the Kenya Financial Diaries project. Much of the report, authored by Julie Zollman, confirms that what we already knew to be true for South Africa’s poor (based on the excellent book Portfolios of the Poor) also holds for low-income people in Kenya. I was struck, however, by Zollman’s description of the stark consequences Kenyans face if they are unable to access financial markets:

We observed important delays – in some cases resulting in death – from respondents’ inability to finance emergency health spending. In some cases, the financial barriers were substantial – above KSh 20,000 – but in many cases, important delays in treating things like malaria and other infections stemmed from only very small financial barriers, below KSh 500.

500 shillings is just under five US dollars at market exchange rates. Even in the context of the financial lives of the study’s respondents, this seems to be an entirely manageable sum. For the average respondent 500 shillings is less than a third of monthly per-person income. FSD’s poorest respondents, who earn 221 shillings a month, should still be able to afford this cost over the course of a year; it represents 19% of their total budget. That is an amount they should be able to borrow and pay back, or possibly even to save up. Certainly this would be achievable in the developed world.
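The arithmetic behind that 19% figure can be spelled out as a quick sketch. It assumes the 221 shillings is a monthly per-person income and that the KSh 500 is a one-time cost set against twelve months of that income; this is a reading of the numbers above, not a calculation taken from the report.

```latex
\[
\frac{\text{KSh } 500}{12 \times \text{KSh } 221}
  = \frac{500}{2652}
  \approx 0.19
\]
```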

However, Kenya’s poor are desperately liquidity-constrained:

[E]ven if household income and consumption were steady from month to month, families could still be vulnerable if they could not increase spending to accommodate new and lumpy needs. A quarter of all households experienced actual hunger during the study and 9% of them experienced it during at least three interview intervals. Thirty-eight per cent went without a doctor or medicine when needed and 11% of households experienced this across three or more interview intervals.

Some of my own work (cowritten with Lasse) contributes to an extensive literature that documents the economic costs of liquidity constraints in Africa. These are important, but we should bear in mind that an inability to access credit or savings can also literally be the difference between life and death.

Is it ethical to give money to the poor? Almost certainly

GiveDirectly, the awesome charity that gives cash directly to poor people in developing countries, is back in the news. This time, they are getting attention for a study that uses the random variation in individual income provided by their original experiment to study whether increases in the wealth of others harm people’s subjective well-being. This is an assumption that underlies a surprisingly popular theory that posits that increases in inequality per se have a direct impact on people’s health.* The most popular formulation of that theory says that inequality raises stress levels, which in turn cause poor health. This theory relies on inequality having a direct effect on stress and other measures of subjective well-being. Given how popular the stress-based theory is, there is surprisingly little evidence of this core claim.

Enter Haushofer, Reisinger, and Shapiro. Using the fact that the GiveDirectly RCT didn’t target all households in a given village, and randomized the intensity of the treatment, they can estimate the subjective well-being impacts not only of one’s own wealth but also of one’s neighbors’ wealth. Their headline result is that the effect of a $100 increase in village mean wealth is a 0.1 SD decline in life satisfaction – which is four times as big as the effect of own wealth. However, that effect declines substantially over time due to hedonic adaptation. This is an important contribution in a literature that lacked credible evidence on this topic.

However, not everyone is happy that they even looked at this question. Anke Hoeffler raises a concern about the very existence of this study.

I was just overwhelmed by a sense that this type of research should not be done at all. No matter how clever the identification strategy is. Am I the only one to think that is not ethical dishing out large sums of money in small communities and observing how jealous and unhappy this makes the unlucky members of these tight knit communities? In any case, what do we learn from a study like this? The results indicate that we should not single out households and give them large sums of money. So I hope this puts an end to unethical RCTs like this one.

I think this study was fundamentally an ethical one to do, for two distinct reasons: additionality and uncertainty.

By “additionality”, I am referring to the fact that this study uses existing data from an experiment that was run for a totally different reason: to assess the effects of large cash transfers to the poor on their spending, saving, and income. No one was directly exposed to any harm for the purpose of studying this question. Indeed, these results need to be assessed in light of the massive objective benefits of these transfers. One reading of this paper is that we should be more skeptical of subjective well-being measures, since they seem to be understating the welfare benefits of this intervention.

Second, uncertainty. We had limited ex ante knowledge of what this intervention would even do. Assuming that the impacts on subjective well-being and stress could only be negative takes, I think, an excessively pessimistic view of human nature. Surely, some people are happy for their friends. The true effect is probably an average of some positives and some negatives.

Indeed, the study has results that are surprising if one assumes the stress model must hold. What we learned from this study was much more subtle and detailed than a simple summary of the results can do justice to. First off, the authors emphasize that the effects on psychological well-being, positive or negative, fade away fairly quickly (see graph below). Second, the authors can also look at changes in village-level inequality, holding own wealth and village-mean wealth constant. There the effects are zero – directly contradicting the usual statement of the stress theory. If one’s neighbors are richer, it does not matter if the money is spread evenly among them or concentrated at the top.

Figure 2 from Haushofer, Reisinger, and Shapiro (2015)

Third, the treatment effects are more mixed, and probably smaller, than the elasticity quoted above. There’s no ex ante reason to think life satisfaction is the most relevant outcome, and none of the other measures show statistically-significant effects. Indeed, if we take these fairly small and somewhat noisily-estimated point estimates as our best guesses at the true parameter values, we should also take seriously the effects on cortisol, a biomarker of actual changes in stress. Here, the impact is negative: an increase in village-mean wealth makes people (very slightly) less stressed. This result is even stronger when we focus just on untreated households, instead of including those who got cash transfers. One way of accounting for the variation in these estimates that improves the signal-to-noise ratio is to look at an index that combines several different outcomes. The effects on that index are still negative, but smaller compared with the direct effects of the transfer. If we look just at effects on people who did not get a transfer, that index declines by 0.01 SDs, which is only a third of the magnitude of the effect of one’s own wealth change (a 0.03 SD increase).

My conclusion from the evidence in the paper is twofold. First, there’s nothing here to suggest that the indirect effects of others’ wealth on subjective well-being are large enough to offset the  direct benefits of giving cash to the poor. Second, the evidence for direct effects of others’ wealth on subjective well-being is fairly weak. This paper is far and away the best evidence on the topic that I am aware of, but we can barely reject that the effect is zero, if at all. That casts doubt on the stress-and-inequality theory, and means that more research is certainly needed on the topic. More experiments that give money to the poor would be a great way to do that.

*As Angus Deaton explains in this fantastic JEL article, inequality has an obvious link to average health outcomes through its effect on the incomes of the poor. The stress-based theory posits that an increase in inequality would make poor people sicker even if their income remained constant.

PSA: Cut the whitespace off the edges of your PDFs to import them into TeX

These days in economics, TeX is more or less unavoidable. There’s really no better way to put equations into your documents, and it can do wonderful things with automatically numbering tables and hyperlinking to them from where they are referenced. Even if you don’t like it, your coauthors probably do. So using it is basically a must.

If you’re like me, though, you make your actual formatted tables in Excel, so you can customize them without spending hours writing and debugging code to get them to look right. This creates a problem when you want to put them into your paper: if you want to use TeX’s powerful automatic numbering features, you need to import each table as an image. You can do this using a normal picture format like PNG, but your tables will look pixelated when people zoom in. The best approach is to import your tables as PDFs, which can scale perfectly. But PDFing your Excel tables will print them to entire pages, with tons of white space. TeX will then import all the white space as well, which at best puts your table title oddly far from the content and at worst moves the table title and page number off the page.

To get around this you need to crop your PDFs from whole pages down to just the content. You can do this manually in Adobe, and I used to use a free tool amusingly titled “BRISS”. This is tedious, though, and adds another chance for human error.

Today I figured out how to use a totally automated option for PDF cropping – a Perl script called pdfcrop, which was written by Heiko Oberdiek, who is probably not an economist. Here’s how to use it in Windows. You can download the script here. Extract it and put it in the root directory of your C: drive or someplace else convenient. pdfcrop also requires a Perl environment like Strawberry Perl (download link) and a TeX implementation like MiKTeX (download link) to run. Once you get all the components installed, you will run it through the command prompt. (You can pull up the command prompt by hitting Windows+R, typing “cmd” and hitting Enter.)

The command you want to run looks like this:

"C:\pdfcropscripts\pdfcrop\pdfcrop.pl" "PATH_TO_ORIGINAL" "PATH_TO_CROPPED_VERSION"

If your original tables are in "C:\Dropbox\PaperTitle\Jason's_Awesome_Tables.pdf" then your command might look like

"C:\pdfcropscripts\pdfcrop\pdfcrop.pl" "C:\Dropbox\PaperTitle\Jason's_Awesome_Tables.pdf" "C:\Dropbox\PaperTitle\Jason's_Awesome_Tables_cropped.pdf"

EDIT: This version of the command randomly stopped working for me. Here is one that I got working, that you might try if you hit the same error:

pdfcrop "C:\Dropbox\PaperTitle\Jason's_Awesome_Tables.pdf" "C:\Dropbox\PaperTitle\Jason's_Awesome_Tables_cropped.pdf"

This works even on multi-page PDFs that contain many tables – it will spit out a single PDF file that has all the cropped tables in the same order. You can import the individual pages of the PDF into your TeX document using \includegraphics. Doing this:


will give you just the 4th table, and you can change the page numbers to get the other ones.
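For reference, a call of this kind can be sketched as follows. This is a minimal example, assuming you compile with pdfLaTeX and load the graphicx package; the file name tables_cropped.pdf and the caption text are placeholders, not the names used above.

```latex
\documentclass{article}
\usepackage{graphicx} % provides \includegraphics

\begin{document}

\begin{table}
  \centering
  \caption{Placeholder title}
  % page=4 selects the fourth page of the multi-page cropped PDF
  \includegraphics[page=4]{tables_cropped.pdf}
\end{table}

\end{document}
```

Because the table sits inside a table environment, TeX still handles the numbering and the caption; only the body of the table comes from the PDF page.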

Bonus font tip: if you’re making tables in Excel and want to match the default font used in TeX, it is called CMU Serif. The CMU fonts can be found on SourceForge here. Just download the fonts and drag them into the Fonts window you can find under Control Panel.

EDIT: A friend pointed out that I had misattributed credit for pdfcrop – this is now fixed. It turns out there are two identically-named tools to do this. The one I linked to appears to be the better of the two since it takes off nearly 100% of the margins.

I also got some reports of errors running the command to call pdfcrop, and encountered issues myself. I have inserted an edited command that works as well (or instead, depending on your situation).

EDIT #2: When I imported this post to my new blog it dropped all the backslashes for some reason. I’ve edited the post to fix them and also correct some typos in the example commands.

Randomizing cash transfers: practical equipoise in social science experiments

When is it ethical to randomly determine the provision of a treatment to people in an experiment? Researchers designing such experiments often appeal to the principle of equipoise, which means “that there is genuine uncertainty in the expert medical community over whether a treatment will be beneficial.” This principle captures a straightforward ethical logic: we know that ARVs are effective in treating AIDS, so it would be unethical to design an experiment that denies some AIDS patients access to ARVs. In contrast, until recently we did not know that starting ARVs immediately had health benefits for people with HIV who do not yet have AIDS. The experiment that proved that this was the case was entirely ethical to conduct.

This approach works well for medical trials, because doctors have the joint social roles of studying which treatments are effective and actually providing people with the treatments that work. Social scientists tend to be in a very different situation. First, relative to medical research, we are substantially less sure what the impacts of various interventions will be.** Second (despite what people might think about economists in particular) we just don’t have that much influence. There are many policies and programs that economists know to be effective that have little or no traction in terms of actual policies. An example that I am particularly fond of is the idea that if you want to maximize uptake and use of a product, you should give it away. The notion that people won’t “value” things they get for free is so popular that two economists actually had to go out and randomize prices for insecticide-treated bednets to prove that giving them away would work for fighting malaria. And still, I see people charging money for them.

The bednet study is a great example of what I am dubbing practical equipoise. I personally had no doubt, prior to that study, about which direction the demand curve for bednets sloped. Almost any economist would probably have felt the same way, so “clinical equipoise” did not hold. Yet policymakers felt strongly that user fees for the nets were a great thing. Practical equipoise is a situation where the science is in but policymakers don’t agree with it; while the scale of knowledge is not balanced, we can try to use persuasion and facts to balance the scale of actual practice. In this situation, randomized experiments are an ethical imperative – they benefit people who would otherwise have lost out, and also can help promote universal access to beneficial treatments in the future.

Practical equipoise holds for many policies that social scientists want to study, and particularly for cash transfers. There is a growing body of evidence that just giving money to poor people carries large benefits and few costs, leading prominent economists to advocate that evaluations of social programs should include a cash transfer arm as a basis of comparison. This is something we should probably do in general, and should definitely do as a way to evaluate giving people food handouts. Recent research by IFPRI comparing food vs. cash for helping the hungry has found that cash does much better; while their own discussion of their results has tried to take a moderate tone, the fact is that outcomes are similar for cash at a much lower cost. But people – especially the elites who make decisions about these things – remain broadly unconvinced that it is safe to give cash to poor people who need food. We are ethically obligated as scientists to push for randomized testing of cash transfers as an approach to food crises: we are highly confident they work, and need to do what we can to convince policymakers that we are right.

*Medical research is also focused on treatments that directly impact people’s well-being in an objectively-measurable way, which is relatively uncommon in social science research.

**A prominent example is the effect of moving kids from housing projects to richer neighborhoods, which was found to be surprisingly small and negative for some outcomes in analyses that focused on older children and people who volunteered to move. Recent work examining younger kids and people who do not voluntarily sign up for housing vouchers has shown that there are in fact substantial benefits for those groups.

Unattributed quotes and the propagation of myths about homicide in America

“One study of a high-crime community in Boston found that 85% of gunshot victims came from a network of just 763 young men—or less than 2% of the local population.” So says The Economist in a review of the causes of a spike in the murder rate and potential solutions to the problem entitled “Midsummer Murder”. It’s an exciting finding – if we can identify this 2% of men and hit them with a targeted intervention, we can essentially solve America’s horrific homicide problem.

But a friend of mine pointed out an odd discrepancy within the article: it states that only 14% of murder victims are gang members. How are there all these socially-connected young men committing almost all the violence, if they are not in a gang? And it is oddly unspecific about the details of the Boston study – who are these men? Are they black, like most of the other young men discussed in the piece? To sort these things out, I googled the entire sentence to see if I could find the original study.

What I found instead was the original sentence. Papachristos and Wildeman (2014) write:

For example, a recent study of a high-crime community in Boston found that 85% of all gunshot injuries occurred entirely within a network of 763 young minority men (< 2% of the community population)

The Economist just took this sentence and did synonym replacement. Their version is quite clearly a weakly paraphrased version of the original.

Interested in the details of what was going on, I dug up the actual study being cited here, which is also by Papachristos (with Braga and Hureau).* And it turns out that The Economist’s quoted-without-attribution summary doesn’t even appear to be accurate. The study only looked at 763 people – not the whole community. And the 85% figure is the share of gunshot victims within the 763-person sample who are socially-connected to each other somehow. I see no mention in the article of the 2% figure that is cited by Papachristos and Wildeman (2014) and The Economist. To be sure I didn’t overlook it, I used ctrl-F to look through the entire article for “two”, “2”, “%”, and “percent”. It appears to be a mischaracterization of the statement in the Papachristos et al. (2012) abstract that “The findings demonstrate that 85 % all of the gunshot injuries in the sample occur within a single social network”.

It might also be a true statement that is just not supported by the underlying article. Papachristos presumably has the data, and sometimes people cite their own previous papers in order to reference a dataset. Usually when you do that, you say something like “authors’ calculations using data from …” but maybe it got edited out.

It turns out that Papachristos and Wildeman (2014) – the other article where the unattributed quote was pulled from – actually addresses the question we are interested in: how socially-concentrated are murders? They study a poor, black neighborhood in Chicago, and find that 41% of all murders happened in a network of just 4% of the sample. The rest of the murders are actually even more spread out than you might guess – they don’t happen within the “co-offending network” – the set of people who are tied together by having been arrested together. Instead, 59% of murders are spread throughout the 70% of the community that is not linked together by crime. While there are patches of highly-concentrated homicides, the overall problem is not very concentrated at all.

That is a much less sexy finding, that lends itself much more poorly to simple solutions to problems of violence. If homicide is highly-concentrated then we can rely entirely on targeted interventions like Richmond’s program to pay high-risk men to stay out of crime – a program directly motivated by evidence that a small number of men are responsible for almost all murders.

What The Economist did here is not exactly plagiarism, but it wouldn’t fly in an undergraduate writing course either. At the same time, I can definitely empathize with the (unnamed per editorial policy) author. They might have fallen into the trap of reading the original sentence and then having trouble coming up with a truly distinct way of saying the same thing – that’s definitely happened to me before. Maybe they were worried about screwing up the meaning of the sentence by changing it too much. Or perhaps the editor cut out a citation that was in there.

Whatever the reason for the lack of attribution, it has consequences: because the quote was just lifted from another piece, the author didn’t catch the fact that it was an erroneous description of the study’s findings. If the original source of the quote had been cited, other people might have noticed that it contradicts what the quote says, or thought it was odd that they cited the source paper’s description of another study but not its own findings. Lots of people read The Economist. Policymakers thinking about how to combat homicide could be misled by this false but eye-catching statistic into investing in approaches that probably won’t work, instead of things like cognitive behavioral therapy that actually make a difference.

*I also found out that the sample in question is indeed a sample of black men – about 50% Cape Verdean and 50% African American.

Adventures in research framing and promotion

The press release headline for a new paper reads “Study: Low status men more likely to bully women online”, and the original article (ungated) is titled “Insights into Sexism: Male Status and Performance Moderates Female-Directed Hostile and Amicable Behaviour”.

What is the source of these deep insights into sexism? As SirT6 put it on r/science,

“Low status” in the context of this research means guys who are bad at Halo 3.

The headlines, and the introductory portions of both the popular and academic versions of the article, state findings that are much more forceful and general than the actual article. (Incidentally the press release is also by the original authors, so they are responsible for what it says.) The core finding of the study is that the number of positive comments made by male players towards female players goes up as the male players’ skill rises. They find nothing for negative comments. That is pretty hard to sell.

I find the paper itself pretty interesting, and I do think the results fit the authors’ hypotheses. I do take issue with their inclusion of players’ in-game kills and deaths as predictors in their regression: those are outcome variables, and putting them in your regression on the right-hand side will lead to biased coefficient estimates. I also think the graphs overstate their findings and should show the binned values of the actual datapoints, and the confidence intervals for their fitted curves. But it’s a nice contribution in a clever setting and it has value.

What I see as a much bigger problem with the paper is how the research is being pitched, and it’s really not the authors’ fault at all. A lot of media attention goes toward the systematic biases in scientific research due to publication bias and selective data analysis choices, but there is a huge elephant in the room: framing and salesmanship. The success of a paper often depends more on how the authors frame it (relative to the audience of the journal, their specific referees, and the wider public) than on the fundamental scientific value of the work. In part, this is another way of saying that successful scientists need to be good writers.

But it also means that there is an incentive to oversell one’s work. I see this quite a bit in economics, where papers that are fairly limited in their implications are spun in a way that suggests they are incredibly general. All the incentives point in this direction: it’s hard to make it into a top general-interest journal without claiming to say something that is generally interesting.

The spin here is particularly strong: the paper shows that having a female teammate causes male Halo 3 players to make more positive comments if they are high-skilled and fewer if they are low-skilled, yet it got sold as saying that males online in general engage in more bullying if they are low-status. The spin is often more subtle than this, leaving even relatively knowledgeable readers clueless about the limitations of a work, and leading to conclusions about a topic that are unsupported by actual evidence.

Bad economics and rhino conservation

A firm called Pembient has a plan to combat rhino poaching by making synthetic rhino horn using keratin and rhino DNA. They can undercut prices for the real stuff by a long ways, and nearly half of a sample of people in Vietnam said they’d be willing to switch to their fake version. And they can do this without killing any rhinos. Sounds like something we should all support, right? Not according to the International Rhino Foundation. Its executive director, Susie Ellis, makes the following pair of specious claims:

Selling synthetic horn does not reduce the demand for rhino horn [and] could lead to more poaching because it increases the demand for “the real thing.”

Actually, yes it would reduce demand for real rhino horn. These two products are substitutes, and nearly perfect ones at that. This is quite literally Econ 101.

The second part of the statement is simultaneously less wrong and more ill-founded. It’s not wrong, in the sense that I am sure no one has systematically studied whether access to synthetic rhino horn (a product that does not yet exist) helps give people stronger preferences for the “real thing” (a product that is illegal and must be damnably hard to do research on). But this objection is based on nothing more than idle speculation.

Ellis goes on:

And, importantly, questions arise as to how law enforcement authorities will be able to detect the difference between synthetic and real horn, especially if they are sold as powder or in manufactured products.

What she is describing is actually the ideal outcome for the synthetic-horn intervention. Pembient plans to sell the fake stuff at one-eighth the price of real horn. If the difference is undetectable, that would eliminate the strong incentives poachers have to kill rhinos.

I am not claiming that none of these things will happen at all. Rather, my point is that for Pembient’s plan to be harmful on net, all of these other potential effects that Ellis came up with – the idea that synthetic rhino horn is somehow not a substitute for real horn, that it is a “gateway drug” to the real thing, and that it would provide a way for real horn purveyors to hide their ill-gotten goods – would have to overwhelm a massive drop in the price of rhino horn, which would drastically reduce the supply of the real, more expensive thing.

The alternative is the status quo: we stick with the existing strategies we are using to combat rhino poaching. Let’s take a look at how that is working out. The following graph, from the original Quartz piece, shows the number of rhinos poached each year in South Africa.


As a percentage of total living rhinos in the world, these numbers are terrifying. There are about 20,000 rhinos in Africa today, according to the most recent counts I could find on Wikipedia (white rhinos, black rhinos). If this rate of increase in poaching continues, both the white and black rhino will be driven extinct within our lifetimes. We need to do more than stay the course and hope for the best. And anyone who wants to object to actual solutions people propose for this problem should base their complaints on sound reasoning and real evidence.

Nigeria is going to be the most important country in the world

I recently came across this article from the Washington Post that presents graphs of population projections through 2100. The writing seems overly pessimistic to me; it has an attitude toward African governance and economic progress that rang true in 1995, but is outdated and incorrect in 2015.

That said, the graphs are great, and fairly surprising. Especially this one:


They are projecting Nigeria’s population to grow to 914 million people by 2100. Even if the truth is just half that figure, Nigeria will draw just about even with the US as the third-most-populous country in the world. Moreover, Nigeria is forecast to be the only country in the top 5 that will be unambiguously growing in population over the current century, making it a source of dynamism and labor supply for the world economy.

Based on my rigorous, in-depth investigation strategy of listening to African pop music and occasionally catching an episode of Big Brother Africa, Nigeria already plays an outsized role in an increasingly salient pan-African culture. The growth of Africa, and the rising importance of Nigeria within Africa (currently one in every seven Africans is Nigerian, a figure that will rise to one in four), mean that its importance is only going to rise.

There will be challenges: for the sake of the environment and the good of its people, the continent needs to urbanize and move away from subsistence agriculture. But the 21st century is shaping up to be an exciting one, and a positive one for Africans in general and Nigerians in particular.

Why do we ever believe in scientific evidence?

Late last night the internet exploded with the revelation that the most influential social science study of 2014 was apparently invented from whole cloth. LaCour and Green (2014) showed that a brief interaction with an openly-gay canvasser was enough to change people’s minds – turning them from opposing gay rights to supporting them. The effects were persistent, and massive in magnitude. This study was a huge deal. It was published in Science. It was featured on This American Life. I told my non-academic friends about it. And, according to an investigation by Broockman, Kalla, and Aronow, as well as a formal retraction by one of the study’s authors, it was all made up. The report by Broockman, Kalla, and Aronow is a compelling and easy read – I strongly recommend it.

The allegation is that Michael LaCour fabricated all of the data in the paper by drawing a random sample from an existing dataset and adding normally-distributed errors to it to generate followup data. I have nothing to add to the question of whether the allegation is true, other than to note that many people are persuaded by the evidence, including Andrew Gelman, Science, and LaCour’s own coauthor, Donald Green.

What I do have to add is some thoughts on why I trust scientists. Laypeople often think that “peer review” means the kind of analysis that Broockman, Kalla, and Aronow did – poring over the data, looking for mistakes and fraud. That isn’t how it works. Referees are unpaid, uncredited volunteers who don’t have time to look at the raw data themselves. (I have also never been given the raw data to look at when reviewing an article.) The scientific method is fundamentally based on trust – we trust that other scientists aren’t outright frauds.* Nothing would work otherwise.

Why do we trust each other? After all, incidents like this one are not unheard of. Speaking from my own experience running field experiments, one important reason is that faking an entire study would be really hard. You’d have to write grants, or pretend you’re writing grants. You’d have to clear your study with the IRB, or sometimes a couple of IRBs. You’d have to spend a significant portion of your life managing data collection that isn’t really happening, and build a huge web of lies around that. And then people are going to want to see what’s up. This American Life reports that LaCour was showing his results to the canvassers he worked with, while the data was coming in (or supposedly coming in). To convincingly pull all of this off, you would basically have to run the entire study, only not collect any actual data.

It is hard to imagine anyone who is realistic with themselves about the incentives they face choosing to go through with all of this. Most scientists don’t get hired by Princeton, don’t make tons of money, don’t get their results blasted across all media for weeks. Most of them work really hard for small payoffs that seem inscrutable to outsiders. The only way to get big payoffs is with huge, sexy results – but huge results invite scrutiny. You might get away with faking small effects that are relatively expected, but if your study gets attention, people are going to start digging into your data. They will try to replicate your findings, and when they can’t they will ask questions. If you do manage to walk the tightrope of faking results and not getting caught, you did a ton of work for nothing.

I can barely conceive of going through all the effort and stress of running a field experiment only to throw all that away and make up the results. I trust scientists to be honest because the only good reason to go into science is that you love doing science, and I think that trust is well-placed.

*Incidentally, this is why I don’t particularly blame Green for not realizing what was up. When I coauthor papers with people, the possibility that they are just making stuff up never even crosses my mind. I am looking for mistakes, sharing ideas, and testing my own ideas and results out – not probing for lies. News accounts show that Green did see the data used for the analysis, just not the underlying dataset or surveys.