Steps toward my stated goal of killing every mosquito

A bunch of heroes at a company called Oxitec are developing engineered male mosquitoes that will mate with the females and produce offspring that can’t develop properly (the full article appears to be gated, but you can still see a synopsis). I’m curious about the evolutionary biology of this intervention: the offspring of those males are by definition less fit, so in a natural equilibrium they’d be bred out of existence. I guess we can overcome that by constantly pumping these dud males into the ecosystem.

The biggest constraint here is reactions like that of the people in Key West. Genetic modification has a big PR problem in general (how many people realize that the bananas we eat are heavily genetically modified [through selective grafting] and actually cannot reproduce on their own? Or that the same is true of almost every food?) and there’s a natural tendency these days for people to just agitate against any environmental change. We need better ideas for getting through to folks like this. I’m as anti-exploitation as the next guy, but maybe a campaign with pictures of dying children, along with explanations of the evidence on mosquitoes’ limited benefits to the overall ecosystem? I mean, kids are actually dying of malaria (and now dengue) on a massive scale, so it seems reasonable in this case.

Via Ophira Vishkin and Justin Schon.

Awesome links

Cool stuff I don’t have time to expound on at length:

  1. Via Ryan Louie on Facebook, “What would happen if you tried to hit a baseball pitched at 90% the speed of light?” xkcd’s “What If?” is now on my Google Reader list.
  2. Is it culturally insensitive to badger a 23-year-old Rwandan girl about her memories of the genocide that happened there when she was 5? Probably.
  3. The colonial era gets too much credit, good and bad, for current conditions in African countries. And arguably too much attention: I see the “colonial origins hypothesis” as too simplistic by far, and also as denying agency to the people in post-colonial countries. This refreshing new paper from Stelios Michalopoulos and Elias Papaioannou shows that pre-colonial ethnic institutions in Africa strongly predict current levels of economic development. Via Chris Blattman.
  4. Applied microeconomists Justin Wolfers and Betsey Stevenson are coming to Michigan. Big win for my home department.
  5. Kim Dionne is giving a seminar at Texas A&M on the politics of Malawi’s fertilizer subsidy program – would definitely be there if I could, because the topic is fascinating. Every Malawian I’ve met has a (strong!) opinion on it.
  6. Stanford and UCLA chemists take a step toward an HIV cure, through a treatment that can reactivate the latent virus that is hiding in human cells. I wonder if this stuff can cross the blood-brain barrier to get at the HIV hiding there?

Alleged benefits of HIV testing

One claimed benefit from getting tested for HIV that I’ve often heard, and may have even taught my students at Marurani and Maji Moto primary schools back in the day, is that “only by knowing your HIV status can you protect yourself and your partners.”

Bullshit. You can do both, right now, by using a condom every time you have sex. And guess what the strategy will be if you get a positive result? Use a condom every time you have sex. Testing is really helpful for the “be faithful to one partner” prevention strategy, but it’s not necessary for self-protection. As long as condoms are on the table, there’s no benefit from testing in terms of being able to protect oneself from the virus.

This matters, because people have enough problems with perceived agency in protecting themselves from infection already, and because testing can be very expensive, if not financially, then psychologically, emotionally, socially, and in terms of time spent doing it. Few Malawians, for example, seek it out. We’re giving people another reason not to take HIV-preventive action.

On the other side, policymakers generally assume that telling people their test results will translate into people knowing and believing their true status. I’m increasingly dubious. A friend of mine sat in on some result deliveries (I am not sure why this was allowed, and so will not name names) and they* told me they saw 7 people learn their status. 2 tested positive, and calmly accepted the result (at least outwardly). 5 were negative, and most of these were shocked and incredulous: “No – I must have HIV! I’m so sick!”

Doubt in test results is actually right in line with my own initial work on people’s HIV prevention choices in the face of massive perceived infection risk. I’m currently playing with extensions to that paper that explicitly incorporate disbelief in tests.

*Yes, I’m using “they” in the gender-neutral pronoun sense that dates back to the origins of the English language. Take that, historically myopic grammarians!

Assorted observations about Malawi

For aggravating reasons Twitter doesn’t work via SMS here, so I’d have to use a computer to “microblog”. And then it just feels silly. Instead, I’ll just occasionally post lists of stuff that crosses my mind on this blog:

  • Malawian English isn’t big on prepositions or certain adverbs. I’ve noticed the following default phrasings: “Can you pick me?” instead of “Can you pick me up?”*; “Drop me here” instead of “Drop me off here”; “I’m coming” instead of “I’m coming back”. I’m on the alert for more.
  • It is much harder to switch which hand I use my turn signal with than to adjust to driving on the opposite side of the road; my windshield gets cleaned every time I turn. Thank God those lovably inconsistent Brits didn’t take it upon themselves to change which side the clutch is on.
  • Making change is a catastrophe here, just like in every poor country I’ve been to – there’s even a paper about it. Half of my purchases involve a set of back-end transactions across a network of up to 5 people to get the right change. How do they even remember who owes whom? What makes this so different than in the US? I have no answers.
  • You don’t get used to the smell of mosquito coils. It actually gets worse with more time. Didn’t know that could happen.
  • People here are nice. Like crazy nice. If a random foreigner walked up to your house and tried to ask questions, I’ll bet you’d shoo him off the porch. The women here try to give me their only stool to sit on and sit in the dirt themselves, and it’s a struggle to politely turn it down. Having my enumerator take it is a little better I guess.

*I’m not sure how the misguided prescriptivists who abhor sentence-ending prepositions would even say this.

Bad coding in social science research: first in a series

A growing share of the work done by quantitative social science involves what is effectively programming. We have to write code to solve models numerically, to assemble datasets, to transform them between data analysis packages, to clean them, and to analyze them. This is a problem – maybe a huge problem – because most of us (and I’m probably not an exception) are pretty bad coders. We weren’t trained to write code. Most of us have never taken a class in it. I’m passingly familiar with some of the best practices of coding from having lived with legitimate programmers for a stretch, but typical code written by social scientists breaks every rule. Need to do something similar ten times? Let’s copy-paste it ten times and edit each line slightly! That’ll work out great.
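To make the copy-paste complaint concrete, here is a minimal sketch in Python of the alternative: write the step once and loop over the variables. Everything in it is hypothetical and made up for illustration (the missing-value codes, the column names, the dict-of-lists layout); it is not drawn from any real project.

```python
# Hypothetical survey convention: these codes mean "missing".
MISSING_CODES = {-99, -98}


def clean_column(values):
    """Replace missing-value codes with None; leave everything else alone."""
    return [None if v in MISSING_CODES else v for v in values]


def clean_dataset(data, columns):
    """Apply the same cleaning step to every listed column.

    `data` maps column names to lists of values. Columns not listed
    are copied unchanged.
    """
    return {
        name: clean_column(values) if name in columns else list(values)
        for name, values in data.items()
    }


if __name__ == "__main__":
    survey = {
        "id": [-99, 5, 6],       # an ID that happens to equal -99 must survive
        "q1": [1, -99, 3],
        "q2": [-98, 2, 2],
    }
    print(clean_dataset(survey, ["q1", "q2"]))
```

The point of the design is that a mistake in the cleaning rule gets fixed in one place, instead of in ten slightly different pasted copies where it is easy to miss one.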

It’s hard to know how much this has screwed up the results of research, because releasing code is a new idea and still not widespread. But my UM colleague Joe Golden and I often gripe about all the cases we see, and he sometimes emails me egregious ones he comes across. I’m going to start posting all the ones we find here, so I have a quick list to point to of all the issues out there. Maybe if I have time I’ll even dig up all the famous economics papers where the central result was due to either a confirmed or likely coding error.

The latest case (via Joe) comes from the US Census’s anonymization methods for its release of public data. They do this so they can allow researchers to use their data without compromising individuals’ privacy, but somebody screwed up. As a result, some of the statistical properties of the data – which are supposed to be preserved by the anonymization process – were changed, potentially invalidating loads of research.

Save the Guinea Worm

As a bit of background, we are on the verge of eradicating Guinea Worm, a horrific parasite that infects humans through drinking water and then exits the body, painfully and slowly, as a multi-foot-long worm-looking thing, usually through the feet (click here for details – warning, graphic image at link).

Eradication, though, means the extinction of the species. It can only live in humans, and causes us great pain and occasional death. Don’t all animals have rights? Who are we to keep this horrible piece of crap from plaguing us until the end of time? Enter the Save the Guinea Worm Foundation. I believe the site is just a brilliant work of satire, but they play the game so well that it’s hard to be totally sure. It’s all very entertaining, or maybe scary. And this comment thread on reddit indicates that in fact some people actually believe in the moral equivalence of parasites and human beings – and strongly enough to potentially oppose disease eradication.

So who’s going to sign up to host the revival of smallpox?

A sea change in attitudes toward gay marriage?

The latest prominent liberal to change his tune is Congressman John Dingell, evidently after pressure from his primary opponent, and my fellow UM econ grad student, Dan Marcin. While some credit is certainly due to Dan, I find this shift remarkably similar to those by the NAACP and prominent black celebrities in the aftermath of President Obama’s public change of heart on the issue. I don’t give Obama, Dingell, or any of these other people the least bit of credit on their views “evolving” – these are clearly calculated political moves. But Obama’s public statement appears to be having large effects on people’s stated views, maybe by making it “safe” for people from social groups hostile to the concept to come out in favor of same-sex marriage. There are echoes here of the vast change in stated opinions on racial segregation after the 1960s.

Via EconJeff.

The under-reported good news out of Africa

Mainstream media outlets typically highlight only the bad news from sub-Saharan Africa. There might be many reasons for this – it fits our expectations, it sells better, whatever. Regardless of the cause, coverage of Africa has broadly missed the most important stories from the continent over the past decade or so: a drop in infant mortality and a rise in economic growth.

This Talk of the Nation piece brings together a number of experts to discuss the reasons behind the latter, including UC Berkeley economist (and co-author of my favorite paper) Ted Miguel. My favorite bit came toward the end, when they were discussing increased traffic congestion as a sign of economic growth, but also as a new obstacle to be overcome. While listening to it, I was stuck in traffic just outside Blantyre, and had time to look out my window and see this (don’t worry, mom, I pulled over to take the picture):

Hat tip: Nancy Marker

The Linear Education Model

To most people, statistics – especially the output of a statistics program like Stata – is just a mess of numbers that don’t mean a whole hell of a lot. This is a big problem, and I believe it emerges from the way statistics (or econometrics, or biostatistics, or whatever a given field’s flavor of stats is known as) is taught, and the way that experts are trained in it.  I have much to say about this broader topic, but today I want to rant about probits and logits. A warning to any non-stats-minded readers: this might get a little technical.

Probits and logits are techniques commonly taught in basic regression analysis courses for handling data with discrete outcomes, e.g. a person’s labor force participation. We used them in ECON 406 when I was teaching it last term, for example. Teaching this way exacerbates all the worst problems with explaining statistics, because instead of the usual table of readily-interpretable numbers you usually get with a linear regression (which tell you the slope of Y with respect to X), the default behavior of Stata and every other stats package I’ve used is to spit out a table of things called “index coefficients”. These have a relationship to the slope we’re interested in, but it’s impossible for regular humans to interpret their magnitude directly and in some cases their sign may even be different. You can force Stata to give you slopes, but it’s not the default behavior because of reasons. So people see a set of misleading numbers and often get fooled; I have seen really smart people misinterpret these.

Although (or maybe because) probit and logit add a level of inscrutability to regression analysis, many experts really like them. This post by David Giles offers an unconventional defense of probit/logit over linear regression, which in the case of binary outcomes is known as the Linear Probability Model (LPM): that if the outcome is measured with error the parameters of the linear model are unidentified. Although I must admit I’m not sufficiently skilled in pure econometrics to figure out everything the authors are doing, it turns out that, in the underlying paper by Hausman et al., the authors assume that we know the cdf of the error term to be normal (probit) or logistic (logit). This strikes me as begging the question.

But it’s still odd that they get this result at all since the properties of linear regression are robust to any cdf for the error term. Jorn-Steffen Pischke at Mostly Harmless Econometrics points out that my gut is not wrong: “The structural parameters of a binary choice model, just like the probit index coefficients, are not of particular interest to us. We care about the marginal effects” and the LPM does as good a job approximating them as a non-linear model (a marginal effect is just a slope). This is consistent with my general take on probits and logits, which is that they are better than the LPM only if we happen to know that the true distribution of the error term fits one of those cdfs and also the true model for the index coefficients, which is to say that they are better if we simulate fake data with those properties and in pretty much no other circumstances. My advice to novices running probits and logits is that if the marginal effects differ from the LPM results (or between probit and logit), you had better be pretty confident that the specific non-linearity you’re imposing is actually true.
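For readers who want to see the index-coefficient-versus-slope distinction in numbers, here is a minimal sketch in plain Python. The setup is entirely my own assumption for illustration: fake data simulated from a true logit with one normally distributed regressor, with made-up parameter values, which is exactly the best-case scenario for the logit. The logit is fit with a hand-rolled Newton-Raphson loop rather than a stats package, so treat it as a toy and not as how anyone should estimate these models in practice.

```python
import math
import random


def simulate(n=5000, alpha=-0.5, beta=1.0, seed=0):
    """Draw fake data from a true logit model (all parameter values are made up)."""
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        p = 1.0 / (1.0 + math.exp(-(alpha + beta * x)))
        xs.append(x)
        ys.append(1 if rng.random() < p else 0)
    return xs, ys


def lpm_slope(xs, ys):
    """OLS slope of y on x: the Linear Probability Model estimate."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var = sum((x - mx) ** 2 for x in xs)
    return cov / var


def fit_logit(xs, ys, iters=25):
    """Maximum-likelihood logit (intercept a, index coefficient b) via Newton-Raphson."""
    a = b = 0.0
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(a + b * x)))
            w = p * (1.0 - p)
            g0 += y - p            # score with respect to the intercept
            g1 += (y - p) * x      # score with respect to the slope
            h00 += w               # entries of X'WX (the negative Hessian)
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        a += (h11 * g0 - h01 * g1) / det
        b += (h00 * g1 - h01 * g0) / det
    return a, b


def logit_ame(xs, a, b):
    """Average marginal effect: the mean of b * p * (1 - p) over the sample."""
    total = 0.0
    for x in xs:
        p = 1.0 / (1.0 + math.exp(-(a + b * x)))
        total += b * p * (1.0 - p)
    return total / len(xs)


if __name__ == "__main__":
    xs, ys = simulate()
    a, b = fit_logit(xs, ys)
    print("logit index coefficient:      ", round(b, 3))
    print("logit average marginal effect:", round(logit_ame(xs, a, b), 3))
    print("LPM slope:                    ", round(lpm_slope(xs, ys), 3))
```

In this setup the index coefficient should land near the true value of 1, while the average marginal effect and the LPM slope should be close to each other and much smaller. That is the sense in which the raw logit table misleads: the number people read off by default is not the slope they care about.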

Giles’s argument stays away from an even sillier defense of probits and logits, which is that they prevent the model from predicting values of the outcome variable that are more than 1 or less than 0. I’ve always disliked this, because economists and statisticians make a living pretending that discrete variables are continuous. For example, it’s not too uncommon to look at the impact of some intervention on how much schooling a student gets. Schooling is sort of continuous – you could get a fractional year of schooling – but on most surveys, including my own, it’s impossible to report anything but a whole number. It also has definite bounds – less than zero is impossible, and so is more than some top-code on the survey (typically for the end of graduate school). I can run something I call a “linear education model” (LEM) to estimate that impact, but it’s sometimes going to predict negative schooling for some students. That’s fine – it’s a way of estimating the average change in schooling due to the intervention, not a perfect model of the world. It’s also usually going to return predicted values that are between any two years of schooling, e.g. 11.5. This is also fine – everyone knows what this means, namely that some share of the time a person will end up at any discrete value such that the average is 11.5.

Economists run models like the LEM all the time. Technically we could use a multinomial logit, but nearly everyone agrees that would be overkill. Linear regression works fine. Now imagine we recode schooling into a binary variable, with 0 being “primary or less” and 1 being “secondary or more”. Can we still run a linear regression and learn something of interest? Definitely.
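A minimal sketch of that recoding exercise, in plain Python with made-up numbers. The ten schooling values, the treatment assignment, and the cutoff of 9 years for “secondary or more” are all assumptions of mine for illustration. With a single binary regressor, the OLS slope is just the difference in means between the two groups, so the same regression gives the change in average years of schooling in the first case and the change in the share reaching secondary school in the second.

```python
# Made-up years of schooling for two hypothetical groups.
CONTROL = [4, 6, 8, 8, 10]
TREATED = [6, 8, 8, 10, 12]

# Assumed cutoff: 9 or more years counts as "secondary or more".
SECONDARY_CUTOFF = 9


def ols_slope(xs, ys):
    """OLS slope of y on a single regressor x (with an intercept)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var = sum((x - mx) ** 2 for x in xs)
    return cov / var


def to_binary(years):
    """Recode years of schooling to 0 (primary or less) or 1 (secondary or more)."""
    return [1 if y >= SECONDARY_CUTOFF else 0 for y in years]


def estimates():
    """Return (LEM slope, LPM slope) for the treatment dummy."""
    treat = [0] * len(CONTROL) + [1] * len(TREATED)
    years = CONTROL + TREATED
    lem = ols_slope(treat, years)             # change in average years
    lpm = ols_slope(treat, to_binary(years))  # change in share with secondary
    return lem, lpm


if __name__ == "__main__":
    lem, lpm = estimates()
    print("LEM: change in average years of schooling:", round(lem, 3))
    print("LPM: change in share with secondary or more:", round(lpm, 3))
```

Neither number describes any individual student, and neither needs to. Both are averages, which is all the linear model was being asked for.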

Madrassahs

I spend a lot of time eating and working at Tasty Bites, a restaurant in the center of Zomba Town that has good food, a reliable WiFi hotspot, and a generator so I can still eat there and see things when the power quasi-randomly goes out. One of the upsides is that it gets a lot of traffic from expats and so forth, so being there is a good way to absorb information in a place where posting stuff to the web isn’t yet a given (newspapers are also much more key here than in the US).

The owners are Malawian Indians (South Asians? I can never tell), and while I’m not 100% sure they’re Muslims the food is Halal so they get a fair amount of business from local South Asian Muslims. So one of the interesting folks I met today was a gentleman named Moulana Ismail, who runs a local Madrassah for the Association of Sunni Madrassahs of Malawi and the Institute for Islamic Education and Culture (IIEC). He says they offer free secondary education to about 40,000 Malawians, which strikes me as a big deal in a country with a lot of problems with education. According to their website they also give the students free food, which is awesome.

What’s crazy about this is how normal it all is. “Madrassah” is a word with a nasty reputation in the US, but there are no Muslim extremists in Malawi and while these are religious schools they seem like they’re doing a pretty solid job. I’m not sure I’ll donate to these guys – they’re doing good work, but I’m iffy on their focus on the Koran – but as I contemplate it, I can’t help but think about our insane political culture in the US and the fact that someone might use such a donation against me in the future. I wish these guys good luck, not least because they can help clear up the good name of a word that just means “school”.