Ceteris Non Paribus

Ceteris Non Paribus is my personal blog, formerly hosted at nonparibus.wordpress.com and now found here. This blog is a place for me to put the ideas I have, and the stuff I come across, that I’ve managed to convince myself other people would be interested in seeing. See the About page for more on the reasons why I maintain a blog and the origin of the blog’s name.

My most recent posts can be found below, and a list of my most popular posts (based on recent views) is on the right.

The Linear Education Model

To most people, statistics – especially the output of a statistics program like Stata – is just a mess of numbers that don’t mean a whole hell of a lot. This is a big problem, and I believe it emerges from the way statistics (or econometrics, or biostatistics, or whatever a given field’s flavor of stats is known as) is taught, and the way that experts are trained in it.  I have much to say about this broader topic, but today I want to rant about probits and logits. A warning to any non-stats-minded readers: this might get a little technical.

Probits and logits are techniques commonly taught in basic regression analysis courses for handling data with discrete outcomes, e.g. a person’s labor force participation. We used them in ECON 406 when I was teaching it last term, for example. Teaching this way exacerbates all the worst problems with explaining statistics, because instead of the usual table of readily-interpretable numbers you usually get with a linear regression (which tell you the slope of Y with respect to X), the default behavior of Stata and every other stats package I’ve used is to spit out a table of things called “index coefficients”. These have a relationship to the slope we’re interested in, but it’s impossible for regular humans to interpret their magnitude directly and in some cases their sign may even be different. You can force Stata to give you slopes, but it’s not the default behavior because of reasons. So people see a set of misleading numbers and often get fooled; I have seen really smart people misinterpret these.
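To make the gap between an index coefficient and a slope concrete, here is a minimal sketch in Python rather than Stata. It assumes a probit with a single regressor, and the intercept and index coefficient are made-up values chosen only for illustration; the function names are hypothetical too. In a probit, the slope of P(Y=1) with respect to X is the index coefficient multiplied by the standard normal density evaluated at the index, so one index coefficient corresponds to many different slopes.

```python
import math


def normal_pdf(z):
    """Standard normal density, phi(z)."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def probit_slope(intercept, index_coef, x):
    """Slope of P(Y=1 | x) with respect to x in a one-regressor probit."""
    index = intercept + index_coef * x
    return normal_pdf(index) * index_coef


# Hypothetical values, not estimates from any real data set.
INTERCEPT = 0.0
INDEX_COEF = 0.8

if __name__ == "__main__":
    for x in (-3.0, 0.0, 3.0):
        slope = probit_slope(INTERCEPT, INDEX_COEF, x)
        print(f"x = {x:+.1f}: index coefficient = {INDEX_COEF}, slope = {slope:.3f}")
```

With these assumed numbers the index coefficient is 0.8 everywhere, but the slope is roughly 0.32 at x = 0 and roughly 0.02 at x = 3 or x = -3. The table of index coefficients alone tells you none of that.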

Despite (or maybe because of) the level of inscrutability that probit and logit add to regression analysis, many experts really like them. This post by David Giles offers an unconventional defense of probit/logit over linear regression, which in the case of binary outcomes is known as the Linear Probability Model (LPM): that if the outcome is measured with error the parameters of the model are unidentified. Although I must admit I’m not sufficiently skilled in pure econometrics to figure out everything the authors are doing, it turns out that, in the underlying paper by Hausman et al., the authors assume that we know the cdf of the error term to be normal (probit) or logistic (logit). This strikes me as begging the question.

But it’s still odd that they get this result at all, since the properties of linear regression are robust to any cdf for the error term. Jorn-Steffen Pischke at Mostly Harmless Econometrics points out that my gut is not wrong: “The structural parameters of a binary choice model, just like the probit index coefficients, are not of particular interest to us. We care about the marginal effects” and the LPM does as good a job approximating them as a non-linear model (a marginal effect is just a slope). This is consistent with my general take on probits and logits, which is that they are better than the LPM only if we happen to know that the true distribution of the error term fits one of those cdfs and also the true model for the index coefficients, which is to say that they are better if we simulate fake data with those properties and in pretty much no other circumstances. My advice to novices running probits and logits is that if the marginal effects differ from the LPM results (or between probit and logit), you had better be pretty confident that the specific non-linearity you’re imposing is actually true.
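The comparison suggested above can be tried on fake data. The following is a sketch under assumptions chosen for illustration, not a general proof: one normally distributed regressor, data simulated so that the logit model is true by construction (intercept 0, index coefficient 1), 5,000 observations, and a fixed seed. All function names are hypothetical, and the logit is fit with a hand-rolled Newton-Raphson loop rather than a stats package so that the example needs nothing beyond the Python standard library.

```python
import math
import random


def logistic(z):
    """Logistic cdf."""
    return 1.0 / (1.0 + math.exp(-z))


def simulate(n, b0, b1, seed):
    """Fake data for which the logit model is true by construction."""
    rng = random.Random(seed)
    xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
    ys = [1 if rng.random() < logistic(b0 + b1 * x) else 0 for x in xs]
    return xs, ys


def lpm_slope(xs, ys):
    """OLS slope of y on x, i.e. the Linear Probability Model."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


def fit_logit(xs, ys, iterations=25):
    """Maximum-likelihood logit with one regressor, via Newton-Raphson."""
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, y in zip(xs, ys):
            p = logistic(b0 + b1 * x)
            w = p * (1.0 - p)
            g0 += y - p            # score for the intercept
            g1 += (y - p) * x      # score for the slope
            h00 += w               # entries of X'WX
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


def logit_average_marginal_effect(xs, b0, b1):
    """Sample average of dP/dx = p * (1 - p) * b1."""
    return sum(logistic(b0 + b1 * x) * (1.0 - logistic(b0 + b1 * x)) * b1
               for x in xs) / len(xs)


def run_comparison():
    xs, ys = simulate(n=5000, b0=0.0, b1=1.0, seed=0)
    b0_hat, b1_hat = fit_logit(xs, ys)
    return {
        "logit_index_coef": b1_hat,
        "logit_ame": logit_average_marginal_effect(xs, b0_hat, b1_hat),
        "lpm_slope": lpm_slope(xs, ys),
    }


if __name__ == "__main__":
    for name, value in run_comparison().items():
        print(f"{name}: {value:.3f}")
```

In this setup the logit index coefficient should come out near the true value of 1, while the logit average marginal effect and the LPM slope should both land well below it and close to each other. A normally distributed regressor is a favorable case for the LPM, so treat this as a starting point for experimenting with other designs rather than as evidence about real data.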

Giles’s argument stays away from an even sillier defense of probits and logits, which is that they prevent the model from predicting values of the outcome variable that are more than 1 or less than 0. I’ve always disliked this, because economists and statisticians make a living pretending that discrete variables are continuous. For example, it’s not too uncommon to look at the impact of some intervention on how much schooling a student gets. Schooling is sort of continuous – you could get a fractional year of schooling – but on most surveys, including my own, it’s impossible to report anything but a whole number. It also has definite bounds – less than zero is impossible, and so is more than some top-code on the survey (typically for the end of graduate school). I can run something I call a “linear education model” (LEM) to estimate that impact, but it’s sometimes going to predict negative schooling for some students. That’s fine – it’s a way of estimating the average change in schooling due to the intervention, not a perfect model of the world. It’s also usually going to return predicted values that are between any two years of schooling, e.g. 11.5. This is also fine – everyone knows what this means, namely that some share of the time a person will end up at any discrete value such that the average is 11.5.

Economists run models like the LEM all the time. Technically we could use a multinomial logit, but nearly everyone agrees that would be overkill. Linear regression works fine. Now imagine we recode schooling into a binary variable, with 0 being “primary or less” and 1 being “secondary or more”. Can we still run a linear regression and learn something of interest? Definitely.
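Here is a toy version of that exercise, as a sketch only. The eight students, their years of schooling, and the cutoff for "secondary or more" (more than 8 completed years) are all invented for illustration, and `ols_fit` is a hypothetical helper rather than anything from a stats package. With a single binary regressor, the OLS slope is just the difference in means between the two groups, which is why both regressions are easy to read.

```python
def ols_fit(xs, ys):
    """Return (intercept, slope) from a one-regressor OLS of y on x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    return my - slope * mx, slope


# Hypothetical data: treated = 1 if the student got the intervention.
treated = [0, 0, 0, 0, 1, 1, 1, 1]
# Completed years of schooling, whole numbers only, as on a survey.
schooling = [6, 8, 10, 12, 8, 10, 12, 16]

# Assumed cutoff: more than 8 years counts as "secondary or more".
secondary_or_more = [1 if years > 8 else 0 for years in schooling]

if __name__ == "__main__":
    # The "linear education model": years of schooling on treatment.
    intercept, slope = ols_fit(treated, schooling)
    print(f"LEM: control mean = {intercept}, effect = {slope}, "
          f"predicted schooling if treated = {intercept + slope}")

    # The same regression on the recoded binary outcome is an LPM.
    intercept_b, slope_b = ols_fit(treated, secondary_or_more)
    print(f"LPM: control share = {intercept_b}, effect = {slope_b}")
```

In this made-up sample the LEM predicts 11.5 years for treated students, a value nobody can actually report, and the estimated effect is 2.5 years. After recoding, the LPM slope of 0.25 says the intervention raises the share with secondary schooling or more by 25 percentage points. This example has no continuous regressors, so it cannot produce the out-of-range predictions discussed above.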

Madrassahs

I spend a lot of time eating and working at Tasty Bites, a restaurant in the center of Zomba Town that has good food, a reliable WiFi hotspot, and a generator so I can still eat there and see things when the power quasi-randomly goes out. One of the upsides is that it gets a lot of traffic from expats and so forth, so being there is a good way to absorb information in a place where posting stuff to the web isn’t yet a given (newspapers are also much more key here than in the US).

The owners are Malawian Indians (South Asians? I can never tell), and while I’m not 100% sure they’re Muslims the food is Halal so they get a fair amount of business from local South Asian Muslims. So one of the interesting folks I met today was a gentleman named Moulana Ismail, who runs a local Madrassah for the Association of Sunni Madrassahs of Malawi and the Institute for Islamic Education and Culture (IIEC). He says they offer free secondary education to about 40,000 Malawians, which strikes me as a big deal in a country with a lot of problems with education. According to their website they also give the students free food, which is awesome.

What’s crazy about this is how normal it all is. “Madrassah” is a word with a nasty reputation in the US, but there are no Muslim extremists in Malawi and while these are religious schools they seem like they’re doing a pretty solid job. I’m not sure I’ll donate to these guys – they’re doing good work, but I’m iffy on their focus on the Koran – but as I contemplate it, I can’t help but think about our insane political culture in the US and the fact that someone might use such a donation against me in the future. I wish these guys good luck, not least because they can help clear up the good name of a word that just means “school”.

Research density

I’m about to go drive around Zomba District trying to nail down the local area where I’ll run my project. This is going to involve lots of challenges, many of which you might expect: low-quality roads, multiple places with the same name, limited geographic data (I’ve got a nice map from the pros up at Malawi’s NSO, but even they have their limits), language barriers (I’m bringing an RA with me to help with that) and so forth. But perhaps the largest problem I’ll face is finding somewhere to do research where there aren’t already too many other past or current research projects going on, or too many NGOs or international organizations in the field. Malawi has a crazy density of research going on, and no small number of other development projects as well. To pick just the two most prominent examples here in Zomba District, the local area where I did my project last summer is home to one of the Millennium Villages, and some of my friends here in Zomba Town are working on the fairly massive Zomba Cash Transfers project, which is doing a long-run followup to see whether cash payments to girls in secondary school have beneficial effects, years later, on their kids.

All this is really cool, of course, but can pose problems for my research – I don’t want a sample that’s too saturated with other projects, since I want my results to be as generalizable as possible. Someday I’d like to make or find a world map showing the locations of development research, since anecdotally the patterns are very striking. I’d guess there are 20 studies in Kenya for every 1 in the Congo (DR) despite the latter having nearly double the population of the former.

The worst thing that could possibly have happened to particle physics

Last week scientists at CERN sort-of-confirmed that they had discovered the Higgs Boson. The Higgs has been the holy grail of particle physics for some time, and the LHC was widely expected to discover it, albeit maybe a bit quicker than actually happened. Back in my days as an undergraduate physics major I remember a friend wearing a t-shirt about the LHC that said “Higgs Inside” in the style of the old Intel logo (which might be the nerdiest shirt ever). After multiple shutdowns and delays, the LHC did indeed find the Higgs Boson, and my understanding is that it was more or less right where we expected it. This is being widely celebrated in the media, incorrectly in my view. While it is a triumph for the Standard Model, the fact that the Higgs is right where we expected it is a catastrophe for particle physics. As The Economist puts it,

Unlike the structure of DNA, which came as a surprise, the Higgs is a long-expected guest. It was predicted in 1964 by Peter Higgs, a British physicist who was trying to fix a niggle in quantum theory, and independently, in various guises, by five other researchers. And if the Higgs—or something similar—did not exist, then a lot of what physicists think they know about the universe would be wrong.

That last bit is crucial; every major advance in physics for over a century has been driven by surprises, from the non-existence of the ether to the apparent failure of conservation of energy in beta decay. Surprise findings and failures of our existing models force us to revise our theories. That’s especially important now, since the next really big frontier for particle physics is string theory. The anecdotal consensus is that it will require a particle collider the size of the solar system to test any of the various predictions from all the various string theories out there. If we don’t see any surprises soon, we might have a very long wait.

Some of this may be a case of subliminal sour grapes – I decided to quit physics a while ago, and so am obviously predisposed to find reasons to convince myself that that was a good thing. But think about it this way: finding the Higgs right where we expect it was a big deal, right? We’re excited? Imagine how exciting it must have been to learn that energy and mass are fundamentally equivalent. There’s just no comparison with past discoveries. In fact, this is nothing close to the excitement stirred up by the (apparent but mistaken) finding that neutrinos were going faster than light. That would have been thrilling. The Higgs is much less so.

[Edited because for some reason The Economist is trying to paste in the whole story I grabbed that quote from]

Kill all the mosquitoes

The idea that we should just kill every mosquito on the planet often crosses my mind as I sit in my room fending hordes of them off my ankles. And with good reason; they kill more people than any other animal, and probably more than all other animal species combined.

I’ve even issued a challenge to various friends and passersby unfortunate enough to have to listen to my rant about this: I’d like the world’s top ecologists to defend the importance of mosquitoes. Marshall all the best evidence, and make the most compelling case for their continued existence. This isn’t something natural resources and ecology types normally do – discussions of ecological changes tend toward an emphasis on the unbreakable circle of life and the possibility that losing one species of moth someplace will lead to the demise of the human race. But a recent article from Nature does exactly the kind of analysis I wanted. And the verdict is that, well, mosquitoes can all go to hell:

Ultimately, there seem to be few things that mosquitoes do that other organisms can’t do just as well — except perhaps for one. They are lethally efficient at sucking blood from one individual and mainlining it into another, providing an ideal route for the spread of pathogenic microbes.

That last bit is false, of course – mosquitoes don’t inject previous targets’ blood into their current host, which is why they don’t spread HIV. But the overall sentiment is one I can get behind.

I can’t resist criticizing this line, though: “few scientists would suggest that the costs of an increased human population would outweigh the benefits of a healthier one.” The idea that an increased human population is a cost is a ridiculous fallacy. Let’s frame it a different way: should we be happy with the benefit of a decreased human population due to mosquitoes? That is, should we treat higher mortality, especially more babies in poor countries dying, as a pro? I’d like to see less of this line of reasoning from people talking about demographic trends; it’s somewhere between uninformed and offensive.

Hat tip: topnaman

Don't turn down nsima

Last summer, while scouting the local area near a village where we were collecting data, I made the mistake of turning down an offer of nsima. A lady saw me walking around and said “we have nsima”. I’m not sure what I said – my thoughts were along the lines of “that’s cool, good for you” – and I left. Bad idea. I subsequently learned that “we have nsima” is an invitation to ask for some, and you don’t turn it down. I usually just take a little piece. This is generally pretty amusing, partly because Malawians tend to find it funny when white folks eat nsima, but mainly because my nsima-consumption skills are on the same level as my grandmother’s skills with chopsticks. You’re supposed to make a little ball and I’m useless at that so it just goes right in my mouth.

I’ve gotten such offers twice in the last two days, and they aren’t always from nice ladies who are in the middle of cooking and have extra. Last night the guy who I buy water from at the local gas station offered me some of his dinner. After an attempt to politely decline – it was his dinner, after all – I took a bit. Another employee who was looking on told me that white people usually turn down nsima, and he doesn’t like that. This is a little incongruous, given that white people eating nsima is an amusing sight, but it makes a certain kind of sense; we’ve built up a negative reputation in that area.

This is now my second consecutive post on little tidbits of Malawian culture. I’m hoping some future Jason will be able to find these on Google, since I’ve often tried to look this stuff up in vain.

"Aise!"

The spectre of colonialism hangs over every interaction between Malawians and white people. Direct rule is long over, but the cultural institutions and power dynamic are still there, even 48 years after the British left. I’m always aware of this, but usually it’s just a tiny nagging at the back of my mind.

One thing that brings it immediately to the forefront is trying to get someone’s attention. The only sure way to let them know you want them is to shout “Aise!” (or, equivalently, “Ayise!”).* That is, a Chichewa-ified version of “I say!” Every time I hear it I imagine a sunburnt Englishman with a mustache and a safari hat yelling for a servant to bring him a cocktail made with gin. So I pretty uniformly refuse to say it. This commonly prompts any Malawians near me at, for example, the bar, to shout it for me. Which is fine by me – it seems much more uncomfortable and colonial coming from a white guy.

*I have attempted to use “Yo” with limited success.

If I posted every great SMBC strip here then this blog would just be a backup of that comic’s RSS feed. But this one is particularly trenchant.

We usually think of effective governments as limiting violence by citizens through the exercise of power. Consider the alternative: maybe some of our success in reducing violence is due to giving people who are naturally inclined to be thugs a legitimate outlet for doing so, through state-sanctioned military and law-enforcement activities that don’t “count” as violence.

Stop marginalizing gays, for your own sake

This piece from NPR looks at the challenge of targeting gay men in Kenya for HIV prevention messaging (NB: at least while in Malawi I typically just read the columns on NPR stories since getting the audio to load is pretty much a pipe dream on my current internet connection).

The thing about MSM (men who have sex with men, which is the academic term for gays since the latter is an identity that many do not share) in Africa is that we have no idea how common the practice is because it’s heavily stigmatized and often illegal. That matters a lot – in the US, where MSM is relatively common (something like 7% of men have ever had sexual contact with another man), they are important for the HIV epidemic. In Africa we don’t know because we can’t measure this stuff accurately.

In the article they say: “But the rate among men who have gay sex is more than three times the national average.” There’s no way they know that. Most studies I’ve seen of MSM in Africa use the “snowball” approach, which means that the researcher approached a set of gay friends, had them contact their friends, and so forth. This is useful for finding marginalized populations, but is terrible if what you want is a representative sample. The reason for doing it is that any technique that gives you a probability sample of the population will almost surely cause people to lie about having same-sex sexual contact. This article even admits that most MSM in Kenya keep their same-sex partners a secret. Lacking statistics on MSM that are representative of the population means we can’t get meaningful measures of the prevalence of the practice or of the relative HIV risk for MSM. We also can’t study their importance in driving the heterosexual epidemic (which is an issue since many MSM have female partners as well). And as this article notes we also have trouble identifying them for interventions. All of these factors tend to make the HIV epidemic worse, even for women and men who are exclusively heterosexual.

Discriminating against men who have sex with men isn’t just morally reprehensible – it has practical negative consequences for everybody else.

Hat tip: Nancy Marker

A reprieve!

On Sunday night a shipment of Fanta Passion arrived in Zomba. I saw some of the first crates being rolled out – at Domino’s, I knew they had it in stock before the staff did.

One of my Malawian friends tells me that they actually make it locally, at Southern Bottlers (which may now be Carlsberg, because here in Malawi everything is made by Carlsberg). This is great news, as it definitively expands the number of places where I can get Passion Fruit-flavored soda to two. He theorizes that it is in fact low demand that leads to limited production – suppliers cut back because people weren’t buying enough of it. To test that theory, I’ll observe how quickly this shipment dries up.

As a bonus, I found a bakery that makes soft-serve ice milk, which is like soft-serve ice cream only way better. More liquid, less gelatinous. The only other place I’ve found it is Pic-Nic-Fry on the Southern Californian island of Santa Catalina.