Wednesday, September 28, 2011

The Gender Wage Gap and Simpson's Paradox

Women earn less than men on average. According to a 2007 study by the American Association of University Women, the average weekly earnings of women who earned a Bachelor's Degree in 1999-2000 and were employed full time in 2001 were only 80% of the earnings of men in the same cohort. That is a 20% gender pay gap.

The relative pay of men and women looks deeply unfair -- especially given the long history of oppression and disenfranchisement suffered by women. However, appearances are sometimes deceiving, so it would be nice to know exactly what is going on. (In this case, appearances are not deceiving but they are also not very helpful for fixing the system!)

Lots of research has been done on the gender pay gap. Overwhelmingly, the goal of such research is to explain the gap. And explanations (or partial explanations) are diverse. This is a very difficult problem! Researchers have pointed to: job experience, career continuity, weekly hours worked, attitudes toward economic risk, wage bargaining habits, choice of education-type (e.g., college major), sector of the economy, and, of course, gender discrimination (which might be personal, institutional, or cultural).

How to explain the pay gap is still contentious, but here is my view of the literature, for what it's worth.

Some of the pay gap is explained by market forces that are not gender-biased in principle but are gender-biased in practice. All else equal, employers want to employ people who have more experience, more education, and more continuity in their careers. Motherhood (but not fatherhood) tends to work against all of those in practice (though it need not in principle). So, women are, in practice, differentially penalized for having children.

Some of the pay gap is due to implicit and explicit discrimination. For example, mothers face a wage penalty in addition to the hits they take to experience, education, and career continuity. And women also sometimes face hiring discrimination: more on this in a moment.

And finally, some of the pay gap is explained by differences in the education and employment that men and women seek. In fact, education and employment sought--especially the sector of the economy in which one works--seem to have more influence than any of the other causes of the pay gap. As it turns out, women typically earn less economically valuable degrees in college than do their male counterparts and then they work in lower-paying sectors of the economy than do their male counterparts. And so, we have the set-up for an instance of Simpson's Paradox.

Simpson's Paradox

When I first set out to write this blog post -- several weeks ago, after an interesting exchange with Noreen Sugrue, Helga Varden, and some others over dinner -- I was intending to simply use the gender wage gap as an excuse to look at Simpson's Paradox. The issue is too complicated to leave it at Simpson's Paradox, but I still want to get the paradox into the discussion.

So first, the simplest description of the technical problem: Simpson's Paradox occurs when a statistical association between two variables either disappears or reverses when one conditions on some other variable(s). (Sometimes people reserve the label "Simpson's Paradox" for reversals -- e.g., from positive association to negative association.)

Why does Simpson's Paradox matter in general? Well, we want to be able to make informed decisions about what personal actions we should take in order to get the best possible outcomes for ourselves, and also we want to make informed decisions about what public policies we should endorse in order to get the best possible outcomes for our societies. We want to make effective, efficient, and fair interventions. In order to do that, we need to know (or at least approximately know) the causal structure.

Ordinarily, one wants to take statistical associations as an imperfect guide to causal structure. But one problem for inferring the correct causal structure from statistical data is that the associations are not generally stable when we start conditioning on additional variables. Take a classic example.

You have cancer, and you want to go to the better out of two hospitals in your local area. You look at cancer survival rates and see the following:

              Survival Rate
Hospital A    810/1000    81%
Hospital B    750/1000    75%

Given just this information, the reasonable thing to do is to choose Hospital A. After all, Hospital A's survival rate is six percentage points higher than Hospital B's.

But now suppose that I tell you the two hospitals take difficult cancer cases at different rates. Suppose that the data breaks down as follows when we condition on whether a cancer case was hard or easy.

                    Survival Rate
Hospital A, Easy    800/950    84%
Hospital A, Hard    10/50      20%
Hospital B, Easy    700/800    87.5%
Hospital B, Hard    50/200     25%

After conditioning on the difficulty of the cancer case, we see that Hospital B has a better survival rate regardless of the kind of cancer! Hospital B's treatment of cancer appears to dominate Hospital A's treatment of cancer.
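For anyone who wants to check the arithmetic, here is a minimal Python sketch of the hospital example. The counts are copied from the two tables; the names (`counts`, `rate`, `overall_rate`) are just ones I made up for illustration.

```python
from fractions import Fraction

# (survived, treated) counts copied from the two hospital tables.
counts = {
    "A": {"easy": (800, 950), "hard": (10, 50)},
    "B": {"easy": (700, 800), "hard": (50, 200)},
}


def rate(survived, treated):
    """Survival rate as an exact fraction."""
    return Fraction(survived, treated)


def overall_rate(hospital):
    """Survival rate after pooling the easy and hard cases."""
    survived = sum(s for s, _ in counts[hospital].values())
    treated = sum(t for _, t in counts[hospital].values())
    return rate(survived, treated)


for hospital in ("A", "B"):
    print(f"Hospital {hospital} overall: {float(overall_rate(hospital)):.1%}")
    for difficulty, (s, t) in counts[hospital].items():
        print(f"  {difficulty}: {float(rate(s, t)):.1%}")

# A looks better in aggregate, yet B is better within every stratum.
a_wins_overall = overall_rate("A") > overall_rate("B")
b_wins_each_stratum = all(
    rate(*counts["B"][d]) > rate(*counts["A"][d]) for d in ("easy", "hard")
)
print("Simpson reversal:", a_wins_overall and b_wins_each_stratum)
```

The reversal comes entirely from the mix of cases: Hospital B treats four times as many hard cases as Hospital A, and hard cases drag down its pooled rate.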

With the wage gap, the problem is trickier than in the toy hospital example, but the principle is the same. And, we actually see evidence of some Simpson-like behavior. For example, consider the graph below from the 2007 AAUW report, Behind the Pay Gap.


Statistically significant wage differences are in bold. Notice that although overall, women earn significantly less than men, in many occupations, women and men are not statistically different with respect to their pay. The point I want to make here is that if we want to have an efficient, effective policy, we need to know what the wage gap looks like in detail, and we need to know why it is the way it is. Just to illustrate with a simple defective policy, we could imagine the government requiring a uniform pay raise for women across all occupations such that the aggregate pay of women comes out equal to that of men. The problem is that some women would still be worse off than men. (Not to mention that some women would be better off than men.) A better approach would be to target the occupations in which women are practically discriminated against.

In a similar vein, Laurie Morgan observes that the wage gap disappears entirely when one focuses on people with graduate education. And, as I have, Morgan points to the policy implications:
To the extent women earn similar rewards to men for college majors, changing women's distribution on college majors would be expected to produce dramatic gains in pay. This would focus our attention on women's choices of college majors, a line of research and policy interest well under way. If, however, women migrate to lower-paying jobs than similarly trained men in spite of similar educations, this focuses our attention on posteducational and labor market processes, including employer discrimination. (629)
Knowing what causes the pay gap affects how we go about fixing the pay gap. Should our intervention be targeted at education and career choices? Or should it be targeted at discriminatory practices of employers? Or both to similar or different degrees or what?

Throwing a further wrench into the works is the fact that it also makes a difference exactly how one dis-aggregates. So, whereas the AAUW finds that in nearly half of the economic sectors they looked at, women have wages statistically indistinguishable from men, in this beautiful graph from the Bureau of Labor Statistics, we find a sizable gender pay gap in every sector of the economy except construction!

Subtler Discrimination

At first glance, one might think that Simpson-like explanations of the gender pay gap are like market forces explanations in being gender neutral -- at least in principle. However, the case is murky. One wants to know why women make the choices they make, and one wants to know why the economy rewards the professional choices that it does.

Let me illustrate the problem. We have reason to believe that some (how much?) of the selection or filtering of women into less economically rewarding degrees and careers is due to institutional and cultural biases. We can see this in hiring practices. For example, in an experimental investigation of sexual discrimination in hiring in England, researchers used resumes that differed only in the sex (gender?) of the applicant. They found that females were preferentially hired for the "female" occupation of secretary and males were preferentially hired for the "male" occupation of engineer. (For two so-called "mixed" occupations, they found a slight preference for females.)

We have reason to think that discrimination against women is due in part to institutional and cultural prejudices. And we have no reason to doubt that those same mechanisms operate with respect to recruitment and retention of women in characteristically male (and characteristically high-paying) fields, like engineering. Even at the level of choice of college major.

Monday, September 26, 2011

Recently Around the Interwebs

Eric Schwitzgebel has a very nice piece on why metaphysics is so bizarre. My money is on option three. To any scientists reading this blog: I would like to get your reaction to Schwitzgebel's piece.

Jeffrey Pierce has a piece at Real Climate about the recent CLOUD results and their relationship to our current understanding of the physical mechanisms relating cosmic rays, cloud formation, and global temperature.

And this is just really cool.

Better blogging soon, I hope.

Saturday, September 17, 2011

Master Chef

Have any of you seen this show, Master Chef? It's a cooking-contest, "reality" television show in which amateur cooks compete for a $250,000 prize. Really, the show is pretty interesting: especially if you like cooking shows.

Anyway, one of the main features of the show is that in each episode, three things happen: (1) there is a cook-off in which two people are picked as having the best dishes; (2) those people become captains of two teams that do some ridiculous food task, like feed a hundred hungry bikers; and then (3) the losing team faces another cook-off in which the person who prepares the worst dish (as judged by the three professional chefs on the show) is eliminated.

That got me thinking about strategy for picking teams. In every episode I've seen, the captains pick the strongest cooks, as you would expect people to do who are interested in winning the team challenge. But I wonder if that is actually the best strategy. So, here is an alternative strategy that I proposed to my wife yesterday: as long as the field is large (say, greater than ten), pick the weakest cooks.

Why would anyone want to do that? Let me explain. The team challenge ends in one of two ways. Either you win and do not face elimination or you lose and have to prepare a dish that is better than at least one dish prepared by your team-mates. If you pick weak cooks and win, then great news: not only are you safe from elimination, but a stronger cook is guaranteed to be eliminated. On the other hand, if you lose, your chances of surviving the elimination test are good, since all of the other members of your team are weak cooks.

Suppose that your chance of winning a team competition by selecting strong cooks is a coin-flip: 0.5. Further suppose that your chance of surviving an elimination test after selecting strong cooks is 1 - (1/2)^n, for n team members. How much would the chances of (a) winning by selecting weak cooks and (b) surviving an elimination test after selecting weak cooks have to change in order to make my alternative a good idea?

Okay, for concreteness, suppose there are ten people -- five for each team. So, on the assumptions, you survive to the next round with (approximately) probability 0.5 + 0.5*(1 - 0.5^5) = 0.984.

Suppose that by selecting all weak cooks, you reduce the chance of avoiding the elimination challenge to 0.3 but you also reduce the chance that you produce a dish worse than an arbitrary dish prepared by your cohorts to 0.4. Then you survive to the next round with (approximate) probability 0.3 + 0.7*(1 - 0.4^5) = 0.993. Okay, that looks sort of promising, even if the numbers are pulled out of the air. I haven't plotted any curves for this, but just playing around a bit, it looks like you can trade off pretty sizable drops in the chance that your team wins for pretty small drops in the chance that your dish will be worse than that of an arbitrary member of your team and still come out ahead with my strategy. (For example, at 0.3 chance of winning the team challenge, you only need to lower your chance of losing to an arbitrary dish from 0.5 to 0.46 in order to be better off using my strategy.)
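Here is a rough Python sketch of the toy model, for anyone who wants to play with the numbers. It uses the same formula as above (exponent equal to the team size of five), and the function and variable names are ones I made up; the probabilities are still pulled out of the air.

```python
def survival_probability(p_win, p_worse, team_size=5):
    """Chance of reaching the next round under the toy model in this post.

    p_win:   chance your team wins the team challenge.
    p_worse: chance your dish is worse than one arbitrary teammate's dish.
    Formula: p_win + (1 - p_win) * (1 - p_worse ** team_size).
    """
    return p_win + (1 - p_win) * (1 - p_worse ** team_size)


# Strong-cooks strategy: coin-flip team challenge, coin-flip against each dish.
strong = survival_probability(0.5, 0.5)

# Weak-cooks strategy with the guessed numbers from the post.
weak = survival_probability(0.3, 0.4)

# Largest p_worse (to two decimals) at which picking weak cooks still beats
# picking strong cooks, holding the chance of winning the challenge at 0.3.
threshold = max(
    q / 100 for q in range(0, 51)
    if survival_probability(0.3, q / 100) > strong
)

print(f"strong cooks: {strong:.4f}")
print(f"weak cooks:   {weak:.4f}")
print(f"break-even p_worse at p_win = 0.3: {threshold:.2f}")
```

Changing `p_win` in the scan gives a quick feel for how much team-challenge success can be traded away before the weak-cooks strategy stops paying off.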

I would love to see this done empirically enough times to get some real numbers instead of just guesses!

I would also love to see a more adequate game-theoretic treatment of this problem. Any takers?

Thursday, September 15, 2011

Trust in Experts

Sean Carroll has a very interesting piece on trusting experts. One thing I find surprising in the discussion is how little weight (zero!) Sean wants to accord experts:
I am a strong believer that good reasons, arguments, and evidence are what matter, not credentials. So the short answer to “when should we trust an expert simply because they are an expert?” is “never.” We should always ask for reasons before we place trust.
I do not want to deny that there is something right in this attitude. We should request evidence for claims that people make, even if they are experts. But what if we don't have time or energy or competence to understand the evidence or how it relates to the claim being evaluated? It seems reasonable to me to trust the experts. Certainly, more reasonable than trusting a non-expert. So, as a short, first answer, it seems to me that we should trust experts when we have too little time, energy, and/or competence to evaluate the evidence for ourselves. (A more complicated answer would probably say something about the public benefits of specialization.)

So, there is my answer to the question Sean raises in the quoted passage, but I worry that in the rest of his post, there are two questions apt to be confused. First, there is a question about how reliable experts are: How likely is a given expert or collection of experts to give the correct answer (or an approximately correct answer) to a question in his or her area of expertise? If an expert is reliable, when the expert says that something is the case, we have evidence that that thing really is the case. (Of course, that's true of reliable non-experts as well, but I suppose that what makes someone an expert is that they are more reliable than non-experts. Maybe that is the nut of the social versus natural sciences debate that comes out in Sean's post.)

If you like Bayesian-ish representations of these sorts of problems, you might represent the first question as asking whether P( p | T(S, p)) > P( p), where "T(S, p)" denotes that S testifies that p.

Alternatively, we might compare conditioning on expert testimony to conditioning on non-expert testimony. In both cases, expert opinion will have evidential value. That is, knowing what the expert says will be better than (a) knowing nothing else and (b) knowing what a random non-expert says.
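To make the first question concrete, here is a small Python sketch using Bayes' theorem. Every number in it is hypothetical: I am simply assuming a prior of 0.5 and assuming that the expert is more likely than the non-expert to assert p when p is true and less likely to assert it when p is false.

```python
def posterior_given_testimony(prior, p_testify_if_true, p_testify_if_false):
    """P(p | T(S, p)) computed by Bayes' theorem.

    prior:              P(p)
    p_testify_if_true:  P(T(S, p) | p)
    p_testify_if_false: P(T(S, p) | not p)
    """
    evidence = prior * p_testify_if_true + (1 - prior) * p_testify_if_false
    return prior * p_testify_if_true / evidence


prior = 0.5  # hypothetical P(p)

# Hypothetical reliabilities, chosen only for illustration.
expert = posterior_given_testimony(prior, 0.9, 0.2)
non_expert = posterior_given_testimony(prior, 0.6, 0.5)

print(f"P(p)                    = {prior:.3f}")
print(f"P(p | expert says p)     = {expert:.3f}")
print(f"P(p | non-expert says p) = {non_expert:.3f}")
```

On these made-up numbers, both kinds of testimony raise the probability of p, but the expert's testimony raises it much more.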

Second, there is a question about whether the expert opinion gives a boost to some claim beyond the evidence itself: If an expert says that p and also says that his or her evidence for p consists in r, s, etc., do (or should) the reasons screen off the expert's statement from the claim?

Now we are asking something different. We want to know whether we have (or should have):

P( p | T(S, p), r, s, ...) = P( p | r, s, ...).
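Here is a toy Python model in which that equality holds. It rests on a strong assumption of mine, not a general truth: the expert's testimony depends on p only by way of a single reason r (a chain p -> r -> T). All of the probabilities are hypothetical.

```python
from itertools import product

# Hypothetical numbers for a chain p -> r -> T.
P_P = 0.3                                 # P(p)
P_R_GIVEN_P = {True: 0.8, False: 0.1}     # P(r | p), P(r | not p)
P_T_GIVEN_R = {True: 0.95, False: 0.05}   # P(T | r), P(T | not r)


def joint(p, r, t):
    """Joint probability of one assignment of truth values to p, r, T."""
    prob = P_P if p else 1 - P_P
    prob *= P_R_GIVEN_P[p] if r else 1 - P_R_GIVEN_P[p]
    prob *= P_T_GIVEN_R[r] if t else 1 - P_T_GIVEN_R[r]
    return prob


def conditional_p(**given):
    """P(p | given), where given may fix r and/or t (e.g. r=True, t=True)."""
    def matches(r, t):
        values = {"r": r, "t": t}
        return all(values[name] == value for name, value in given.items())

    num = sum(joint(True, r, t)
              for r, t in product([True, False], repeat=2) if matches(r, t))
    den = sum(joint(p, r, t)
              for p, r, t in product([True, False], repeat=3) if matches(r, t))
    return num / den


testimony_only = conditional_p(t=True)          # P(p | T)
reasons_only = conditional_p(r=True)            # P(p | r)
with_testimony = conditional_p(r=True, t=True)  # P(p | T, r)

print(f"P(p | T)    = {testimony_only:.4f}")
print(f"P(p | r)    = {reasons_only:.4f}")
print(f"P(p | T, r) = {with_testimony:.4f}")
```

In this model the testimony on its own raises the probability of p above the prior, but once r is known it adds nothing. Whether real expert testimony behaves like this (for instance, whether the expert's say-so carries information about evidence the expert has not stated) is exactly the second question.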

I know that philosophers have spent a good deal of time thinking and writing about the (social) epistemology of testimony. But I don't really know the literature aside from John Earman's very enjoyable book, Hume's Abject Failure. If anyone wants to point me to some readings, please do!

Sunday, September 11, 2011

Not Completely Irrelevant

September 11, 2001 was a tragedy. Many (though not all) U.S. foreign policy decisions following the attacks were also tragedies. The decision to allow the use of "enhanced interrogation" (aka, torture) is one such example. Today, whenever I think about September 11, my first thought is not that my country was profoundly wronged but that we did something dreadful to ourselves: we sold off part of our national soul for a little security. It is a difficult causal question, but I am convinced that we would not have struck such a bargain with the devil had it not been for the attacks.

Anyway, reflecting on our endorsement of torture following the events of September 11, 2001 reminded me of an excellent Daily Show interview, wherein Jon Stewart argues with Marc Thiessen about the causal efficacy of torture. You can watch the extended interview (in three parts) here: Part 1, Part 2, and Part 3.

One of the (many) things I like about the interview is that it shows how seemingly arcane questions of only philosophical interest -- like whether or not counterfactuals are backtracking -- apply to meaningful policy debates. Another thing I like is that depending on how the causal question is settled, the moral question might turn out to be irrelevant. That is, if torture is ineffective at producing actionable intelligence (and it probably is), then it doesn't matter whether there are apparently clever rationalizations of torture.

Thursday, September 1, 2011

The Prime Directive is Stupid

I'm probably going to regret posting this: I'm not an ethicist, and I recognize that what I'm saying here is neither new nor careful. But in the interest of getting back into semi-regular blogging ...

This week, I've been spending a lot of my down-time just kicking back and watching episodes of Star Trek: Enterprise, which I didn't watch much of when it originally aired. Except for its super-annoying theme music, the first two seasons of Enterprise are actually pretty good. So today, I watched the episode Cogenitor, in which the Enterprise encounters the Vissians, an alien race that has three genders: male, female, and cogenitor. The cogenitors make up a very small minority of the people and are necessary (but not sufficient) for reproduction. Sadly, the cogenitors are also severely oppressed: they are kept illiterate, they have no jobs -- other than to facilitate reproduction, they have no ownership rights, they have no role in family life or child-raising, etc.

Recognizing that this situation is morally intolerable, Commander Trip, the Enterprise's chief engineer, teaches a cogenitor how to read and encourages it to develop and pursue its own interests. Needless to say, that causes a lot of friction with the Vissians. The cogenitor requests asylum, but Captain Archer refuses. In the end, the cogenitor commits suicide, and Archer lectures Trip about not interfering in the internal affairs of other cultures.

Now, Star Trek: Enterprise is a prequel to all of the other Star Trek shows, and one thing that the show does over and over again is to make the point (as subtly as a soccer-stadium streaker) that Starfleet needs some sort of "directive" for starship personnel so that they can avoid disastrous first-contact scenarios, like this one. That "directive," of course, goes on to be the Prime Directive of the Federation: Federation personnel are not to interfere with the internal development of any alien species. In the episode "Cogenitor," Archer gets as close to hypocritical, Janeway-esque pontificating about the Prime Directive as is possible without actually having such a directive on the books.

I really like Star Trek, but the Prime Directive hacks me off. A lot. Read strictly, the Prime Directive would require that Starfleet have no contact with alien cultures. Even a loose reading would seriously hamstring any actual government and lead to all kinds of moral hazards and conflicts of interest. But worse than that, I think, is the celebration of a pernicious kind of moral relativism -- a kind of relativism that says, "Since these people have a different culture from me, I am not allowed to make any moral judgments about what they do."

No, sometimes an action is morally wrong, regardless of its cultural context. As the Tick once said, "Eating kittens is just plain wrong, and no one should do it -- ever!"

Furthermore, note that the Prime Directive embodies a special variety of moral relativism, which adds in a kind of bizarre super-tolerance. In general, moral relativism is compatible with an intolerant attitude toward the moral commitments endorsed by others. (Similarly for other kinds of moral anti-realism, like non-cognitivism.) Now, generally, I favor tolerance, especially where the opposing moral commitments do not endanger other people. But in some cases, tolerance is not appropriate.