Wednesday, May 2, 2012

Simpson's Paradox

Note:  This is NOT Homer Simpson's paradox - see next post

Simpson's paradox occurs when a correlation present in each of several groups is reversed when the groups are combined.  See Wikipedia for a definition and some examples.  It often goes unrecognized by those unfamiliar with statistics or uncomfortable with numbers, and it becomes a real problem when data subject to the paradox is given a causal interpretation.

In the political context, it appeared most recently in 2011, when the union bargaining legislation proposed by Gov. Walker and passed by the Wisconsin legislature, together with Texas Gov. Perry's inept presidential bid, placed both states in the political spotlight. In both cases, it surfaced in attacks on the performance of Texas: economic in one case, educational in the other.



The economic claim was that, despite the relative prosperity of the Texas economy, average wages were below the U.S. average - Robert Reich, former Labor Secretary under President Clinton, was one of those making this claim.  And, in fact, that claim is correct.  The median hourly wage in Texas is $11.20, compared to the national median of $12.50.

However, it is also true that when you break the data down by demographic group, median hourly wages for Hispanics, Blacks and Whites in Texas are each higher than for the same group in the rest of the United States.  Here's the data from the Current Population Survey of the Bureau of Labor Statistics:

Median hourly wage     Hispanic     Black      White
Texas                  $12.60       $15.10     $17.10
Rest of US             $12.10       $14.30     $16.50

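How can both of these assertions be correct?  Simply because the demographic mix in Texas differs from that of the rest of the United States.  Texas is 38% Hispanic and 45% White, compared to the national figures of 16% and 64%, respectively.  Because a larger share of the Texas population falls in a group with lower median wages, the statewide median is pulled down even though each group earns more than its counterpart elsewhere.  (Note:  I've found varying data on the exact percentages for Texas, but all differ significantly from the national figures.)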
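The composition effect can be checked with a rough sketch. It uses only the Hispanic and White figures quoted in this post, and it rests on several simplifying assumptions that are mine, not the source data's: group medians are treated as if they were group averages (medians do not actually combine this way), the national shares stand in for the "rest of US" shares, and all other groups are left out. The function name weighted_average is just a label chosen for this illustration.

```python
# Group wage levels from the table in the post (median hourly wage, USD).
# Assumption: medians are treated as group averages for illustration only.
wages = {
    "Texas": {"Hispanic": 12.60, "White": 17.10},
    "Rest of US": {"Hispanic": 12.10, "White": 16.50},
}

# Population shares (percent) quoted in the post.
# Assumption: the national shares stand in for the "rest of US" shares,
# and groups other than Hispanic and White are ignored.
shares = {
    "Texas": {"Hispanic": 38, "White": 45},
    "Rest of US": {"Hispanic": 16, "White": 64},
}


def weighted_average(levels, weights):
    """Average of the group levels, weighted by each group's share."""
    total = sum(weights.values())
    return sum(levels[group] * weights[group] for group in levels) / total


combined = {
    region: weighted_average(wages[region], shares[region])
    for region in wages
}

for region, value in combined.items():
    print(f"{region}: {value:.2f}")
```

Under these assumptions the combined figure for Texas comes out lower than the one for the rest of the US, even though Texas is higher within each group. The actual statewide medians quoted above ($11.20 and $12.50) come from the full wage distribution, so this sketch only shows the direction of the effect, not those numbers.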

The education claims lead to a modified version of Simpson's paradox because the reversal is indirect: the statewide comparison and the group-by-group comparison rest on different measures.
The same article by Reich, as well as columns by Paul Krugman and others, asserted in the context of Gov. Walker's attempts to change public union collective bargaining rights that states such as Wisconsin, which had strong union rights, performed better than states like Texas, which did not.  This was most often raised in the context of comparative high school graduation rates, ACT/SAT scores and per-student spending.  Again, it is factually correct that ACT/SAT test scores and graduation rates are higher in Wisconsin than in Texas.

Again, however, if you look at the data from the National Assessment of Educational Progress (NAEP), which administers the annual standardized test given to 4th and 8th graders to measure math, science and reading, that data seems to implicitly contradict the other statements (note: NAEP does not test after 8th grade).  In every case (with the exception of Hispanics for 4th Grade Science), Black, Hispanic and White students in Texas score higher than Wisconsin students of the same demographic.  That's in 17 out of 18 comparisons.  In other words, Black students in Texas outperform Black students in Wisconsin in math, science and reading in both 4th and 8th grades.  And in all 18 categories, Texas students score above the national average.  Moreover, the gap between White and Hispanic/Black scores was smaller in Texas than in Wisconsin.
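For anyone wondering where the 18 comes from, it is three demographic groups times two grades times three subjects. A small sketch of the count follows; the variable names are my own, and it contains no NAEP scores, only the categories and the one exception named above.

```python
from itertools import product

groups = ["Black", "Hispanic", "White"]
grades = ["4th", "8th"]
subjects = ["math", "science", "reading"]

# Every group/grade/subject combination compared between the two states.
comparisons = list(product(groups, grades, subjects))

# The single exception noted in the post.
exceptions = {("Hispanic", "4th", "science")}

texas_higher = [c for c in comparisons if c not in exceptions]

print(len(comparisons), len(texas_higher))
```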

For a more detailed discussion, including links to source data and, in the case of the educational claims, a very thorough discussion of which data should be used for comparative purposes, see:

Super-economy
Iowahawk and Iowahawk again

1 comment:

  1. Very instructive. Might there not be a relation between unionization levels and the stability of white employment in Wisconsin? (Not that I am promoting that over Black or Hispanic employment, lest anyone impute racist intentions to my comment. Just meant it as a question of fact.)
