## Simpson’s Paradox And The Positive/Negative Effect of Religious Belief

26 Jun

While not necessarily related to Bayes' Theorem, something like this pops into my mind whenever I read news stories dealing with statistics, so I thought I would make a post about it.

In simplest terms, aggregate data might have different statistical properties than subsets of that data. As a matter of fact, an effect seen in the aggregate data can completely reverse when the same data is looked at in subsets.

An intuitive example of this is weather. You can look at the temperature trend over the course of the whole year, or over the course of just six months. It might be that temperature over the whole year has a slightly upward slope, yet temperature from June to December has a downward slope.
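As a rough sketch of the weather example, the snippet below fits a least-squares slope to a full year of monthly temperatures and then to June through December only. The temperatures are made-up illustrative values (not real measurements), and `slope` is just a helper name chosen for this sketch.

```python
def slope(ys):
    """Least-squares slope of ys against their index 0, 1, 2, ..."""
    n = len(ys)
    x_mean = (n - 1) / 2
    y_mean = sum(ys) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(ys))
    sxx = sum((x - x_mean) ** 2 for x in range(n))
    return sxy / sxx


# Hypothetical monthly mean temperatures (deg C), January through December.
# These numbers are invented for illustration only.
monthly_temps = [0, 2, 6, 11, 16, 20, 23, 22, 18, 12, 7, 3]

full_year = slope(monthly_temps)             # January to December
june_to_december = slope(monthly_temps[5:])  # index 5 is June

print(f"Jan-Dec slope: {full_year:+.2f} deg C per month")
print(f"Jun-Dec slope: {june_to_december:+.2f} deg C per month")
```

With these invented numbers the full-year slope comes out positive and the June to December slope comes out negative, even though the second series is just a subset of the first.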

This seems obvious. But what if you’re dealing with something that’s not so obvious?

A non-controversial example, which Wikipedia gives, is kidney stone treatment. Say Treatment A and Treatment B are each used on both small and large kidney stones.

Treatment A is effective on 93% (81/87) of small kidney stones, while Treatment B is effective on 87% (234/270) of small kidney stones. For large kidney stones, Treatment A is effective 73% (192/263) of the time and Treatment B is effective 69% (55/80) of the time.

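Clearly, Treatment A is what you should use for both small and large kidney stones. But what happens when we aggregate over both small and large kidney stones? Treatment A is (81 + 192)/(87 + 263) = 273/350 (78%) while Treatment B is (234 + 55)/(270 + 80) = 289/350 (83%). Now it turns out that Treatment B looks better than Treatment A! The reversal comes from the group sizes: Treatment A was mostly used on large stones (263 of its 350 cases), which are harder to treat, while Treatment B was mostly used on small ones (270 of its 350 cases).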
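The arithmetic is easy to check by hand, but here is a minimal sketch that recomputes it from the counts quoted above. The dictionary layout and the names `counts`, `rate` and `pooled` are my own choices for illustration.

```python
# (successes, cases) for each treatment and stone size, as quoted above.
counts = {
    "A": {"small": (81, 87), "large": (192, 263)},
    "B": {"small": (234, 270), "large": (55, 80)},
}


def rate(successes, cases):
    """Success rate as a fraction between 0 and 1."""
    return successes / cases


def pooled(groups):
    """Sum successes and cases across stone sizes (not the fractions)."""
    successes = sum(s for s, _ in groups.values())
    cases = sum(c for _, c in groups.values())
    return successes, cases


for treatment, groups in counts.items():
    for size, (s, c) in groups.items():
        print(f"Treatment {treatment}, {size} stones: {s}/{c} = {rate(s, c):.0%}")
    s, c = pooled(groups)
    print(f"Treatment {treatment}, all stones: {s}/{c} = {rate(s, c):.0%}")
```

Note that pooling adds the numerators and the denominators separately. Treatment A wins within each stone size, yet loses once the two sizes are pooled.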

Therein lies Simpson’s Paradox. What happens when we have something controversial? Wikipedia also has the example of apparent sexism in graduate school admissions (and it still seems like no one tries to account for this paradox when talking about modern controversies like the gender wage gap). But this is mainly a religion blog, so what about whether religion is good or bad for people or society?

Very religious Americans […] have high overall wellbeing, leading healthier lives, and are less likely to have ever been diagnosed with depression… These positive associations between religious engagement and the good life are reversed when comparing more versus less religious places rather than individuals…

Gallup World Poll data from 152 countries [show] a striking negative correlation between these countries’ population percentages declaring that religion is “important in your daily life” and their average life satisfaction score…

Across US states, religious attendance rates predict modestly lower emotional well-being…

Epidemiological studies reveal that religious engagement predicted longer life expectancy…

Across states, religious engagement predicts shorter life expectancy…

Across states religious engagement predicts higher crime rates. But across individuals, it predicts lower crime rates…

If you want to make religion look good, cite individual data. If you want to make it look bad, cite aggregate data…

Stunning individual versus aggregate paradoxes appear in other realms as well. Low-income states and high-income individuals have [recently] voted Republican…

Liberal countries and conservative individuals express greater well-being…

Highly religious states, and less religious individuals, do more Google “sex” searching…

One might wonder if the religiosity-happiness association is mediated by income — which has some association with happiness. But though richer people are happier than poor people, religiously engaged individuals tend to have lower incomes — despite which, they express greater happiness.

This is from a conference paper. I’m not actually sure if this is an example of Simpson’s Paradox, but the larger point remains. Breaking up data along different axes might yield paradoxical results. As the author says, if you want to make religion look bad, cite aggregate data. If you want to make religion look good, cite individual data.

But which statistic should one use? The aggregate data or the individual data? They’re both true, for lack of a better word, so it’s not like one is “lying”. I would tend to lean towards using the aggregate data if forced to choose. But there’s no harm in looking at both. And if both paint the same picture, that just means you have a more complete view of the phenomenon at hand.

Posted by on June 26, 2017 in Bayes, economics/sociology, religion
