
Category Archives: Bayes

Probability Only Exists In Your Head

I’ve already written about this before but I’ve thought of another way of explaining this.

As I wrote in that post that I linked to above, probabilities aren’t facts about objects or phenomena that we look at or experience. If you flip a coin and it lands heads twice, the probability of it landing tails on the third flip is the same as the probability of it landing heads on that third flip.

But people who think that probability is an aspect of the coin similar to its weight or its color will think that 50% probability is physically tied to the coin, so it *must* account for the lack of landing tails on the next flip. As though there is a god of coin flips who has to make sure that the books are accounted for.

Again, this is wrong. And this next scenario I think explains why.

In a standard deck of cards, there’s a 1/52 chance of pulling any specific card, right? What if we have two people, Alice and Bob, who want to pull from the deck. Except, Alice has memorized the order of the cards in the deck and Bob hasn’t.

What is the probability of Bob drawing an Ace of Spades on the first draw? For us and Bob, it’s 1/52. But for Alice — because she’s memorized the order of the cards — it’s virtually certain which card Bob will draw: effectively 100% for the card she knows is on top, and effectively 0% for every other card.

If 1/52 was some intrinsic aspect of the deck of cards, then how can there be two different probabilities? Obviously, because probability is a description of our uncertainty. It only exists in our minds. The reader of that thought experiment and Bob are operating under uncertainty. Alice, on the other hand, is not because she’s memorized the order of the cards.

Furthermore, Bayes is all about updating on new evidence. What if there was some third actor, Chad, who mixed up the deck of cards outside of Alice’s knowledge? Now, Alice may think that the next card’s probability is either 100% or 0%, but this is not true either. Now Chad has the certainty.

If Bob draws a card that Alice doesn’t think he should draw, how can she possibly do a Bayesian update on either 0% or 100%? She has to do the equivalent of moving faster than the speed of light in order to update; it literally takes infinite bits of data in order to update from 0% or 100% to some other number. Try it:

P(H | E) = P(E | H) * P(H) / P(E)

50% = ? * 0% / 1.9%

50% = ? * 100% / 1.9%

With P(H) = 0%, the numerator is zero no matter what number fills in the ?, so nothing can produce the 50% on the left. And with P(H) = 100%, probability theory forces P(E) = P(E | H) — a certain hypothesis fixes the probability of the evidence — so the right-hand side can never equal anything but 100%.

This situation can be repeated over and over again, introducing new characters who manipulate the deck outside of other people’s knowledge. And this demonstrates not only that probability is subjective and in your head, but that a Bayesian probability of 0% or 100% is not a probability at all, because those numbers cannot be updated.
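This fixed-point behavior is easy to check numerically. Here is a minimal sketch; the 90%/10% likelihoods are made-up numbers purely for illustration:

```python
def bayes_update(prior, p_e_given_h, p_e_given_not_h):
    """Posterior P(H | E), expanding P(E) by the law of total probability."""
    p_e = p_e_given_h * prior + p_e_given_not_h * (1 - prior)
    return p_e_given_h * prior / p_e

# An ordinary prior like 1/52 moves when evidence comes in...
print(bayes_update(1 / 52, 0.9, 0.1))  # rises to ~0.15

# ...but 0% and 100% are fixed points: no finite evidence can budge them.
print(bayes_update(0.0, 0.9, 0.1))  # still 0.0
print(bayes_update(1.0, 0.9, 0.1))  # still 1.0
```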

 

Posted by on September 14, 2017 in Bayes

 

Simpson’s Paradox And The Positive/Negative Effect of Religious Belief

While not necessarily related to Bayes Theorem, something like this has been popping up in my mind whenever I read news stories dealing with statistics so I thought I would make a post about it.

In simplest terms, aggregate data might have different statistical properties than subsets of the aggregate data. As a matter of fact, the aggregate data might show the completely opposite effect when looked at in subsets.

An intuitive example of this is weather. You can look at the temperature trend over the course of a full year, or over just six months of it. It might be that temperature over the whole year has a slightly positive slope, yet temperature from June to December has a negative slope.

This seems obvious. But what if you’re dealing with something that’s not so obvious?

The example Wikipedia gives that I think is non-controversial is kidney stone treatment. Say you have two treatments, A and B, each of which is used on both large and small kidney stones.

Treatment A is effective on 81 out of 87 (93%) small kidney stones, while Treatment B is effective on 234 out of 270 (87%). For large kidney stones, Treatment A is effective 73% (192/263) of the time and Treatment B 69% (55/80) of the time.

Clearly, Treatment A is what you should use for both small and large kidney stones. But what happens when we aggregate over both small and large kidney stones? Treatment A is (81 + 192)/(87 + 263) = 273/350 (78%) while Treatment B is (234 + 55)/(270 + 80) = 289/350 (83%). Now it turns out that Treatment B is better than Treatment A!
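The reversal is easy to reproduce from the four fractions above; a quick sketch:

```python
# (successes, trials) per treatment, split by stone size (Wikipedia's numbers)
small = {"A": (81, 87), "B": (234, 270)}
large = {"A": (192, 263), "B": (55, 80)}

def rate(successes, trials):
    return successes / trials

# Within each subgroup, A beats B...
assert rate(*small["A"]) > rate(*small["B"])  # 93% vs 87%
assert rate(*large["A"]) > rate(*large["B"])  # 73% vs 69%

# ...but pooling the subgroups flips the ordering: Simpson's paradox.
pooled = {t: rate(small[t][0] + large[t][0], small[t][1] + large[t][1]) for t in "AB"}
assert pooled["A"] < pooled["B"]  # 78% vs ~83%
print(pooled)
```

The reversal happens because the easy cases (small stones) are unevenly split between the treatments, so the pooled ratios weight the subgroups differently.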

Therein lies Simpson’s Paradox. What happens when we have something controversial? Wikipedia also has the example of apparent sexism in graduate school admissions (it still seems like no one has tried to account for this paradox when talking about modern controversies like the gender wage gap). But this is mainly a religion blog: So what about whether religion is good or bad for people or society?

Very religious Americans […] have high overall wellbeing, lead healthier lives, and are less likely to have ever been diagnosed with depression… These positive associations between religious engagement and the good life are reversed when comparing more versus less religious places rather than individuals…

Gallup World Poll data from 152 countries [show] a striking negative correlation between these countries’ population percentages declaring that religion is “important in your daily life” and their average life satisfaction score…

Across US states, religious attendance rates predict modestly lower emotional well-being…

Epidemiological studies reveal that religious engagement predicted longer life expectancy…

Across states, religious engagement predicts shorter life expectancy…

Across states religious engagement predicts higher crime rates. But across individuals, it predicts lower crime rates…

If you want to make religion look good, cite individual data. If you want to make it look bad, cite aggregate data…

Stunning individual versus aggregate paradoxes appear in other realms as well. Low-income states and high-income individuals have [recently] voted Republican…

Liberal countries and conservative individuals express greater well-being…

Highly religious states, and less religious individuals, do more Google “sex” searching…

One might wonder if the religiosity-happiness association is mediated by income — which has some association with happiness. But though richer people are happier than poor people, religiously engaged individuals tend to have lower incomes — despite which, they express greater happiness.

This is from a conference paper. I’m not actually sure if this is an example of Simpson’s Paradox, but the larger point remains. Breaking up data along different axes might yield paradoxical results. As the author says, if you want to make religion look bad, cite aggregate data. If you want to make religion look good, cite individual data.

But which statistic should one use? The aggregate data or the individual data? They’re both true, for lack of a better word, so it’s not like one is “lying”. I would tend to lean towards using the aggregate data if forced to choose. But there’s no harm in looking at both. And if both paint the same picture that just means that you have a more complete view of the phenomenon at hand.

 

Posted by on June 26, 2017 in Bayes, economics/sociology, religion

 

Probability: The Logic of the Law


While poking around on JSTOR (thanks, grad school!) I found an interesting article in the Oxford Journal of Legal Studies called “Probability – The Logic of the Law”. In it, Bernard Robertson and G. A. Vignaux argue that probability is, you guessed it, the logic behind legal analysis and arbitration.

So not only do we have arguments in favor of probability being the logic of science (Jaynes), probability being the logic of historical analysis (Tucker, Carrier), but we now have an argument that probability is the logic of the legal world, too.

Here’s how Robertson and Vignaux derive Bayes Theorem in the article:

It has been argued that the axioms of probability do not apply in court cases, or that court cases ought not to be thought about in this way even if they do apply. Alternatively, it is argued that some special kind of probability applies in legal cases, with its own axioms and rules… with the result that conventional probability has become known in the jurisprudential world as Pascalian… In practice one commonly finds statements such as:

The concept of ‘probability’ in the legal sense is certainly different from the mathematical concept; indeed, it is rare to find a situation in which these two usages co-exist, although when they do, the mathematical probability has to be taken into account in the assessment of probability in the legal sense and given its appropriate weight

This paper aims to show that this view is based upon a series of false assumptions.

The authors then go into some detail about common objections to the “mathematical” view of probability and why people think it doesn’t apply to the law:

1. Things either Happen or They Don’t; They Don’t Probably Happen

An example of this argument is provided by Jaffee: ‘Propositions are true or false; they are not “probable”’

2. A Court is Concerned not with Long Runs but with Single Instances

Thus, descriptively:

Trials do not typically involve matters analogous to flipping coins. They involve unique events, and thus there is no relative frequency [my emphasis] to measure

And normatively:

Application of substantive legal principles relies on, and due process considerations require, that triers must make individualistic judgements about how they think a particular event (or series of events) occurred

3. Frequency Approaches Hide Causes and Other Relevant Information which Should Be Investigated

For an extended example of this argument see Ligertwood Australian Evidence (p14)

4. Evidence Must Be Interpreted

The implicit conception [in the probability debate] of ‘evidence’ is that which is plopped down on the factfinder at trial… the evidence must bear its own inferences… each bit of evidence manifests explicitly its characteristics. This assumption is false. Evidence takes on meaning for trials only through the process of being considered by a human being… the underlying experiences of each deliberator become part of the process, yet the probability debates proceed as though this were not so

5. People Actually Compare Hypotheses

Meaning is assigned to trial evidence through the incorporation of that evidence into one or more plausible stories which describe ‘what happened’ during events testified to at trial …The level of acceptance will be determined by the coverage, coherence and uniqueness of the ‘best’ story.

6. Assessment of Prior Odds ‘Appears to Fly in the Face of the Presumption of Innocence’

7. The Legal System is Not Supposed to be Subjective

Allen refers to

the desire to have disputes settled by reference to reality rather than the subjective state of mind of the decision maker

As you can see, a lot of the objections to probability here are continually raised in the frequentist vs. Bayesian interpretation of probability. But following in the steps of E. T. Jaynes, Robertson and Vignaux demonstrate that probability can be derived from some basic assumptions about propositional logic.

The authors then go on to explain the different “types” of probability, which is probably (heh) sowing confusion:

A priori probability refers to cases where there are a finite number of possible outcomes each of which is assumed to be equally probable. Probability refers to the chance of a particular outcome occurring under these conditions. Thus there is a 1 in 52 chance of drawing the King of Hearts from a pack of cards under these conditions and the axioms of probability can be used to answer questions like: ‘what is the probability of drawing a red court card?’ or ‘what is the probability of drawing a card which is (n)either red (n)or a court card?’

Empirical probability refers to some observation that has been carried out that in a series Y event X occurs in a certain proportion of cases. Thus surveys of weather, life expectancy, reliability of machinery, blood groups, will all produce figures which may then be referred to as the probability that X will occur under conditions Y.

Subjective probability refers to a judgement as to the chances of some event occurring based upon evidence. Unfortunately, Twining treats any judgement a person might make and might choose to express in terms of ‘probability’ as a ‘subjective probability’. This leads him to say that subjective probabilities ‘may or may not be Pascalian’.

[…]

This analysis of probability into different types invites the conclusion that ‘mathematical probability’ is just one type of probability, perhaps not appropriate to all circumstances… The adoption of any of the definitions of probability other than as a measure of strength of belief can lead to an unfortunate effect known as the Mind Projection Fallacy. This is the fallacy of regarding probability as a property of objects and processes in the real world rather than a measure of our own uncertainty. [my emphasis]

An instance of this fallacy is something called the Gambler’s fallacy. Indeed, in that post of mine I pretty much wrote what I emphasized in the quote above.

The authors then point out something pretty obvious: that flipping a coin is subject to the laws of physics. If we knew every single factor that went into each coin toss (e.g., the strength of the flip, the density of the air, the angle at which it was flipped, how long it spins in the air, the firmness of the surface it lands on, etc.) we would know which side of the coin would be facing up without any uncertainty.

However, we don’t know every factor that goes into a coin toss, or drawing cards from a deck, or marbles from a jar (including the social influences of the marble picker). So there is a practical wall of separation between epistemology and ontology; a wall between how we know what we know and the actual nature of what we’re observing.

The authors continue with three minimal requirements for rational analysis of competing explanations:

Desiderata:

1. If a conclusion can be reasoned out in more than one way, then every possible way should lead to the same results.

2. Equivalent states of knowledge and belief should be represented by equivalent plausibility statements. Closely approximate states should have closely approximate expressions; divergent states should have divergent expressions.

The only way consistently to achieve requirement 2 is by the use of real numbers to represent states of belief. It is an obvious requirement of rationality that if A is greater than B and B is greater than C then A must be greater than C. It will be found that any system which obeys this requirement will reduce to real numbers. Only real numbers can ensure some uniformity of meaning and some method of comparison.

3. All relevant information should be considered. None should be excluded for ideological reasons. If this requirement is not fulfilled then obviously different people could come to different conclusions if they exclude different facts from consideration.

Clearly the legal system does exclude evidence for ideological reasons. Rules about illegally obtained evidence and the various privileges constitute obvious examples. It is important therefore, that there should be some degree of consensus as to what information is to be excluded in order to prevent inconsistent results. It is also important that we are explicit about exclusions for ideological reasons and do not pretend to argue that better decisions will be made by excluding certain evidence. This pretence is one of the justifications for the hearsay rule, for example, and it is clear from these cases from a variety of jurisdictions that judges are increasingly impatient with this claim.

The next section I will try to sum up where possible:

Rules to Satisfy the Desiderata:

1. The statement ‘A and B are both true’ is equivalent to the statement ‘B and A are both true’.

2. It is certainly true that A is either true or false.

The statement ‘A and B are both true’ can be represented by the symbol ‘AB’. So proposition 1 becomes ‘AB = BA’.

This is the basic rule for conjunction in propositional logic. P ^ Q is equivalent to Q ^ P.

How do we assess the plausibility of the statement AB given certain information I, symbolically P(AB | I)?

First consider the plausibility of A given I, P(A | I), then the plausibility of B given I and that A is true, P(B | A, I)… Thus in order to determine P(AB | I) the only plausibilities that need to be considered are P(A | I) and P(B | A, I). Since P(BA | I) = P(AB | I) (above)… [c]learly, P(AB | I) is a function of P(A | I) and P(B | A, I) and it can be shown that the two terms are simply to be multiplied. This is called the ‘product rule’.

And because of the product rule, and because of requirement 2 above, the numbers we should assign to our certainties of “absolutely true” and “absolutely false” are 1 and zero, respectively.

Next, since we know that absolute certainty is 1, the statement P(A or ~A) — that is, the probability that A is either true or false — should be 1. And from that it follows that P(A) + P(~A) = 1: however much P(A) increases, P(~A) must equal 1 minus P(A). This, the authors call the addition rule.

We may wish to assess how plausible it is that at least one of A or B is true…

P(A or B) = P(A)P(B | A) + P(A)P(~B | A) + P(~A)P(B | ~A)

Now, the first two terms on the right hand side can be expressed as:

P(A)P(B | A) + P(A)P(~B | A) = P(A)P(B or ~B | A) = P(A, B or ~B) = P(A)

And the third term, P(~A)P(B | ~A), can be rewritten as P(B, ~A) by the product rule.

Hence P(A or B) = P(A) + P(B, ~A).

This means that if we are interested in a proposition, C, which will be true if either (or both) A or B is true we can assess the probability of C from those of A and B. Thus, if the defendant is liable if either (or both) of two propositions were true then the probability that the defendant is liable is equal to the union of the probabilities of the two propositions. Courts appear to find this rule troublesome. The Supreme Court of Canada applied it correctly in Thatcher v The Queen but in New Zealand the Court of Appeal failed to apply it in R v Chingell and the High Court failed to apply it in Stratford v MOT.

3. If P(A | I)P(B | A, I) = P(B | I)P(A | B, I) (the product rule) then if we divide both sides of the equation by P(B | I) we get

P(B | I)P(A | B, I) / P(B | I) = P(A | I)P(B | A, I) / P(B | I)

The two P(B | I)’s on the left hand side cancel out and we have

P(A | B, I) = P(A | I)P(B | A, I) / P(B | I)

This is Bayes’ Theorem.

Cue Final Fantasy fanfare!
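The product rule, the disjunction rule, and the theorem itself can all be sanity-checked against a small joint distribution. A sketch; the four joint probabilities below are arbitrary illustration numbers that just need to sum to 1:

```python
# Hypothetical joint probabilities for propositions A and B (given background I)
p_joint = {
    ("A", "B"): 0.12,
    ("A", "~B"): 0.28,
    ("~A", "B"): 0.18,
    ("~A", "~B"): 0.42,
}

p_a = p_joint[("A", "B")] + p_joint[("A", "~B")]  # marginal P(A | I) = 0.40
p_b = p_joint[("A", "B")] + p_joint[("~A", "B")]  # marginal P(B | I) = 0.30
p_b_given_a = p_joint[("A", "B")] / p_a           # P(B | A, I)
p_a_given_b = p_joint[("A", "B")] / p_b           # P(A | B, I)

# Product rule: P(AB | I) = P(A | I) P(B | A, I)
assert abs(p_joint[("A", "B")] - p_a * p_b_given_a) < 1e-12

# Disjunction rule from the article: P(A or B) = P(A) + P(B, ~A)
p_a_or_b = 1 - p_joint[("~A", "~B")]
assert abs(p_a_or_b - (p_a + p_joint[("~A", "B")])) < 1e-12

# Bayes' Theorem: P(A | B, I) = P(A | I) P(B | A, I) / P(B | I)
assert abs(p_a_given_b - p_a * p_b_given_a / p_b) < 1e-12
print("product rule, disjunction rule, and Bayes' Theorem all check out")
```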

From here, the authors begin going over objections to probability and its utility in the law; objections that are borne of misconceptions about probability and its utility outside the law. Most of these objections, in fact, are due to a frequentist view of probability: thinking of probability as a fundamental aspect of the object or event we’re looking at instead of a description of our uncertainty. As a matter of fact, that view should be put to rest by the authors’ demonstration of deriving Bayes Theorem using only logic. At no point did they use frequencies or any appeal to the nature of an object.

I did read one response to this article in the same publication in JSTOR, but it amounted to basically “This would be really hard to do” and not “this is invalid and/or it doesn’t follow from the rules of logic”.

 

Posted by on June 16, 2015 in Bayes

 

The Rules Of Logic Are Just Probability Theory Without Uncertainty


So it looks like for the summer I won’t be having any grad courses. Which means I can go back to blogging a bit and commenting on the multitude of things I find dealing with religion and/or rationality that I come across on the web. Maybe even finish reading some books I’ve bought and blogging about them too!

One thing I read on Quora is an intersection of religion and rationality: Using Bayes Theorem in history. Unfortunately this won’t be a post praising the argument; rather, it’ll be one explaining the author’s fail at rationality:

To begin with, it’s illustrative to note who uses Bayes Theorem to analyse history and who does not. In the first category we have William Lane Craig, the conservative Christian apologist, who uses Bayes Theorem to “prove” that Jesus actually did rise from the dead. And we also have Richard Carrier, the anti-Christian activist, who uses Bayes Theorem to “prove” that Jesus didn’t exist at all. Right away, a curious observer would find themselves wondering how, if this Theorem is the wonderful instrument of historical objectivity both Craig and Carrier claim it to be, two people can apply it and come to two completely contradictory historical conclusions. After all, if Jesus didn’t exist, he didn’t do anything at all, let alone something as remarkable as rise from the dead. So both Carrier and Craig can’t both be right. Yet they both use Bayes Theorem to “prove” historical things. Something does not make sense here.

Yes something doesn’t make sense here, and one can tell what that is by inference from the title of this current blog post.

As I wrote above, logic is just probability without the attendant uncertainty. Which should sorta be uncontroversial since logic and math are highly interconnected, just like math and probability are interconnected. I’m also not the first to point this out; I first read this connection in Jaynes.

But let me offer a couple of demonstrations. How about the basic syllogism with a conjunction as the major premise:


1. P ^ Q (true)
2. P (true)
Therefore Q

If I give a probability value to the major and minor premise, we can find out what conclusion follows:


1. P ^ Q (100%)
2. P (100%)
Therefore Q (100%)

This follows both logically and mathematically / probabilistically. If P(P ∧ Q) is 1, and P(P) is 1, then P(Q) must also be 1. So the answer is the same for both the formal logic formulation and the probabilistic formulation. Another example, using the same format:


1. ~(P ^ Q)
2. P (true)
Therefore ~Q

So if you can’t understand the fancy symbols, this reads that if you have a conjunction P and Q that is false, and you also know that P is true, then it follows necessarily that Q is false. The same conclusion will follow if we substitute probabilities:


1. P ^ Q (0%)
2. P (100%)
Therefore Q (0%)

This reads: if the probability of P and Q is 0%, and we know that P is 100%, then it must mean that Q is 0%. It’s a straightforward algebraic solve-for-x deal. The conjunctive major premise of this case can be converted into a disjunction using DeMorgan’s law:


1. ~P v ~Q (true)
2. P (true)
3. Therefore ~Q

Does using probability yield the same conclusion?


1. ~P v ~Q (100%)
2. P (100%)
3. Therefore ~Q (100%)

Since this is a disjunction, we are no longer using multiplication to find the answer.

The point with this is that the underlying mechanisms are the same: conjunctions in propositional logic have the same “mechanism” for finding conclusions that math/probability do. The main difference between logic and probability is that logic is binary (yes/no) whereas probability is comparative. If we know that A is greater than B, and B is greater than C, then A must be greater than C. The shortcut for those sorts of comparisons is using numbers. And more relevantly, if history is about comparing explanations — which is a measure of uncertainty — the only clear way to do so is by using numbers: Probability.
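The syllogisms above can be sketched as probability arithmetic. The product rule, P(P ∧ Q) = P(P) · P(Q | P), is the only machinery needed; solving it for the unknown term reproduces each conclusion (and since P is certain in both syllogisms, P(Q | P) just is P(Q)):

```python
def q_given_p(p_conjunction, p_p):
    """Solve the product rule P(P and Q) = P(P) * P(Q | P) for P(Q | P)."""
    return p_conjunction / p_p

# 1. P ^ Q (100%)  2. P (100%)  =>  Q (100%)
assert q_given_p(1.0, 1.0) == 1.0

# 1. ~(P ^ Q), i.e. P ^ Q (0%)  2. P (100%)  =>  Q (0%)
# (By DeMorgan's law this is the same inference as ~P v ~Q (100%), P (100%) => ~Q.)
assert q_given_p(0.0, 1.0) == 0.0

print("both syllogisms reproduced by the product rule")
```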

So let’s substitute “Bayes theorem” with “propositional logic” in the original quote and see if this still makes sense:

To begin with, it’s illustrative to note who uses [modus tollens] to analyse history and who does not. In the first category we have William Lane Craig, the conservative Christian apologist, who uses [modus tollens] to “prove” that Jesus actually did rise from the dead. And we also have Richard Carrier, the anti-Christian activist, who uses [modus tollens] to “prove” that Jesus didn’t exist at all. Right away, a curious observer would find themselves wondering how, if this [propositional logic] is the wonderful instrument of historical objectivity both Craig and Carrier claim it to be, two people can apply it and come to two completely contradictory historical conclusions. After all, if Jesus didn’t exist, he didn’t do anything at all, let alone something as remarkable as rise from the dead. So both Carrier and Craig can’t both be right. Yet they both use [modus tollens] to “prove” historical things. Something does not make sense here.

And there we have it. It is indeed true that both Carrier and Craig have attempted to use propositional logic to defend their cases. This must mean that historians need to do away with using formal rules of logical inference because they can lead to different, contradictory conclusions. Clearly, this now means that the whole gamut of logical fallacies is now in play to argue anything one wants in historical analysis!

This reminds me of how Creationists and other anti-science types think that the scientific enterprise is wholly corrupt because sometimes the scientific method produces two contradictory studies.

But yes. Both probability and logic (and science) follow the GIGO rule: Garbage in, garbage out. We can’t argue against a tool just because it follows GIGO.

 

Posted by on June 5, 2015 in Bayes

 

What are the odds that Jesus rose or Moses parted the waves? Even with the best witnesses, vanishingly small

I claim no great originality for my argument. I’m borrowing from the great Scottish philosopher David Hume, particularly Section 10 of his magnificent Enquiry Concerning Human Understanding (1748). If there is any novelty in my presentation, it owes to the marriage of Hume’s ideas with a famous theorem in probability theory proposed by the Reverend Thomas Bayes in ‘An Essay towards solving a Problem in the Doctrine of Chances’ (1763). The technical details, fortunately, can be put to the side for our purposes.

Read more at Aeon.

 

Posted by on December 5, 2014 in Bayes, rationality

 

“There’s No Evidence For The Existence of God”


I used to think that the title-quote of this blog post was a good rejoinder when people asked me why I didn’t believe in any sort of god. Nowadays, I sort of grimace a little when I hear atheists use that phrase. Because now I consider myself a Bayesian. And for Bayesians, “no evidence” means something a lot different than how other people use “no evidence”.

As a Bayesian, if I say there is evidence for some hypothesis, then this means that P(H | E) > P(H). If I say there is evidence against some hypothesis, then this means that P(H | E) < P(H). Most importantly, as a Bayesian, I don't just update once; I update on multiple pieces of evidence to arrive at a provisional posterior probability about some claim. And it’s provisional because there’s always new evidence to discover. In this sense, and in my opinion, agnosticism is probably the closest mainstream or Traditional Rationality analog to being a Bayesian.

But what could it mean if I say there is no evidence for some claim? And does this apply to the concept of god?

Let’s compare two conditional probabilities: The probability of having some datum given that god exists and the probability of having some datum given the nonexistence of god. P(D | G) and P(D | ~G). So, assuming god exists, what would the most basic evidence be, and would this be more or less likely given the nonexistence of god?

Some axioms of probability to remind you of: P(E | H) + P(~E | H) = 100%. That is, the probability of the evidence given that the hypothesis is true, plus the probability of the absence of that evidence (or the presence of some other evidence) given that the hypothesis is true, must exhaust all possibilities — meaning they add up to 100%. Combined with the symmetry of a fair die (each face equally likely), this is how you know that you have a 1/6 chance of rolling a 4: P(Roll 4 | Fair Die) + P(Roll Other Number | Fair Die) = 100%.

Given that, most simplistically, stories about the existence of god are more likely than no stories about god given that god exists. Meaning that P(D | G) > P(~D | G). And the opposite for the alternative: given that no god exists, stories about god are less likely than no stories about god. Meaning, also, that P(D | ~G) < P(~D | ~G). To say this another way: if god did exist, we would expect more stories about him than if god didn’t exist: P(D | G) > P(D | ~G). Think about it: of all the things that don’t exist, the overwhelming majority have no stories told about them. Sure, there are stories of unicorns, and unicorns don’t exist. But what about the trillions of nonexistent things that we concurrently have no stories about? They are legion.

Basically, anecdotes about the existence of god are evidence that god exists. I go over this in the post Logical Fallacies as Weak Bayesian Evidence: Argument from Anecdote. This all might seem a bit counterintuitive, but relying on intuition to make decisions is just another way of saying that the decision conforms to your biases. Which is usually not a good thing.

So what does no evidence look like? To me, this would be some conditional probability that is equal under all alternatives: one where the Bayes factor is 1. In other words, the evidence exists independently of the hypothesis.
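A Bayes factor of 1 leaving the prior exactly where it was can be sketched in a few lines; the 30% prior and the factors here are arbitrary illustration numbers:

```python
def update_by_bayes_factor(prior, bayes_factor):
    """Convert the prior to odds, multiply by the Bayes factor, convert back."""
    odds = prior / (1 - prior) * bayes_factor
    return odds / (1 + odds)

prior = 0.3
print(update_by_bayes_factor(prior, 1.0))  # Bayes factor 1: "no evidence", prior unchanged
print(update_by_bayes_factor(prior, 5.0))  # > prior: evidence for the hypothesis
print(update_by_bayes_factor(prior, 0.2))  # < prior: evidence against it
```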

This all being said, I think there is evidence for the existence of god. I actually concede a little bit of relatively strong evidence for the existence of god. But there is so much more evidence against the existence of god, because god, as defined by laypeople and sophisticated theologians alike, is unfalsifiable. For most data besides morality, god is the equivalent of a trillion^trillion-sided die: expecting to roll a 3 with it, and comparing that to the probability of rolling a 3 given a normal die. This is what happens when one conceives of an all-powerful god; there’s nothing an all-powerful god can’t explain.

So yes, there is evidence for the existence of god. But it is underwhelming in comparison to the orders upon orders of magnitude of the evidence — Bayesian evidence — against the existence of god.

 

Grad School


So I’m starting grad school for computer science in about a month. This is on top of having a normal 9 – 5 (well, 8:30 – 6) job. Meaning that in a little while I’ll probably have less time for blogging; at least, blogging anything with more than some passing thoughts and/or cool articles I find about religion.

Since I’m continuing my compsci schooling towards an M.S. I thought I’d try brushing up on my programming besides the meager tasks that I do for work (right now I’m more of a “software engineer”, meaning I mainly concentrate on the process aspect of software development with some coding on the side if required) so I’m writing a Java app that is — you guessed it — computing Bayes Theorem! I’m going to add it as an executable to my static website where I’m going to be doing some other web dev for a page dedicated to how probability theory is the logic of science. The page isn’t up yet, but it’ll get there eventually.

It was actually really simple to write the backend code for BT, but one neat little thing I discovered while coding for it, ironing out all of the nooks and crannies of BT, was combining likelihood ratios/Bayes factors. Here it is, better described over at Overcoming Bias:

You think A is 80% likely; my initial impression is that it’s 60% likely. After you and I talk, maybe we both should think 70%. “Average your starting beliefs”, or perhaps “do a weighted average, weighted by expertise” is a common heuristic.

But sometimes, not only is the best combination not the average, it’s more extreme than either original belief.

Let’s say Jane and James are trying to determine whether a particular coin is fair. They both think there’s an 80% chance the coin is fair. They also know that if the coin is unfair, it is the sort that comes up heads 75% of the time.

Jane flips the coin five times, performs a perfect Bayesian update, and concludes there’s a 65% chance the coin is unfair. James flips the coin five times, performs a perfect Bayesian update, and concludes there’s a 39% chance the coin is unfair. The averaging heuristic would suggest that the correct answer is between 65% and 39%. But a perfect Bayesian, hearing both Jane’s and James’s estimates – knowing their priors, and deducing what evidence they must have seen – would infer that the coin was 83% likely to be unfair.
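As a sanity check, these three updates can be sketched in a few lines of Python (the 20% prior and the 75%-heads unfair coin come straight from the example above; the function name is mine):

```python
def posterior_unfair(prior_unfair, heads, tails):
    """Posterior probability that the coin is unfair, via Bayes' theorem."""
    p_data_unfair = 0.75 ** heads * 0.25 ** tails  # unfair coin: 75% heads
    p_data_fair = 0.5 ** (heads + tails)           # fair coin: 50% heads
    numer = prior_unfair * p_data_unfair
    return numer / (numer + (1 - prior_unfair) * p_data_fair)

print(round(posterior_unfair(0.2, 4, 1), 2))  # James's flips: 0.39
print(round(posterior_unfair(0.2, 5, 0), 2))  # Jane's flips: 0.65
print(round(posterior_unfair(0.2, 9, 1), 2))  # pooled data: 0.83
```

Running the update on the pooled nine-heads-one-tail data reproduces the 83% figure, more extreme than either individual posterior.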

That is because a perfect Bayesian would be combining their data, not simply taking an average of their posteriors. Which makes more sense if you think about it. If one group of people concluded that the world was round and another group of people thought the world was flat, it wouldn’t make sense to take an average of the two conclusions and say that the world must be shaped like a calzone. You would want the data that they used to arrive at their conclusions and update on that. Taking an average of the two is a social solution — meant to save people’s egos — not one that’s actually attempting to get at a more accurate model of the world.

It seems like combining likelihood ratios is actually pretty straightforward. Think about the conjunction fallacy: the probability of two independent events both occurring isn’t X% + Y%, or the average of X% and Y%, but X% * Y%. Combining likelihood ratios from independent observations follows the same logic: you multiply them.

Again, from OB:

James, to end up with a 39% posterior on the coin being heads-weighted, must have seen four heads and one tail:

P(four heads and one tail | heads-weighted) = (0.75^4 * 0.25^1) = 0.079. P(four heads and one tail | fair) = 0.031. P(heads-weighted | four heads and one tail) = (0.2 * 0.079)/(0.2 * 0.079 + 0.8 * 0.031) = 0.39, which is the posterior belief James reports.

Jane must similarly have seen five heads and zero tails.

Plugging the total nine heads and one tail into Bayes’ theorem:

P(heads-weighted | nine heads and a tail) = ( 0.2 * (0.75^9 * 0.25^1) ) / ( 0.2 * (0.75^9 * 0.25^1) + 0.8 * (0.5^9 * 0.5^1) ) = 0.83, giving us a posterior belief of 83% that the coin is heads-weighted.

So what I call the success rate — P(E | H) — is represented here as P(four heads and one tail | heads-weighted). P(E | ~H), the likelihood under the alternative hypothesis, is P(four heads and one tail | fair). James’ likelihood ratio is P(E | H) / P(E | ~H) = 0.0791 / 0.03125 = 2.531. Jane’s numbers are P(E | H) / P(E | ~H) = 0.2373 / 0.03125 = 7.594. The combined likelihood ratio is 2.531 * 7.594 = 19.22 — the two individual likelihood ratios multiplied together — which is exactly how much evidence is needed to move the prior from 20% to 83%.
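The same arithmetic is even cleaner in the odds form of Bayes’ theorem: posterior odds = prior odds * combined likelihood ratio. A minimal Python sketch, using the numbers from the coin example (prior odds of 1:4 that the coin is unfair, from the 20% prior):

```python
def likelihood_ratio(heads, tails):
    """Bayes factor favoring 'heads-weighted' over 'fair' for the observed flips."""
    p_weighted = 0.75 ** heads * 0.25 ** tails
    p_fair = 0.5 ** (heads + tails)
    return p_weighted / p_fair

lr_james = likelihood_ratio(4, 1)   # ~2.53
lr_jane = likelihood_ratio(5, 0)    # ~7.59
combined_lr = lr_james * lr_jane    # ~19.22, the product of the two

prior_odds = 0.2 / 0.8              # 1:4 odds that the coin is unfair
posterior_odds = prior_odds * combined_lr
print(round(posterior_odds / (1 + posterior_odds), 2))  # 0.83
```

Multiplying the two Bayes factors and converting the posterior odds back to a probability recovers the 83% that the perfect Bayesian would infer from the pooled data.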

Something like this is very handy when two people have disparate priors. Two people can start with different priors, but as long as they’re updating on the same evidence, their posteriors will eventually converge. Combining likelihood ratios ensures that both parties are updating on the same evidence, since the likelihood ratio is what determines how much your prior moves.
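A quick Python sketch of that convergence claim (the 0.05 and 0.60 starting priors, and the repeated evidence batches with LR ~19.22, are illustrative numbers of my own, not from the example above):

```python
def update(prior, lr):
    """One odds-form Bayesian update: posterior odds = prior odds * LR."""
    odds = prior / (1 - prior) * lr
    return odds / (1 + odds)

p_a, p_b = 0.05, 0.60  # two observers with very different priors
for _ in range(5):     # both update on the same five batches of evidence
    p_a, p_b = update(p_a, 19.22), update(p_b, 19.22)

# After the same evidence stream, both are nearly certain the coin is unfair.
print(round(p_a, 4), round(p_b, 4))
```

Despite starting twelve-fold apart, after five shared batches of evidence the two posteriors differ by a tiny fraction of a percent: the data swamps the priors.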

 

Posted by on April 30, 2014 in Bayes

 
 