So I have an entire web page dedicated to describing logical fallacies that I encounter(ed) a lot when arguing with people on the web. In the course of learning Bayes’ Theorem, and knowing that valid math should be applicable to valid logic, I’ve been thinking of ways to apply Bayes’ to logical arguments. Due to Bayes’, we already know why extraordinary claims require extraordinary evidence, why the Prosecutor’s Fallacy is a fallacy, why something that equally explains everything in reality explains nothing, and that — contrary to its oft-repeated negation — absence of evidence *could be* evidence of absence.

Just to go over these implications and fallacies of Bayesian reasoning quickly, here are their examples.

__Extraordinary Claims Require Extraordinary Evidence__

What does it mean when we say that some claim is extraordinary? It means that it is out of the ordinary, or not very mundane. In other words, an extraordinary claim is any claim that has a low probability of being true. So if the existence of god is ordinary, it would be a high probability claim. If the existence of god is extraordinary, then it would be a low probability claim.

Let’s take an extraordinary claim, like aliens abducted me and that’s the reason why I didn’t go to work yesterday. We would need some extraordinary — that is *low probability* — evidence to corroborate it. Simply missing a day of work is not *improbable*. Walking into work the next day with some super awesome alien technology is very improbable; it doesn’t happen every day.

So our Bayes’ formula is this: P(H | E) = P(E | H) * P(H) / P(E). P(H) is the prior probability of the extraordinary (i.e. low probability) claim. P(E) is the overall probability of seeing the evidence, whether or not the claim is true. If P(H) is close to or equal to P(E), then they will cancel each other out and the remainder will be P(E | H). If P(E) is much larger than P(H), then they will *not* cancel each other out and the Bayesian update will stay close to P(H)’s original improbability.

Let’s say the probability of getting abducted by aliens is .1%, and the probability of missing work *given* that I was abducted by aliens is 99%. What is the probability of missing work *period*? Or, how many people missed work yesterday? This is a much, much higher number than .1%. Maybe 10% of the entire population decided not to go to work yesterday for various other mundane reasons. This, in turn, only updates our prior from .1% to .99%.

On the other hand, let’s say that there is a .1% probability of getting a super awesome alien weapon. Given that I was abducted by aliens, there would also be a high probability of getting the weapon just like there would be a high probability of me missing work. This in turn updates the prior from .1% to around 99%. The .1% in the numerator is canceled out by the .1% in the denominator and we are left with the conditional probability being equal to the posterior probability.
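The two updates above can be checked with a few lines of Python. This is only a sketch using the numbers from the text; the function name `posterior` is mine, and the 99% chance of getting the weapon given an abduction is an assumption standing in for the "high probability" mentioned above.

```python
def posterior(p_e_given_h, p_h, p_e):
    """Bayes' Theorem: P(H | E) = P(E | H) * P(H) / P(E)."""
    return p_e_given_h * p_h / p_e


P_ABDUCTION = 0.001  # prior from the text: .1%

# Mundane evidence: about 10% of people miss work on any given day.
missed_work = posterior(p_e_given_h=0.99, p_h=P_ABDUCTION, p_e=0.10)

# Extraordinary evidence: alien weapons are as rare as abductions (.1%).
# The 0.99 conditional is an assumed value for "high probability".
alien_weapon = posterior(p_e_given_h=0.99, p_h=P_ABDUCTION, p_e=0.001)

print(f"P(abduction | missed work)  = {missed_work:.4f}")
print(f"P(abduction | alien weapon) = {alien_weapon:.4f}")
```

The only thing that changes between the two calls is P(E), which is the whole point: rare evidence moves the prior a lot, common evidence barely moves it.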

And that’s why extraordinary claims require extraordinary evidence.

__Prosecutor’s Fallacy__

Ok, let’s say that we have our extraordinary evidence. Does this mean that we have free rein to posit any extraordinary claim? If the claim is *too* extraordinary then this renders it null and void; the newly posited claim can’t be too out of left field or it will make the evidence *relatively* high probability.

Let’s say that you were driving along one day and started getting lost in your thoughts. You get so distracted by your thoughts that you wander off the road and get into a minor accident. It just so happens that a mile down the road there is a huge pileup, one that you would have been involved in had you not wandered off the road. “Aliens used space technology to invade my thoughts to prevent me from getting in the pileup!” you say.

Avoiding a pileup in this fashion is highly unlikely. So given that aliens *did* invade your thoughts, it would make sense of the evidence. Yet how much *more* unlikely is it that aliens would interfere in your life in this way? It’s got to be a few orders of magnitude more improbable than avoiding a pileup like this. Focusing on the conditional (i.e. *given* that X is true accounts for Y) while ignoring the low prior probability is a Prosecutor’s Fallacy (or Base Rate Fallacy).

Looking at the simple formula for Bayes’, P(H | E) = P(E | H) * P(H) / P(E), you can see what happens if P(H) is zero or near zero: the numerator shrinks along with it, so the posterior stays small unless P(E) is just as tiny. So it doesn’t matter how high P(E | H) is, and ignoring the prior in this case would be a Prosecutor’s Fallacy.
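Here is a rough sketch of the pileup example in Python. None of these numbers come from the story; they are made-up values chosen only to show the shape of the problem. P(E) is expanded with the law of total probability so that the inputs stay consistent with each other.

```python
def posterior_from_likelihoods(p_h, p_e_given_h, p_e_given_not_h):
    """P(H | E), with P(E) = P(E | H) P(H) + P(E | ~H) P(~H)."""
    p_e = p_e_given_h * p_h + p_e_given_not_h * (1 - p_h)
    return p_e_given_h * p_h / p_e


# Hypothetical numbers, for illustration only:
#   prior that aliens steer your thoughts:            one in a billion
#   P(avoiding the pileup this way | aliens did it):  0.99
#   P(avoiding the pileup this way | no aliens):      one in ten thousand
p_aliens = posterior_from_likelihoods(
    p_h=1e-9, p_e_given_h=0.99, p_e_given_not_h=1e-4
)

print(f"P(aliens | avoided pileup) = {p_aliens:.2e}")
```

Even with a conditional of 0.99, the posterior stays tiny, because the coincidence is still far more probable than the aliens.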

__Something That Equally Explains Everything Explains Nothing__

Let’s say you get diagnosed with some extremely deadly cancer. You go to the doctor and he says “Out of the 1 million people diagnosed with this cancer in the past 50 years, only one has lived past 6 months after diagnosis at this stage.” You look on the Internet and find out that there are only 10 people on the entire planet right now with your cancer.

One year later you are still alive, and the other 9 people are dead. “Praise the aliens!” you say. “They kept me alive even though those other people died. They are so benevolent, and my survival is evidence of their beneficence!” You go to a friend of yours with your revelation, and she asks “But what about the other 9 people who died?” You reply “That is also evidence of the aliens’ benevolence, because they ended their suffering!”

Barring the arrogance displayed, is there something wrong with this reasoning? Besides the previous fallacies? Sure, *given* that the aliens are good beings you would survive one year with the cancer. This would be P(Surviving Cancer | Good Aliens). But what about the complement, P(Not Surviving Cancer | Good Aliens)? If one asserts a high probability of surviving cancer given the goodness of the aliens, and in turn then asserts a high probability of *not* surviving the cancer (i.e. 9 out of 10 people died in this scenario) given the goodness of the aliens, this leads to a contradiction in probability terms. Because P(Surviving Cancer | Good Aliens) + P(Not Surviving Cancer | Good Aliens) = 1.00. If P(Surviving Cancer | Good Aliens) is .99, this necessitates that P(Not Surviving Cancer | Good Aliens) is 1.00 – .99, or .01.

If the probability of both conditionals is equal, then neither can be more than .5. And this is only with binary evidence. If there are multiple possible *exclusive* outcomes that all evidence the goodness of the aliens, then this further diminishes the conditional probability if you assert that they are all equally likely. If there are 1,000 mutually exclusive instances (i.e. “exclusive” meaning examples like you can’t both live and die at the same time, be married and a bachelor at the same time, be in NYC, Tokyo, Afghanistan, and London all at the same time, etc.) that all have the same conditional probability, the conditional probability can be at most 1 / 1000.
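The constraint is simple enough to write down as a check. This is a minimal sketch; `is_coherent` and `max_equal_likelihood` are names I made up for illustration.

```python
def is_coherent(likelihoods, tol=1e-9):
    """Conditional probabilities of mutually exclusive outcomes, all given
    the same hypothesis, must each lie in [0, 1] and sum to at most 1."""
    in_range = all(0.0 <= p <= 1.0 for p in likelihoods)
    return in_range and sum(likelihoods) <= 1.0 + tol


def max_equal_likelihood(n_outcomes):
    """Largest probability each of n exclusive outcomes can get
    if they are all claimed to be equally likely."""
    return 1.0 / n_outcomes


# "I survived" and "they died" both claimed as .99 likely given good aliens:
print(is_coherent([0.99, 0.99]))   # the two claims cannot both hold
print(is_coherent([0.99, 0.01]))   # the complement rule is respected
print(max_equal_likelihood(2))     # binary evidence
print(max_equal_likelihood(1000))  # 1,000 exclusive outcomes
```

The more outcomes a hypothesis is stretched to cover equally, the less it says about any one of them.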

__Absence of Evidence is Evidence of Absence__

By the standard definition, “evidence” is any event or fact that supports some hypothesis. In Bayesian terms, this means any fact or event that increases the prior probability of some hypothesis, whether weak or strong. So if P(H | E) > P(H), this should mean that P(H | ~E) < P(H). So with the alien abduction example (using round numbers of .01 for both the prior and the probability of having the weapon), if I had some super awesome alien weapon this would corroborate my alien abduction:

P(Alien Abduction | Alien Weapon) = P(Alien Weapon | Alien Abduction) * P(Alien Abduction) / P(Alien Weapon)

= .99 * .01 / .01

= .99

If I didn’t have some super awesome alien weapon, or absence of evidence, it would look like this:

P(Alien Abduction | No Alien Weapon) = P(No Alien Weapon | Alien Abduction) * P(Alien Abduction) / P(No Alien Weapon)

= .01 * .01 / .99

= .0001 / .99

= .000101

Remember the complements: P(E) + P(~E) = 1.00. P(E | H) + P(~E | H) = 1.00.

Or, P(Alien Weapon) + P(No Alien Weapon) = 1.00.

P(Alien Weapon | Alien Abduction) + P(No Alien Weapon | Alien Abduction) = 1.00.

So if P(Alien Weapon | Alien Abduction) is .99, this means that P(No Alien Weapon | Alien Abduction) = 1.00 – .99. This then creates our Bayes’ Theorem for absence of evidence:

BAYES’ THEOREM FOR ABSENCE OF EVIDENCE:

P(H | ~E) = P(~E | H) * P(H) / P(~E).

So if you want to prove that absence of evidence either does or does not mean evidence of absence, just use that formula. Absence of evidence only means evidence of absence if the presence of evidence is evidence of presence.

In this alien abduction example, the prior moved *down* from .01 to .000101 due to the absence of evidence.
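For anyone who wants to check the arithmetic, here is the same calculation in Python, using the numbers from this section (.01 for the prior and for the weapon, .99 for the conditional). The variable names are mine. The last line is a sanity check I added: the two posteriors, weighted by how often each kind of evidence shows up, should average back to the prior.

```python
p_h = 0.01          # P(Alien Abduction)
p_e = 0.01          # P(Alien Weapon)
p_e_given_h = 0.99  # P(Alien Weapon | Alien Abduction)

# Complements
p_not_e = 1 - p_e                  # P(No Alien Weapon)
p_not_e_given_h = 1 - p_e_given_h  # P(No Alien Weapon | Alien Abduction)

# Presence of evidence
p_h_given_e = p_e_given_h * p_h / p_e

# Absence of evidence: P(H | ~E) = P(~E | H) * P(H) / P(~E)
p_h_given_not_e = p_not_e_given_h * p_h / p_not_e

print(f"P(H | E)  = {p_h_given_e:.6f}")
print(f"P(H | ~E) = {p_h_given_not_e:.6f}")

# Sanity check: the weighted posteriors recover the prior.
recovered_prior = p_h_given_e * p_e + p_h_given_not_e * p_not_e
print(f"recovered prior = {recovered_prior:.6f}")
```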

__Anyway…__

But it looks like I’m already late to the game. People have already attempted to apply Bayes’ to logical arguments and fallacies: Fallacies as Weak Bayesian Evidence. Look at the Argument from Ignorance:

1. Prior beliefs influence whether or not the argument is accepted.

A) I’ve often drunk alcohol, and never gotten drunk. Therefore alcohol doesn’t cause intoxication.

B) I’ve often taken Acme Flu Medicine, and never gotten any side effects. Therefore Acme Flu Medicine doesn’t cause any side effects.

Both of these are examples of the argument from ignorance, and both seem fallacious. But B seems much more compelling than A, since we *know* that alcohol causes intoxication, while we also know that not all kinds of medicine have side effects.

2. The more evidence found that is compatible with the conclusions of these arguments, the more acceptable they seem to be.

C) Acme Flu Medicine is not toxic because no toxic effects were observed in 50 tests.

D) Acme Flu Medicine is not toxic because no toxic effects were observed in 1 test.

C seems more compelling than D.

3. Negative arguments are acceptable, but they are generally less acceptable than positive arguments.

E) Acme Flu Medicine is toxic because a toxic effect was observed (positive argument)

F) Acme Flu Medicine is not toxic because no toxic effect was observed (negative argument, the argument from ignorance)

Argument E seems more convincing than argument F, but F is somewhat convincing as well.

The Argument from Ignorance is just another version of the Absence of Evidence argument. Absence might be *strong or weak* evidence, but it’s still *evidence* if it moves the prior in any direction. This is different from the “*I don’t know how things work, therefore aliens*” argument, which should be renamed to the Bill O’Reilly Fallacy.

So the Argument from Ignorance is weak evidence, since we aren’t taking into account *both* the success rate and the false positive rate. It’s incomplete Bayesian reasoning, but it’s not inherently fallacious. It *might* be fallacious or it might *not* be fallacious.
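To see why 50 clean tests are more compelling than 1, here is a small sketch that puts numbers on arguments C and D. All of the numbers are assumptions I picked for illustration: a 50/50 prior, a non-toxic medicine that always tests clean, a toxic one that still slips through a single test 90% of the time, and tests that are independent of each other.

```python
def p_not_toxic_after_clean_tests(n_tests,
                                  prior_not_toxic=0.5,
                                  p_clean_if_toxic=0.9):
    """Posterior P(not toxic | n clean tests).

    Assumes a non-toxic medicine always tests clean and that the
    tests are independent, so a toxic medicine passes all n tests
    with probability p_clean_if_toxic ** n.
    """
    p_data_if_not_toxic = 1.0
    p_data_if_toxic = p_clean_if_toxic ** n_tests
    numerator = p_data_if_not_toxic * prior_not_toxic
    denominator = numerator + p_data_if_toxic * (1 - prior_not_toxic)
    return numerator / denominator


one_test = p_not_toxic_after_clean_tests(1)     # argument D
fifty_tests = p_not_toxic_after_clean_tests(50)  # argument C

print(f"after 1 clean test:   {one_test:.3f}")
print(f"after 50 clean tests: {fifty_tests:.3f}")
```

Under these assumptions a single clean test barely moves the 50% prior, while fifty of them push it past 99%. Each absence is weak evidence, but weak evidence adds up.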

How about circular reasoning?

A. God exists because the Bible says so, and the Bible is the word of God.

B. Electrons exist because we can see 3-cm tracks in a cloud chamber, and 3-cm tracks in cloud chambers are signatures of electrons.

[…]

The “circular” claim reverses the direction of the inference. We have sense data, which we would expect to see if the ambiguous interpretation was correct, and we would expect the interpretation to be correct if the hypothesis were true. Therefore it’s more likely that the hypothesis is true. Is this allowed? Yes! Take for example the inference “if there are dark clouds in the sky, then it will rain, in which case the grass will be wet”. The reverse inference, “the grass is wet, therefore it has rained, therefore there have been dark clouds in the sky” is valid. However, the inference “the grass is wet, therefore the sprinkler has been on, therefore there is a sprinkler near this grass” may also be a valid inference. The grass being wet is evidence for both the presence of dark clouds and for a sprinkler having been on. Which hypothesis do we judge to be more likely? That depends on our prior beliefs about the hypotheses, as well as the strengths of the causal links (e.g. “if there are dark clouds, how likely is it that it rains?”, and vice versa).

Thus, the “circular” arguments given above are actually valid Bayesian inferences. But there is a reason that we consider A to be a fallacy, while B sounds valid. Since the interpretation (the Bible is the word of God, 3-cm tracks are signatures of electrons) logically requires the hypothesis, the probability of the interpretation cannot be higher than the probability of the hypothesis. If we assign the existence of God a very low prior belief, then we must also assign a very low prior belief to the interpretation of the Bible as the word of God. In that case, seeing the Bible will not do much to elevate our belief in the claim that God exists, if there are more likely hypotheses to be found.
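The wet grass example from the quote can be sketched the same way. The priors and likelihoods below are invented for illustration, and I am simplifying by treating "rain", "sprinkler", and "neither" as mutually exclusive and exhaustive, which the quoted post does not claim.

```python
def normalize(priors, likelihoods):
    """Posterior over competing hypotheses: prior * likelihood, renormalized."""
    joint = {h: priors[h] * likelihoods[h] for h in priors}
    total = sum(joint.values())
    return {h: joint[h] / total for h in joint}


# Hypothetical numbers, for illustration only.
priors = {"rain": 0.30, "sprinkler": 0.10, "neither": 0.60}
p_wet_given = {"rain": 0.90, "sprinkler": 0.80, "neither": 0.01}

posteriors = normalize(priors, p_wet_given)

for hypothesis, p in posteriors.items():
    print(f"P({hypothesis} | wet grass) = {p:.3f}")
```

With these made-up numbers, wet grass raises the probability of both rain and the sprinkler, but rain stays ahead because it started with the higher prior. Shrink the prior on rain far enough and the sprinkler wins, which is the same reason argument A fares worse than argument B.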

So it looks like some fallacies might actually not be fallacies. It could be the amount of confidence we place in the logic that is the fallacious reasoning. So for absence of evidence being evidence of absence, it’s not necessarily a fallacy. It could be that the lack of evidence only diminishes the prior probability by .1%. The problem would come about by relying *only* on the absence to prove a point, when a difference of .1% isn’t much to write home about.

Since probability is a form of induction, it might be that all fallacies of induction could be expressed using Bayes’ theorem. Here is a longer version of the post.

Timothy

March 21, 2012 at 5:22 pm

"The more evidence found that is compatible with the conclusions of these arguments, the more acceptable they seem to be."

I don't know if you've talked about this before, but this sentence is how I understand the way we infer causal relations based on scientific experiments.

As you probably know, no experiment can ever prove something true. When you perform your experiment, you think "well, if my hypothesis is true, then I'll get the result I expect." We could write that as If H ("hypothesis is true"), then R ("we'll get this result"). This is a true statement, so we know:

1. H -> R

Then we run our experiment and we get the result we expected. Now we know that R is true.

2. R

However, this is not a valid argument:

1. H -> R

2. R

(therefore)

3. H

That's getting the logic backwards. No matter how many experiments you run, you can never prove that getting the result you expected was due to your hypothesis being correct. R could have been caused by something else!

But this is why we continue piling on the evidence. By replicating the experiment, we change the random confounding variables that could have caused R (such as time, place, who the experimenters were, who the participants were, etc.). We also replicate using slightly different versions of the initial experiment, which are all designed to test the same phenomenon, but from slightly different angles. We can label the Results we get in each of these experiments with numbers (R1, R2). The point is that our one hypothesis can explain all of them. So whereas before we started out with "If our hypothesis is true, we will get this result" (H -> R), now we can say "If our hypothesis is true, we will get all these results":

H -> R1

H -> R2

H -> R3

H -> R4

And so on. If R was caused by something other than H, you would expect some of these experiments to fail, since the chance variable that was responsible for our findings would have been missing in at least one of them.

When enough experiments based on a single hypothesis succeed, we decide that the probability is greater that that is because our hypothesis was right, rather than the result was due to chance every single time.

J. Quinton

March 23, 2012 at 1:11 am

This is one of the reasons why religionist thinking fails. There's not even an attempt at replication, just a jump straight to the conclusion after one "positive" hit. Not only that, but the disconfirming hits are ignored.