(Magic: A probability game)
So on the Facebook I had a conversation with someone regarding what constitutes a good explanation. The person rejoined with something like “real life isn’t a probability game” so I gave up the thread at that.
But why would someone think that the laws of probability don’t apply to everyday situations, or for making sense of the world? We intuitively use probability to make many of our decisions during the day. The decision to not walk down a dark alley in a bad neighborhood at 3am is one based on probability. There is neither a zero percent chance that you’ll get robbed nor a 100% chance that you will get robbed. The decision to fly in a plane, get in a car, check the weather, type on a keyboard… these are all decisions based on knowing the odds in your favor.
There are a few posts on Less Wrong about how probability follows from logic, and also a few books. But I think those are a bit too high level for the type of person who would think that probability doesn’t apply for making sense of the world. So something more intuitive and obvious is needed.
Let’s see if I can try this “probability follows from logic” thing with as few assumptions as possible.
Assumptions: A = A. A != ~A. Meaning that if I say the word “ball”, you know I mean “ball” and not “not-ball”. This is a fundamental assumption for normal human communication to be possible. You’ve assumed this assumption in order to comprehend this post!
A normal argument:
Premise 1: If A Then B
Premise 2: A
Conclusion: Therefore B
This is also pretty straightforward. The word to describe this inference if you want to run someone over with fancy Latin words is modus ponens.
Another type of inference:
P1: If A Then B
P2: Not B
C: Therefore Not A
Not as straightforwardly intuitive as modus ponens. This one is called modus tollens. It means that if we have a material conditional where the consequent is false, then it follows logically that the antecedent is also false.
These types of inferences color a lot of our explanatory language. If my computer is on, then I can type this blog post. My computer is on, so a reasonable conclusion — based on the premise that if my computer is on then I can type this post — is that I can type this post. It seems very redundant because we should know this sort of stuff already.
Now for the challenging part.
P1: A does not equal C
P2: If A Then B
P3: If C Then B
P4: B
In this case, if you write as the conclusion A or C, that is a logical fallacy called affirming the consequent. Due to the logic of truth tables, you can’t conclude the antecedent of a material conditional if only given the consequent as a premise.
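The truth-table claim can be checked by brute force. Here is a minimal sketch in Python; the helper names `implies` and `is_valid` are my own for illustration, not anything standard. It treats an argument form as valid when no assignment of true/false to A and B makes every premise true and the conclusion false.

```python
from itertools import product

def implies(p, q):
    # Material conditional: false only when p is true and q is false.
    return (not p) or q

def is_valid(premises, conclusion):
    # Valid means: in every row of the truth table where all premises hold,
    # the conclusion holds too.
    return all(
        conclusion(a, b)
        for a, b in product([True, False], repeat=2)
        if all(premise(a, b) for premise in premises)
    )

def if_a_then_b(a, b):
    return implies(a, b)

# Modus ponens: If A Then B; A; therefore B.
modus_ponens = is_valid([if_a_then_b, lambda a, b: a], lambda a, b: b)

# Modus tollens: If A Then B; Not B; therefore Not A.
modus_tollens = is_valid([if_a_then_b, lambda a, b: not b], lambda a, b: not a)

# Affirming the consequent: If A Then B; B; therefore A.
affirming_consequent = is_valid([if_a_then_b, lambda a, b: b], lambda a, b: a)

print(modus_ponens, modus_tollens, affirming_consequent)  # True True False
```

The row that breaks affirming the consequent is A false, B true: both premises hold there, but the conclusion A does not.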
What if you want to find out the true antecedent for B? You could just write the conclusion as
A or C but you also have to take into account that there might be other causes for B besides A and C; that would be your unknown ???. So it would follow logically that the conclusion could be A or C or ???. But that doesn’t really help, does it? The conclusion could always be written as ??? and we would be “correct”.
Or say you were doing a code review of a program that had two separate logic gates:
If (A == true) then B;
If (C == true) then B;
You run the program and B executes. Which condition in the code was satisfied to run B? Was it A or C? Or some other unknown state? Assuming the same argument as above (A is not equal to C) if it were me, I would look for other evidence that also followed from A or C being run. But we also don’t know what other processes or code could also produce B so those have to be included in the possibilities.
To make this easier, let’s say that instead of a program, we’re dealing with something in real life. The prototypical example is wet grass. If it rains, then the grass is wet. If the sprinklers turn on, then the grass is wet. When I say “rain” I don’t secretly mean “sprinklers” or vice versa; rain is not equal to sprinklers (though it could rain and the sprinklers turn on, but for simplicity’s sake let’s assume that doesn’t happen). This also follows the same rules of logic as above. Just because the grass is wet doesn’t mean that it rained nor that the sprinklers turned on; that is the same affirming the consequent fallacy because some unknown other causes might make the grass wet.
So let’s say we ran this code 100 times, or in real life checked the grass 100 days in a row. 10 out of those 100 times we checked the grass (i.e. ran the program), the grass was wet (B executed). Of those 10 times, 2 times that the grass was wet was due to the sprinklers. 5 times the grass was wet, it rained. And the remaining 3 times the grass was wet it neither rained nor the sprinklers turned on, but was due to some unknown cause.
What makes things easy in this case is the assumption that every time it rains then the grass is wet and every time the sprinklers turn on the grass is wet. But what if that wasn’t the case? It follows that the chance of the grass being wet, given that it rained or that the sprinklers turned on, would be some sort of fraction, depending on how tightly coupled rain/sprinklers were to wet grass. So to run with this thought for a moment, let’s say that 5 out of 6 times that it rained, the grass was wet and 2 out of 4 times that the sprinklers turned on, the grass was wet. Maybe the reason for the inconsistencies is due to a really arid climate that dries out the grass before it’s checked. Maybe the rain/sprinklers didn’t last long enough to really make the grass wet. Who knows.
Now, we have a bunch of fractions floating around. We need to keep track of them.
10 out of 100 days the grass was wet.
of those 10 times:
2 out of 10 times the grass was wet was due to sprinklers
5 out of 10 times the grass was wet was due to rain
3 out of 10 times the grass was wet was due to some other reason
What about the total times that the sprinklers turned on out of those 100 days, and how many times it rained during those days? Those are also numbers we need to be aware of. In this hypothetical, it rained 6 times total and the sprinklers turned on 4 times total. So more fractions:
6 out of 100 days it rained
4 out of 100 days the sprinklers turned on
It might be better to start thinking in terms of probability, right? So what is the probability that the grass is wet given that it rained? Well that would be the fraction of rainy days on which the grass was wet, or Pr(grass wet | rained). It rained 6 times, and on 5 of those 6 days the grass was wet; 5 out of 6, or about 83%.
So let’s back up a bit. We are starting to get into the laws of probability. So as an example, what is the probability of flipping heads twice in a row? This would be 50% * 50%, which is 25%. Meaning that if you flipped a pair of coins 100 times, you should expect to end up with the sequence heads-heads about 25 times. This is a straight multiplication because the flips are independent of each other; the result of the first flip doesn’t determine the result of the second flip. However, if the choices are dependent, then a different multiplication rule applies.
Let’s say that I pick a card from a deck, a queen. What is the probability of picking a queen? 4 out of 52. If I then ask what the probability is of picking the jack of spades next, my first choice obviously affects my second choice. The probability of picking the jack of spades is no longer 1 out of 52 but 1 out of 51, since the queen has already been removed. This is dependence. It then becomes Pr(Queen) * Pr(Jack of Spades | Queen). This is 4/52 * 1/51, which is 4 out of 2652. Meaning that the sequence “picked a queen” and then “jack of spades” is really unlikely since we are basing “jack of spades” on the previous condition of “picked a queen”. They are dependent.
It just so happens that Pr(Queen)*Pr(Jack of Spades | Queen) = Pr(Jack of Spades)*Pr(Queen | Jack of Spades). Try it out:
4/52 * 1/51 = 1/52 * 4/51
This “x * x given y” also applies to the coin flips. Since the coin flips are independent, Pr(heads) is equal to Pr(heads | heads), so the probability of flipping heads twice in a row is Pr(heads) * Pr(heads | heads). It is still 25%.
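If you want to check the arithmetic yourself, here is a small sketch using Python's exact fractions. The variable names are just labels I picked for the quantities above.

```python
from fractions import Fraction

# Independent events: two coin flips.
p_heads = Fraction(1, 2)
p_heads_given_heads = p_heads  # independence: the first flip tells us nothing
p_two_heads = p_heads * p_heads_given_heads

# Dependent events: draw a queen, then the jack of spades, without replacement.
p_queen = Fraction(4, 52)
p_jack_given_queen = Fraction(1, 51)  # one jack of spades among 51 remaining cards
p_jack = Fraction(1, 52)
p_queen_given_jack = Fraction(4, 51)  # four queens among 51 remaining cards

queen_then_jack = p_queen * p_jack_given_queen
jack_then_queen = p_jack * p_queen_given_jack

print(p_two_heads)                         # 1/4
print(queen_then_jack)                     # 1/663, i.e. 4/2652 reduced
print(queen_then_jack == jack_then_queen)  # True
```

Both orderings give the same product, which is the symmetry the next step leans on.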
Back to the task at hand. We know that Pr(grass wet | rained) is 5 out of 6, or about 83%, since the grass was wet on 5 of the 6 days that it rained. We also know that the probability of it raining at all, Pr(rained), is 6 out of 100, or 6%. This means we have Pr(rained)*Pr(grass wet | rained) = Pr(grass wet)*Pr(rained | grass wet). Meaning that the next time we find the grass wet, and we want to find out Pr(rained | grass wet), we have all of the info we need to calculate this: we know what Pr(rained), Pr(grass wet | rained), and Pr(grass wet) are. So if we want to figure that out, we divide both sides of that equation by Pr(grass wet). What does that formula become?
BAYES THEOREM! Pr(rained | grass wet) = Pr(grass wet | rained)*Pr(rained) / Pr(grass wet).
We can also use the same formula if we want to find out Pr(sprinklers on | grass wet), substituting Pr(sprinklers on) for Pr(rained) and Pr(grass wet | sprinklers on) for Pr(grass wet | rained). And once we know that Bayes Theorem applies in figuring out whether it rained or if the sprinklers turned on, we know that all other probabilistic logic (like extraordinary claims require extraordinary evidence, precision / falsifiability, absence of evidence IS evidence of absence, and independence) also applies.
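Plugging the wet grass numbers into the formula looks something like the sketch below. The function name `bayes` is just my own label. The inputs come from the counts above: rain on 6 of 100 days, sprinklers on 4 of 100 days, wet grass on 10 of 100 days, wet grass on 5 of the 6 rainy days and on 2 of the 4 sprinkler days.

```python
from fractions import Fraction

def bayes(p_evidence_given_cause, p_cause, p_evidence):
    # Pr(cause | evidence) = Pr(evidence | cause) * Pr(cause) / Pr(evidence)
    return p_evidence_given_cause * p_cause / p_evidence

p_wet = Fraction(10, 100)
p_rain = Fraction(6, 100)
p_sprinklers = Fraction(4, 100)
p_wet_given_rain = Fraction(5, 6)
p_wet_given_sprinklers = Fraction(2, 4)

p_rain_given_wet = bayes(p_wet_given_rain, p_rain, p_wet)
p_sprinklers_given_wet = bayes(p_wet_given_sprinklers, p_sprinklers, p_wet)

print(p_rain_given_wet)        # 1/2
print(p_sprinklers_given_wet)  # 1/5
```

These match the raw tallies: 5 of the 10 wet days were rain and 2 of the 10 were sprinklers, so the formula recovers what we counted directly.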
We can go back to our original plain logic inferences and see if probabilistic logic gives us the same answers. Let’s try modus ponens first.
P1: Pr(B | A) = 100% (“probability B given A” is the equivalent of saying “If A Then B”)
P2: Pr(A) = 100%
C: Pr(B) = Pr(B | A)*Pr(A) = 100%
Now modus tollens:
P1: Pr(B | A) = 100%
P2: Pr(B) = 0%
C: Pr(A) = Pr(A | B)*Pr(B) / Pr(B | A) = 0%.
Let’s see how fallacious affirming the consequent is:
P1: Pr(B | A) = 100%
P2: Pr(B | C) = 100%
P3: Pr(B) = 100%
C: Pr(A | B) = Pr(B | A)*Pr(A) / Pr(B)
C: Pr(C | B) = Pr(B | C)*Pr(C) / Pr(B)
As you can see, since we don’t know what the probability of A or C is, we can’t conclude anything about A or C given that B happened. If Pr(A) were 100% then B would also be 100% and that would be modus ponens. Anything less than 100% and we can’t safely conclude A to the exclusion of C or any other conclusion just like affirming the consequent says.
What we should be aware of, however, is that modus ponens itself only works if we have 100% certainty in our premises. If Pr(B | A) was 100% and Pr(A) was 50%, then modus ponens fails, since our conclusion can only be as strong as our weakest premise. Hence Occam’s Razor.
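One way to see how far the conclusion weakens is the law of total probability, Pr(B) = Pr(B | A)*Pr(A) + Pr(B | ~A)*Pr(~A). The sketch below assumes we know nothing about Pr(B | ~A), so it lets that term range from 0 to 1 and reports the resulting range for Pr(B). The function name `bounds_on_b` is made up for this illustration.

```python
from fractions import Fraction

def bounds_on_b(p_b_given_a, p_a):
    # Pr(B) = Pr(B|A)*Pr(A) + Pr(B|~A)*Pr(~A), with Pr(B|~A) unknown.
    p_not_a = 1 - p_a
    lower = p_b_given_a * p_a            # case Pr(B|~A) = 0
    upper = p_b_given_a * p_a + p_not_a  # case Pr(B|~A) = 1
    return lower, upper

certain = bounds_on_b(Fraction(1), Fraction(1))       # Pr(A) = 100%
uncertain = bounds_on_b(Fraction(1), Fraction(1, 2))  # Pr(A) = 50%

print(f"Pr(A) = 100%: Pr(B) between {certain[0]} and {certain[1]}")
print(f"Pr(A) = 50%:  Pr(B) between {uncertain[0]} and {uncertain[1]}")
```

With full certainty in A the range collapses to exactly 100%. At Pr(A) = 50% all we can say is that Pr(B) is somewhere from 50% to 100%.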
Notice that the logical fallacy affirming the consequent can be viewed as weak Bayesian evidence: If Pr(B | A) > Pr(B | ~A), then observing B makes A more likely than it was before, that is, Pr(A | B) > Pr(A)… which means affirming the consequent, B, might be weak or strong Bayesian evidence for A.
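Here is a rough sketch of that weak-versus-strong idea. The numbers are invented purely for illustration (a 10% prior for A, and three different guesses at Pr(B | ~A)), and `posterior` is my own helper name.

```python
from fractions import Fraction

def posterior(prior, p_b_given_a, p_b_given_not_a):
    # Pr(B) by total probability, then Bayes Theorem for Pr(A | B).
    p_b = p_b_given_a * prior + p_b_given_not_a * (1 - prior)
    return p_b_given_a * prior / p_b

prior = Fraction(1, 10)  # hypothetical Pr(A)

# B always follows A; what changes is how often B shows up without A.
weak = posterior(prior, Fraction(1), Fraction(1, 2))
strong = posterior(prior, Fraction(1), Fraction(1, 100))
no_update = posterior(prior, Fraction(1), Fraction(1))

print(weak)       # 2/11
print(strong)     # 100/109
print(no_update)  # 1/10
```

The rarer B is without A, the more seeing B tells you about A. When B happens just as often without A, seeing B tells you nothing and the prior stays put.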
So real life is certainly a probability game. It’s much more of a probability game than a logic game, and we can’t function at all in society without adhering to a system of logic. Like I said, you assumed logic to even comprehend the words in this post, even though logic only works if we have 100% certainty in our premises. We live in the real world, where 100% certainty in anything isn’t reasonable. We live in a world of uncertainty, not logic, making the laws of probability more relevant than logic even though, again, we need to assume logic to even comprehend basic communication and making sense of our everyday lives.