Probabilistic Thinking vs. Storytelling (Incl Counterfactuals, Expected Value, Halo Effect / Association Bias)

If this is your first time reading, please check out the overview for Poor Ash’s Almanack, a free, vertically-integrated resource including a latticework of mental models, reviews/notes/analysis on books, guided learning journeys, and more.

Probabilistic Thinking vs. Storytelling Mental Model: Executive Summary

If you only have three minutes, this introductory section will get you up to speed on the probabilistic thinking vs. storytelling mental model.

The concept in one quote:

It’s more interesting to live not knowing than to have answers which might be wrong. I have approximate answers and different degrees of certainty about different things. I don’t feel frightened by not knowing things. – Richard Feynman

(from “The Pleasure of Finding Things Out” – PFTO review + notes)

Key takeaways about probabilistic thinking: the world is probabilistic thanks to complexity and other phenomena; even with perfect data, we can rarely predict the future with absolute certainty. Yet we’re hardwired for storytelling: we tend to be overconfident and to believe with absolute certainty that we see patterns which may not really be there. Adopting the counterintuitive, process-focused probabilistic thinking approach helps us make better decisions.

Three brief examples of probabilistic thinking / storytelling:

Putting the “I” in identical.  In Sam Kean’s engaging exploration of genetics, “The Violinist’s Thumb” (TVT review + notes), Kean notes that genes:

“deal in probabilities, not certainties.”

He points out that clones would only be as alike to their parents as identical twins are to each other: which is, in some cases, surprisingly unalike.

For example, despite having identical genomes and extremely similar environments through age 18, if one identical twin develops schizophrenia, the other has only a 30–50% chance (probability) of developing it – still a very high probability relative to the 1% base rate, but far lower than we’d intuitively imagine.
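A quick back-of-envelope ratio shows both halves of that sentence at once (treating the figures cited above – a roughly 1% lifetime base rate and 30–50% twin concordance – as rough, illustrative numbers):

```latex
\[
\frac{P(\text{schizophrenia} \mid \text{identical twin affected})}{P(\text{schizophrenia in general})}
\approx \frac{0.30\ \text{to}\ 0.50}{0.01} = 30\ \text{to}\ 50
\]
```

In other words, sharing a genome multiplies the risk thirty- to fifty-fold – and still leaves the outcome far short of a certainty.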

That famous chicken story.  That chicken story is covered in about five hundred of the seventy-five books cited on this site (don’t worry about the math), including Daniel Schacter’s “The Seven Sins of Memory” (7SOM review + notes).

So I won’t go into it too deeply – but there’s a famous story wherein a split-brain study subject (whose left and right hemispheres were unable to communicate) points to a shovel with one hand after being primed with a snow scene, and points to a chicken with the other after being primed with an image of a chicken claw.

Experimenters asked why he was pointing to the shovel.  Since his left brain (the narrator) was the one doing the talking, it made up a story: to clean out the chicken shed.

An alternate ending where Deadpool, after escaping Colossus, time-travels to become Harry Truman’s economist.  

As Philip Tetlock references in the awesome “Superforecasting” (SF review + notes) – which we’ll get to – President Harry Truman was frustrated by his economists, and wanted one that didn’t have so many hands:

“Give me a one-handed economist. All my economists say ‘on the one hand…’, then ‘but on the other…’”

Truman wasn’t alone in his preference for certainty over probability, as we’ll explore – which has detrimental consequences for the quality of our decision-making.

If this sounds interesting/applicable in your life, keep reading for unexpected applications and a deeper understanding of how this interacts with other mental models in the latticework.

However, if this doesn’t sound like something you need to learn right now, no worries!  There’s plenty of other content on Poor Ash’s Almanack that might suit your needs. Instead, consider checking out our learning journeys, our discussion of the inversion, mindfulness / cognitive behavioral therapy, or culture / status quo bias mental models, or our reviews of great books like “The Design of Everyday Things” (DOET review + notes), “Misbehaving” (M review + notes), or “Why We Sleep” (Sleep review + notes).

Storytelling Mental Model: A Deeper Look (+ Association Bias)

As humans, we’re hardwired to search for causality.  We don’t have to be taught this. It starts young – real young.  

For example, in “The Genius of Birds” (Bird review + notes), Jennifer Ackerman calls causal reasoning “one of our most powerful mental abilities” and points out:

“An infant only seven to ten months old shows surprise if a beanbag is thrown from behind a screen and then the screen is lifted to reveal a toy block rather than an expected human causal agent such as a hand.”

There is, of course, the standard evolutionary-biology justification: it’s safer to assume that a rustle in the grass is caused by a creeping tiger than by the wind.  

Thankfully, as Shawn Achor quips in “The Happiness Advantage” (THA review + notes), sabre-toothed tigers no longer stalk our office parks.  That does wonders for our life expectancy – but, as anyone who understands the trait adaptivity model will point out, it also means that our storytelling tendency faces dramatically different circumstances than the ones it evolved in.

We’re still those babies looking for those human hands pulling the strings behind the screen, even in situations where it makes no sense: as Phil Rosenzweig points out in “The Halo Effect” (Halo review + notes):

“Maria Bartiromo can’t exactly look into the camera and say that the Dow is down half a percent today because of random Brownian motion.”

The Halo Effect is a phenomenal (and undercited) book about our tendency to “connect the winning dots” – to, in other words, look for reasons – stories – where none may exist.  The broader point of Rosenzweig’s book is that much of what we attribute to skill is actually luck; that’s not to say that skill doesn’t exist – or that, as some dumb philosophers would try to posit, causality doesn’t exist.

Causality / causation and skill both do, in fact, exist, and they’re both tremendously important – there’s no two ways about it.  However, the impact of luck is bigger than we often like to admit.

Unfortunately, it’s not satisfying (without training) to do what Feynman recommends and live “not knowing” – or to have only “approximate answers” and “different degrees of certainty.”  Instead, as Rosenzweig puts it:

“We don’t want to read just that Lego’s sales were sharply down; we want an explanation of what happened.  It can’t just have been bad luck – there must have been some reason why a proud company… suddenly did so badly.”  

A closely-related concept that Rosenzweig explores is association bias, or what he calls “The Halo Effect.”  We tend to draw overly broad and strong conclusions from narrow data points.

It’s why we “shoot the messenger” – their message is bad and we don’t like it, and therefore the messenger is bad and we don’t like them.

Rosenzweig points out examples of army officers believing that recruits with neater uniforms shoot straighter.  Several books I’ve read have pointed out that political candidates who look more competent than their competitors tend to have a strong advantage in elections.  

And in “Misbehaving” (M review + notes), Richard Thaler points out that study subjects who are given information about a fictional student’s sense of humor make just as confident guesses about that student’s GPA as subjects who are given the student’s actual GPA decile – a completely ludicrous example of overconfidence.

It’s important to know that this tendency is hardwired into our brains.  Laurence Gonzales, in “Surviving Survival” (SvSv review + notes), presents a more novel take on storytelling and the “narrator” than the old chicken-claw story.

Here, a lady whose brain was electrically stimulated to cry and assume a position of grief started explaining reasons why her life was hopeless.

In another experiment, a woman whose funny bones were stimulated came up with nonsensical explanations for why she was laughing…

… and for good measure, the surgeons started laughing too (social proof).

Storytelling Inversion: Counterfactuals, Probabilistic Thinking, and Expected Value Mental Models

What’s the antidote to storytelling?

It’s thinking the opposite way.  Or, inversion – a concept many readers are probably familiar with.

Storytelling is the process of “connecting the winning dots” – X happened, then Y happened, and then Z happened, so X must have led to Y which must have led to Z.

Probabilistic thinking requires, to some degree, thinking the other way – Z happened, but how else could we have gotten to Z?  Or, where else could X and Y have led besides Z?

While I’m here drawing on some of the concepts discussed in the inversion model (like survivorship bias) and the luck vs. skill model (process vs. outcome), the idea, generally, is that it’s important to think in probabilities and generate counterfactuals.  The aforementioned “The Halo Effect” (Halo review + notes) touches on this, as I explore in the process vs. outcome section of the luck vs. skill model.

You can go read that there; I won’t duplicate it here.  Meanwhile, Dr. Jerome Groopman’s “How Doctors Think” (HDT review + notes) provides some great examples of this sort of thinking.  Cardiologist Dr. James Lock – one of my favorite doctors cited in Groopman’s book – explains:

“Epistemology, the nature of knowing, is key in my field.  What we know is based on only a modest level of understanding.  If you carry that truth around with you, you are instantaneously ready to challenge what you think you know the minute you see anything that suggests it might not be right.”

This is an example of probabilistic thinking: if Lock thinks he has a diagnosis, he realizes that it isn’t a certain diagnosis, but rather a probabilistic one – the one that’s most likely supported by the evidence.

Another doctor, Terry Light, explains why you have to use probabilistic thinking to interpret MRI results:

“MRIs […] find abnormalities in everybody.  More often than not, I am stuck trying to figure out whether the MRI abnormality is responsible for the pain.  That is the really hard part.”

Groopman goes on to explore how radiologists use probabilistic thinking.

A closely related concept is the idea of a counterfactual – an alternative hypothesis, or “story,” that could be derived from the same data, or that could have resulted in a slightly different world.

A real example from Dr. Groopman’s book: one doctor, Harrison Alter, recommends:

“even when I think I have the answer, to generate a short list of alternatives.”  

This stemmed from a case in which a patient presented with a low-grade fever and rapid breathing, symptoms that were consistent with a viral infection that was going around.  

Falling prey to recency bias, Alter went with that diagnosis – but the woman was actually suffering from aspirin poisoning, another possibility consistent with her symptoms that Dr. Alter had not considered.

It’s not just doctors who forget to think probabilistically sometimes: Dr. Lock, later in the book, notes that patients “instinctively latch on to certainty” when faced with uncertainty.

Probabilistic thinking is a very unintuitive way of approaching the world, but the correct one.  Philip Tetlock, in “Superforecasting” (SF review + notes) – a great book about how ordinary individuals like you and me can use a specific process based on probabilistic thinking and disaggregation to outperform experts at predictions in their own fields – notes that, unlike Truman’s favored economists:

“superforecasters often have more than two hands.”

Indeed, while it’s often stated that one contradictory piece of evidence is enough to invalidate a theory, that’s just as unrealistic a view as stating that one piece of evidence is enough to prove a theory.  Given luck, complexity, and the limits of our knowledge and predictability, there will always be counterexamples.

As I discuss in the rationality model, Feynman notes that “the thing that doesn’t fit is the thing that’s the most interesting.”  But for now, here’s a bit from Thomas Kuhn’s “The Structure of Scientific Revolutions” on probabilistic thinking:

If any and every failure to fit were ground for theory rejection, all theories ought to be rejected at all times. - Thomas Kuhn

I explore probabilistic thinking in some more depth in the process vs. outcome section of the luck vs. skill mental model, the Bayesian reasoning mental model, and the incentives section of the scientific thinking / overconfidence model.

One of the most common applications of probabilistic thinking is expected value – the ex-ante (beforehand) calculation of what you can expect from making a given decision, on average, over the long term.

As Jordan Ellenberg points out in “How Not To Be Wrong” (HNW review + notes) – another great book covering probabilistic thinking, by the way – expected value isn’t actually what you expect: in many cases, it’s impossible to actually receive the expected value.

The lottery-ticket example is helpful: oversimplifying, either you win the lottery (a large positive amount) or you win nothing and are out the ticket price (a small negative amount).

On average, buying a $1 lottery ticket might lose you $0.60 or so – but no single ticket ever actually loses you exactly $0.60.  As with the Law of Large Numbers, the expected value is what actual observable results would trend toward over time, given enough iterations (a large enough sample size).
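Here’s a minimal simulation sketch of that idea, using made-up, purely illustrative numbers for the ticket price, prize, and odds (they’re not from Ellenberg’s book): the expected value per ticket works out to about −$0.60, an amount no single ticket ever actually costs you, while the average outcome per ticket only drifts toward that figure over a very large number of draws.

```python
import random

# Hypothetical lottery (illustrative numbers): a $1 ticket with a
# 1-in-1,000,000 chance of paying out $400,000.
# Expected value per ticket: (1/1_000_000) * 400_000 - 1 = -$0.60.
TICKET_COST = 1.00
PRIZE = 400_000
P_WIN = 1 / 1_000_000

def average_outcome(n_tickets: int, seed: int = 0) -> float:
    """Simulate buying n_tickets and return the average net result per ticket."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_tickets):
        won = rng.random() < P_WIN
        total += (PRIZE if won else 0.0) - TICKET_COST
    return total / n_tickets

if __name__ == "__main__":
    print(f"theoretical EV per ticket: {P_WIN * PRIZE - TICKET_COST:+.2f}")
    # No single ticket loses exactly $0.60 (you lose $1.00 or net +$399,999),
    # but the average per ticket trends toward -$0.60 as the sample grows.
    for n in (1_000, 1_000_000, 10_000_000):
        print(f"average over {n:>10,} tickets: {average_outcome(n):+.4f}")
```

Notice how slowly the average settles down for such a skewed payoff – a reminder that “enough iterations” can be an enormous number, which is part of why expected value is a tool for repeated decisions rather than a prediction about any single outcome.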

Take this expected value concept and apply it to Megan McArdle’s discussion of handwashing in the process vs. outcome section of the luck vs. skill mental model.  What’s your takeaway?

Application / impact: the dangers of storytelling can be avoided by viewing the world in a more probabilistic way – looking for other explanations for events that have happened, or thinking about other ways things could have turned out.

Storytelling x Trait Adaptivity x Dose-Dependency: When and How Should We Tell Stories?

Before we proceed to one of the most important applications of probabilistic thinking, it’s worth pausing for a moment to think about the situations in which storytelling is actually adaptive – or, put differently, the situations in which storytelling is more good than bad.

First of all, it’s clear that people respond better to stories.  They’re more salient and they stick in our memory better.  Evidence of this is everywhere.  Shawn Achor cites research in “Before Happiness” (BH review + notes) finding that employees were much more motivated by hearing a story told by one person they’d helped than by being shown reams of statistics on all the people they’d helped.

Similarly, in the aforementioned “Misbehaving” (M review + notes), Richard Thaler notes the difference between “statistical” and “identified” lives – if lives can be identified, with a story attached, our empathy opens up our purse-strings.  Rarely, as Thaler points out, do we let trapped miners die as we watch them on 24/7 media coverage – but equally rarely are we willing to spend any money to save “statistical” lives by investing in things like malaria nets or vaccines.

What’s the difference?  The miners have a story.  The unidentified people dying from malaria and polio, they don’t.  They’re just numbers. One death is a tragedy; a million deaths are a statistic.

It’s important to realize that given our natural affinity for stories – evident from kids’ love of bedtime stories through adults getting hooked on compelling narratives like Breaking Bad – we can’t, and shouldn’t, do without them.  Not only would life be a lot less rich without the joy of stories, but we’d learn less effectively, too.

Indeed, even Rosenzweig – despite his screed against “connecting the winning dots” – acknowledges that stories have a use.  As he puts it toward the end of “The Halo Effect” (Halo review + notes):

“the test of a good story is not whether it is entirely, fully, scientifically accurate – by definition it won’t be.  

Rather, the test of a good story is whether it leads us toward valuable insights, if it inspires us toward helpful action, at least most of the time.”

Historian John Lewis Gaddis makes very similar points throughout “The Landscape of History” (LandH review + notes).  He observes that metaphors enable empathy – since we can’t literally be there and fully understand what it was like, we have to use metaphors to get closer.

He himself uses many helpful metaphors and stories in the book, ranging from recurring mentions of the (literally) ass-backward historian peering through a sea of fog, to quips about Napoleon’s underwear (the latter as a metaphor for details that don’t matter).

The best books – and the ones I learn the most from – are the ones that are good stories above all.  Thaler’s “Misbehaving” (M review + notes) is a great story, in addition to being a phenomenal exploration of cognitive biases and behavioral economics.

David Oshinsky’s “Polio: An American Story” (PaaS review + notes) is a phenomenal story, and also a great science-history book (my current favorite in that category).  Yet Oshinsky’s “Bellevue” (BV review + notes) – perhaps better written on a sentence level, and even more informative – is less interesting and fascinating (although still solid).

Why?  It’s not a great story.  “Polio” is.

Application / takeaway: think probabilistically, but convert the probabilistic outcome into a story that you – and others you communicate with – will better remember.  Metaphors, analogies, and poster children are more memorable than dry statistics and dull theory.

Probabilistic Thinking x Rationality x Schema x Scientific Thinking x Commitment Bias x Cognitive Behavioral Therapy x Trait Adaptivity x Contrast Bias

This section may be one of the most important sub-models on the site.

If you’re still here, I assume you’re familiar with the idea of mental models and rationality – the idea that we’re building a latticework of tried-and-true concepts that act as a filter – that becomes our schema – through which we perceive the world.  By automatically filtering reality through mental models, we arrive at more adaptive conclusions without even having to exert any effort, because cognition has been transmuted into intuition via habit.

One problem, however, is that even people who think probabilistically often fail to take Tetlock’s most important piece of advice in “Superforecasting” (SF review + notes):

For superforecasters, beliefs are hypotheses to be tested, not treasures to be guarded. - Philip Tetlock

Let’s return to the fuller version of the Feynman quote from “The Pleasure of Finding Things Out” (PFTO review + notes) that I referenced at the beginning of this model.

“I can live with doubt and uncertainty and not knowing.  I think it’s much more interesting to live not knowing than to have answers which might be wrong.

I have approximate answers and possible beliefs and different degrees of certainty about different things, but I’m not absolutely sure of anything and there are many things I don’t know anything about, such as whether it means anything to ask why we’re here, and what the question might mean.  […]

I don’t feel frightened by not knowing things […] it doesn’t frighten me.”  

Notice that Feynman is talking about beliefs in the grand scheme of things – not just about little things.  And this is important.  Page 56 of “Poor Charlie’s Almanack” (PCA review + notes) contains, in the sidebar, a nice quote on commitment bias:

Faced with the choice between changing one’s mind and proving there is no need to do so, almost everyone gets busy on the proof. - John Kenneth Galbraith

Charlie Munger’s commentary thereon?

“If Berkshire has made modest progress, a good deal of it is because Warren and I are very good at destroying our own best-loved ideas…

when a better tool (idea or approach) comes along, what could be better than to swap it for your old, less useful tool?  

Warren and I routinely do this, but most people, as Galbraith says, forever cling to their old, less useful tools.”

Or, as philosopher W. V. O. Quine put it (via Jordan Ellenberg’s aforementioned “How Not To Be Wrong” – HNW review + notes):

“To believe something is to believe that it is true; therefore a reasonable person believes each of his beliefs to be true; yet experience has taught him to expect that some of his beliefs, he knows not which, will turn out to be false.  

A reasonable person believes, in short, that each of his beliefs is true and that some of them are false.”

Of course, many people are wholly unreasonable in this context, as Munger points out.

It’s not hard to see why: many people build their identities around their beliefs; many authors use the metaphor of Jenga blocks: pull out one of the core ones at the bottom, and you risk your entire identity crumbling down around it.  Indeed, many of my friends who grew up religious have had this experience when they eventually decided their parents’ faith wasn’t for them.

So how do we deal with this?  You won’t find much help if you search for it.  Most of the usual answers amount to “willpower” – which, as frequent readers of this site know, is the worst and least effective way to do anything.

One underlooked but powerful technique is called cognitive behavioral therapy, or CBT.  While it’s typically viewed in a mental health context – it’s been found to be as effective as, or more effective than, medication in many circumstances for anxiety and depression – there’s nothing about the process that renders it specific to mental health issues.  In other words, the techniques of cognitive behavioral therapy are no more confined to mental health than the principle of margin of safety is confined to structural engineering.

The stigma of mental health might be one reason why the CBT approach hasn’t spread more widely.  In any event, what is CBT?

As Dr. Judith Beck – practicing therapist and daughter of the field’s founder, Dr. Aaron Beck – explains in her textbook “Cognitive Behavior Therapy” (CBT review + notes), it’s a process of “collaborative empiricism” between therapist and patient by which:

“dysfunctional beliefs can be unlearned, and more reality-based and functional new beliefs can be developed”

It’s also something that intelligent and thoughtful individuals – like all Poor Ash’s Almanack readers – are capable of working through themselves.

This discussion will be very brief – Beck’s book provides all the explanation you’ll ever need – but basically, at the core of CBT is the idea of probabilistic thinking.  Beck notes that with regard to our core beliefs – and the “automatic thoughts” that flow from them –

“you most likely accept them uncritically […] you don’t even think of questioning them.”

That is clearly non-probabilistic thinking.  

The better approach is Dan Harris’s realization in “10% Happier” (10H review + notes):

“[My thoughts] weren’t irrational, but they weren’t necessarily true […] I was able to see my thoughts for what they were:

just thoughts, with no concrete reality.”

Three specific techniques from Beck’s “Cognitive Behavior Therapy” (CBT review + notes) are worth calling out.  The first technique is direct probabilistic thinking: rather than accepting our thoughts uncritically, Beck notes that a thought should be evaluated as:

“100% true, 0% true, or someplace in the middle.”

You start with thoughts – and not beliefs – because as Beck explains, it’s:

“easier for patients to recognize the distortion in their specific thoughts than in their broad understandings of themselves.”

How do you do this?  The second technique is that “empiricism” that Beck references: the idea of a/b testing and counterfactuals.  Start identifying counterexamples that call into question your beliefs and thoughts.  Think you’re not a good student? Then why is your GPA so high?

You can also go out and test new beliefs at a low opportunity cost by acting “as if” you believed something (even if you don’t).  Say you hold the belief “I’m not funny or interesting enough to ask out that nice girl.”  Well, the only way you’ll ever change that belief to a more adaptive one is to process the world through the alternative belief for a little while – i.e., act as if you are funny and interesting enough to ask her out – and see what happens.

Finally, it’s worth noting that the most “true” belief isn’t always the most “adaptive” one.  I’m fond of a Megan McArdle quip from “The Up Side of Down” (UpD review + notes):

There’s a scientific name for people with an especially accurate perception of how talented, attractive, and popular they are - we call them ‘clinically depressed.’ - Megan McArdle

Beck would hate that quote, but it drives home the idea that all beliefs are traits that are adaptive in some circumstances and not others.

For example, take overconfidence.  If you’re going to a job interview or a date, then overconfidence is totally adaptive.  

In other situations, overconfidence is not adaptive – instead, a margin of safety is – we should assume we’re less competent than we actually are.  When hiking, for example, overconfidence can be deadly.  When investing, overconfidence can be equally dangerous.

And certainly, given that 95% of all drivers think they’re above average, and car collisions are a leading cause of injury and fatality, it’s probably adaptive to believe we’re below-average drivers.  I carry this belief around with me, assuming I don’t have very good reflexes or great training in terms of collision avoidance, and it leads me to be cautious.

I’m probably, in fact, just an above-average driver by inversion: I don’t drink and drive (I don’t drink, period), I don’t text and drive, I avoid driving during heavy traffic or in adverse weather, I never run red lights, and I rarely go more than 5–7 miles per hour above the speed limit.

But believing that I’m an above-average driver would probably, thanks to n-order impacts, make me drive more aggressively – an undesirable outcome.  So the “true” belief isn’t, in fact, the most adaptive one – because we’re humans, not econs.

The third technique follows naturally from the first two. Of course, over time, if you decide more and more of your thoughts are untrue, you’ll start to (naturally) question the underlying core beliefs driving those thoughts.

Again, this approach of a/b testing beliefs is dead simple: many people just never think to do it.  For example, if you don’t believe in the value of sleep – even after reading Dr. Matthew Walker’s “Why We Sleep” (Sleep review + notes) – then find a not-so-busy week and just try sleeping an extra hour a night.  It doesn’t cost you much, and maybe you’ll learn something.  If you believe that reading the news and watching stock prices is critical to doing your job well, just try not doing it for a few days and see if you get more done.

This probabilistic approach also works so well for belief modification because, again, it exploits the concept of contrast bias / “just noticeable differences.”  It’s probably harder to go from “I’m incompetent” to “I’m awesome” in one step, but it’s much easier to go from “there is 0% chance that studying for this specific test would help” to “there is a 20% chance that studying for this specific test would help.”  And so on.

Application / impact: beliefs, as well as individual situations, should be evaluated probabilistically.  When the preponderance of the evidence suggests another belief is more appropriate or adaptive, stop believing what you believe now, and start believing that instead, via the CBT process of empiricism, trait adaptivity, and probabilistic thinking.