Philip Tetlock’s “Superforecasting”: Book Review, Notes + Analysis

Poor Ash’s Almanack > Book Reviews > Effective Thinking

Overall Rating: ★★★★★★ (6/7) (standout for its category)

Per-Hour Learning Potential / Utility: ★★★★★★ (6/7)

Readability: ★★★★★★ (6/7)

Challenge Level: 2/5 (Easy) | ~ 280 pages ex-notes (352 official)

Blurb/Description: Why do our forecasts generally fail, and how does a select group of “superforecasters” predictably do better?  Phil Tetlock provides a thorough and engaging analysis.

Summary: Superforecasting is one of those books I’d never read until surprisingly recently because I figured I’d “read it by proxy” – the major concepts have diffused so far into the book-o-sphere that I didn’t think I’d get much out of reading it.

That was a directionally accurate assessment; I certainly didn’t come away with any paradigm-shifting insights, but the book was meaningfully more thought-provoking (for me personally) than I thought it would be.  And, of course, there’s a path-dependency element here, wherein Superforecasting is being judged against the dozens of other books I’ve read that touch on similar topics.  

How does it stack up?  It is unequivocally better than many of those.  

It is one thing to recognize the limits on predictability, and quite another to dismiss all prediction as an exercise in futility. - Philip Tetlock

In fact, if I could go back and do it over again, Superforecasting would replace a number of those books, because it punches home a lot of important mental models in a reasonably compact way, and Superforecasting will become one of my “de facto” recommendations.

Highlights: Like Nate Silver, Tetlock does a solid job of being pretty balanced and reasonable, not treating forecasting as some ivory-tower discipline where aesthetics matter more than utility.  He’s also very multidisciplinary, in the sense that he touches on the ways that forces ranging from incentives to social proof can influence our forecasts.

Ultimately, with only a few minor caveats touched upon below, this should be any reader’s go-to introduction to forecasting.  It is a book that bears solutions, not just problems.  Tetlock is optimistic that we can do better, and as such makes a good teacher.

Beyond all of the above, Tetlock gets major brownie points for an extraordinarily rare and concomitantly joyful surprise.  It’s always annoyed me that so many smart, thoughtful writers and commentators – even luminaries like Michael Mauboussin and Howard Marks – mindlessly idol-worship Daniel Kahneman and Nassim Taleb, swallowing those authors’ pessimism-bordering-on-nihilism-masquerading-as-intellectualism without so much as considering that not all of their views are correct – or, in Taleb’s case, even particularly insightful.  I was starting to think I was the only person in the value investing world willing to say publicly that Kahneman and Taleb, whatever their merits in certain respects, get the most important things really badly wrong.

It turns out I’m not alone: while he’s definitely more complimentary of both Kahneman and Taleb than I am, Tetlock provides a more balanced and Munger-like view that’s based on research rather than ideology, citing the reasonable parts of Kahneman/Taleb’s worldviews but coming to more appropriate, realistic, optimistic conclusions.  See the notes on pages 236 and 237 – 242 below.

Lowlights: There’s not much to not like about this book.  Some of the anecdotes do drag on a bit long and parts of the book can be a bit repetitive, but it’s not egregious.  Also, he (sigh) cites Grit, and you know how I feel about grit: it’s stupid.  See the willpower mental model.

Those are mostly irrelevant, though.  The bigger challenge, in my view, is that readers may overinterpret some of Tetlock’s conclusions in a manner that isn’t really his fault, similar to how many people wildly misinterpret Atul Gawande’s The Checklist Manifesto (TCM review+ notes).  Tetlock tells much of the book through the lens of the “Good Judgment Project,” which studied people’s ability to make accurate, time-constrained forecasts about certain events happening or not, ranging from third-world politics to sea ice measurements in the Arctic.

This leads Tetlock to some conclusions – such as stating that “a forecast without a time frame is absurd,” and recommendations for frequent updating and a focus on precision – that run counter to what a useful workflow in investing (and, likely, many other business fields) looks like, thanks to opportunity costs and utility.

To use Tetlock’s reference to Steve Ballmer, it probably wasn’t important for Microsoft to have a constant, precisely-updated, very-accurate forecast of the potential size of the smartphone market… but rather just a general understanding of the big opportunity cost and/or direct threat to their business if it took off without them (as it eventually did).  

Tetlock does acknowledge toward the end of the book that some of the important questions can’t be answered or measured as precisely as the GJP questions, and notes as well that in some cases, it’s not really forecast accuracy that matters as much as impact, and false positives and false negatives aren’t always equally bad.  

Therefore, readers need to be careful to take the extra step of “translating” Tetlock’s wonderful and thoughtful discussion of forecast accuracy to the opportunity costs and utility of forecasting in the real world.

Mental Model / ART Thinking Points: probabilistic thinking, utility, hindsight bias, feedback, control group, confirmation bias, ego, cognition / intuition, habit / conditioning, availability heuristic, storytelling, scientific thinking, nonlinearity, luck, opportunity cost, schema, precision vs. accuracy, process vs. outcome, man with a hammer, sample size, disaggregation, base rate, inside view, conditional probabilities, social proof, margin of safety

You should buy a copy of Superforecasting if: you want a thoughtful discussion, drawing from multiple disciplines, of how to make better forecasts (decisions).

Reading Tips: Feel free to skim paragraphs/pages if it feels like Tetlock is just driving a point home; the book’s anecdotes can get a bit drawn-out and repetitive.  On the other hand, don’t stop after the epilogue; there are some juicy bits in the endnotes that are enjoyable and additive to the learning.

Pairs Well With:

“The Signal and the Noise” by Nate Silver (SigN review + notes) – a more statistics/data-based look at forecasting and how we get it right and wrong.

“How Not To Be Wrong: The Power of Mathematical Thinking” by Jordan Ellenberg (HNW review + notes) – a broader application of mathematical concepts to real life.

“The Success Equation” by Michael Mauboussin (TSE review + notes) – the best compact discussion of luck and skill around.

“Mistakes were Made (but not by me)” by Tavris and Aronson (MwM review + notes) – a thorough discussion of cognitive dissonance, schema, and confirmation bias.

“Case In Point” by Marc Cosentino.  This case-interview preparation guide is a very nontraditional book recommendation, and I have no review because it’s been forever since I’ve worked through it.  But for those interested in the “MECE” framework Tetlock brings up (which is often used by MBB consultants), this book provides plenty of examples for testing out that sort of structured problem-solving.

Reread Value: 3/5 (Medium)

More Detailed Notes + Analysis (SPOILERS BELOW):

IMPORTANT: the below commentary DOES NOT SUBSTITUTE for READING THE BOOK.  Full stop. This commentary is NOT a comprehensive summary of the lessons of the book, or intended to be comprehensive.  It was primarily created for my own personal reference.

Much of the below will be utterly incomprehensible if you have not read the book, or if you do not have the book on hand to reference.  Even if it was comprehensive, you would be depriving yourself of the vast majority of the learning opportunity by only reading the “Cliff Notes.”  Do so at your own peril.

I provide these notes and analysis for five use cases.  First, they may help you decide which books you should put on your shelf, based on a quick review of some of the ideas discussed.  

Second, as I discuss in the memory mental model, time-delayed re-encoding strengthens memory, and notes can also serve as a “cue” to enhance recall.  However, taking notes is a time consuming process that many busy students and professionals opt out of, so hopefully these notes can serve as a starting point to which you can append your own thoughts, marginalia, insights, etc.

Third, perhaps most importantly of all, I contextualize authors’ points with points from other books that either serve to strengthen, or weaken, the arguments made.  I also point out how specific examples tie in to specific mental models, which you are encouraged to read, thereby enriching your understanding and accelerating your learning.  Combining two and three, I recommend that you read these notes while the book’s still fresh in your mind – after a few days, perhaps.

Fourth, they will hopefully serve as a “discovery mechanism” for further related reading.

Fifth and finally, they will hopefully serve as an index for you to return to at a future point in time, to identify sections of the book worth rereading to help you better address current challenges and opportunities in your life – or to reinterpret and reimagine elements of the book in a light you didn’t see previously because you weren’t familiar with all the other models or books discussed in the third use case.

Page 3: Based primarily (but not exclusively) on research conducted via the Good Judgment Project, Tetlock’s goal in this book is to identify what traits make “superforecasters” – i.e., ordinary individuals who were able to achieve stellar and persistent performance in the GJP.

Pages 5 – 6: Tetlock notes that:

“pundits are almost never asked to reconcile what they said with what actually happened”

and eventually became frustrated that his research was being used as a:

“backstop reference for nihilists who see the future as inherently unpredictable and know-nothing populists who insist on preceding ‘expert’ with ‘so-called.’”  

Boom.  *mic drop*

Pages 7 – 9: Tetlock provides a brief discussion of chaos theory (complexity) via the inciting incident of the Arab Spring, noting that current scientific opinion is that:

“there are hard limits on predictability.”  

Geoffrey West’s Scale is a thoughtful read on this topic.  So is Nate Silver’s “The Signal and the Noise” (SigN review + notes) – for example, Silver discusses historical projections of the flu and of population growth; with a long-enough time horizon, small errors in exponential forecasts go horribly wrong.

Page 10B: Again, Tetlock notes – in direct contrast, as we’ll see later, to basically-nihilists like Taleb – that

“it is one thing to recognize the limits on predictability, and quite another to dismiss all prediction as an exercise in futility.”  

The above illustrates the dose-dependency of intellectual humility and understanding luck.  This book provides solutions, not just problems.  As Peter Thiel says in “Zero to One” (Z21 review + notes), incorporating agency:

If you expect an indefinite future ruled by randomness, you’ll give up on trying to master it.

Tetlock would likely agree.

Page 13: Tetlock comes back a lot to the theme of “false dichotomies” – his view seems to be that things aren’t either “predictable” or “unpredictable,” but rather shades thereof.   Probabilistic thinking.  

There’s also a clear time element here: for example, the weather tends to be relatively predictable in the short-term, but less so over the medium term and pretty much unpredictable over the long term.  (Again, Silver has a great discussion of meteorology in The Signal and the Noise – the story behind “FOR YEARS YOU’VE BEEN TELLING US THAT RAIN IS GREEN” is one of my favorite things.)

Pages 14 – 15: Tetlock notes again that nonlinearity and chaos theory mean that our ability to improve forecasts of weather may be limited.

See also Jordan Ellenberg’s “How Not To Be Wrong: The Power of Mathematical Thinking” (HNW review + notes).  Ellenberg touches on nonlinearity a lot; his point of view:

You can do linear regression without thinking about whether the phenomenon you’re modeling is actually close to linear.  But you shouldn’t […] the results can be gruesome.

Tetlock also acknowledges – as Silver does in the context of the weather – that forecasts aren’t always meant to, well, forecast.  Sometimes they have other purposes, like entertainment, or political rabble-rousing.  Utility.

Nate Silver calls it a “cardinal sin” to subsume the accuracy of a forecast to other interests, but as I discuss in the notes to “The Signal and the Noise” (SigN review + notes), I disagree.

Pages 18 – 20: Tetlock’s research finds that “superforecasters” tend to be more intelligent and more knowledgeable than the population as a whole, but not dramatically so – i.e., you don’t have to be Richard Feynman to be a superforecaster.  What do you have to do?  You have to learn to process information in a certain way.  

Pages 25, 27: Jordan Ellenberg points out in “How Not To Be Wrong” (HNW review + notes) that a lot of things that are “obvious” today – like, for example, that old people should be charged less for annuities than young people – were not at all obvious at certain points in history.  Hindsight bias.

Tetlock similarly references the often-gruesome history of medicine, wherein obvious mistakes were repeatedly made thanks to refusal to conduct experiments and, as Peter Godfrey-Smith might put it in Theory and Reality, allow their hypotheses to make contact with the real world. 

See also here David Oshinsky’s “Bellevue” (BV review + notes), which goes into quite some detail on how, for example, American doctors refused to accept germ theory – making fun of it, even – for quite some time after the scientific evidence on the topic from Europe was reasonably clear.  I discuss in the salience mental model how it took the death of President Garfield – not from an assassin’s bullet, but from septicemia introduced by his doctors’ dirty fingers – to change things.

Anyway, the parallel to forecasting is that given that most forecasts (or, more broadly, decisions) aren’t explicitly tracked or measured in a useful way, it’s hard for us (collectively) to learn from our mistakes.  

Hindsight bias, a fundamental problem of memory, is a real problem here – see Tavris/Aronson’s “Mistakes were Made (but not by me)” (MwM review + notes).  Although I make very few Tetlock-approved forecasts, I’ve definitely found that incorporating decision journaling into my research process provides feedback that helps me evaluate where I’ve gone wrong and how I can do better.

Pages 29 – 30: More on the history of medicine and the surprisingly recent invention of control groups.  

Also, a nice Richard Feynman quote about uncertainty; cross-reference The Pleasure of Finding Things Out (PFTO review + notes), specifically page 24, where Feynman states:

“I can live with doubt and uncertainty and not knowing.  I think it’s much more interesting to live not knowing than to have answers which might be wrong.

I have approximate answers and possible beliefs and different degrees of certainty about different things, but I’m not absolutely sure of anything and there are many things I don’t know anything about, such as whether it means anything to ask why we’re here, and what the question might mean.  

[…] I don’t feel frightened by not knowing things

[…] it doesn’t frighten me.”

Pages 31, 33 – 34: a good example of confirmation bias and ego here.  Also, Tetlock introduces Kahneman’s System 1 / System 2 framework, where System 1 is automatic and System 2 is deliberative/thoughtful (so named because System 1 always comes first).  See cognition / intuition.

Tangentially, I always ace the “bat and ball cost $1.10” cognitive reflection question, and struggle with Mauboussin’s logic puzzle about whether someone married is looking at someone unmarried… probably because I was in Mathcounts in middle school and do a lot of algebra as an investor, but have no use for the latter sort of formal deductive logic in my day-to-day life.  This is an example of habit / conditioning, which we’ll touch on later.

Page 35: Tetlock references the availability heuristic here…

Page 36: … and tells the famous storytelling story about the shovel and the chicken claw.

Page 38: Some good discussion of scientific thinking here, although of course not all scientists always do this: Tetlock notes that

“scientists are trained to be cautious.  They know that no matter how tempting it is to anoint a pet hypothesis as The Truth, alternative explanations must get a hearing.  

And they must seriously consider the possibility that their initial hunch is wrong […] such scientific caution runs against the grain of human nature […] our natural inclination is to grab on to the first plausible explanation and happily gather supportive evidence without checking its reliability.”

Of course, not all scientists actually think this way.  As Thomas Kuhn notes in “The Structure of Scientific Revolutions” (Kuhn review + notes):

“Phenomena that will not fit in the box are often not seen at all… [scientists] are often intolerant of [new theories] invented by others.”

Examples of this are all over the place.  My favorite, of course, is in Richard Thaler’s “Misbehaving” (M review + notes), where classical economists, faced with behavioral economic reality, clung tenaciously to their (completely irrational) rational-actor models.

Page 40: A little quibble with Tetlock here… he attributes Cochrane’s blind acceptance of the doctor’s diagnosis of cancer to availability heuristic.

As with Munger calling hogwash on most psychologists’ failure to properly interpret the Milgram experiment, I think Tetlock is understating things here.

First of all, there’s authority bias, which Tetlock sort of gets at (indirectly), but there’s also simply stress-induced cognitive impairment: Cochrane just woke up from a major surgery and is likely either heavily sedated or in a lot of pain, and he’s just been given traumatic news.

To go back to Tavris/Aronson, there’s probably also some cognitive dissonance reduction – he went in for a minor-ish surgery, woke up sans his pectoralis minor, and it’s very hard to reconcile the doctor saying “you have cancer so bad that I had to remove the entire muscle” with the possibility of “I don’t actually have cancer.”

Because otherwise the pectoralis minor would’ve been removed for nothing…

Pages 42 – 43: Tetlock touches on intuition here, and notes (maybe not here, but somewhere, I think) that feedback is an important component – intuition only works in domains with clear, immediate feedback.

For example, many high-performing athletes seem to perform better when they think less, but there’s also usually clear feedback.  Did throwing this kind of pass lead to a touchdown or an interception?  Did using this pass-rush move result in me getting a sack, or getting pancaked?

In the stock market, sample sizes are not so large and thus it’s not so easy.  This shows up elsewhere, too; Thaler (and coauthor Cass Sunstein) explore the concept of feedback in “Nudge” (Ndge review + notes), examining how making it more salient can influence decisionmaking – whether that’s flies on a urinal (for aim) or smiley/frowny faces on your thermostat (for energy consumption).

Similarly, Megan McArdle’s “The Up Side of Down” (UpD review + notes) provides some excellent examples: one on handwashing (only one in a thousand times does a doctor infect someone by failing to wash their hands), and one on the experience of many probationers with overworked parole officers.  McArdle notes how hard it is to learn from a sequence that looks like:

“nothing… nothing… nothing… nothing… nothing… bam!  Five year prison term.”

To Tetlock’s point, McArdle explores how a Hawaiian system that made the feedback more clear – via scheduled drug tests and check-ins that pretty much ensured that if you did something bad, you were punished for it – resulted in dramatically better compliance.

Pages 51 – 52: Tetlock again discusses the sort of ego-protective rationalizing that is analyzed extensively in “Mistakes were Made (but not by me)” (MwM review + notes) – not surprising that many of these relate to ideology.

He also discusses that “a forecast without a time frame is absurd.”  I don’t wholly agree, for reasons I’ll touch on in more depth later* (mostly utility and opportunity cost) – but he does make the correct observation that “as time passes, memories fade,” making it harder to evaluate vague forecasts after the fact.

*The general idea here is that in many cases, the time component is actually irrelevant.  For example, say you’re a company considering investing a lot of capital with a long-term payoff and building a big workforce in some politically unstable country where nationalization is a risk.

In this context, a “forecast without a time frame” is not absurd – it doesn’t really matter if nationalization happens in one, three, five, or seven years; if there’s a big risk of nationalization at some point in the future – beyond the near-term that Tetlock thinks is reasonably predictable, but far enough out that chaos theory makes it, per Tetlock’s general worldview, mostly unpredictable – then you shouldn’t waste your time on the project, or should only devote a modest amount of resources toward it.

Page 55: Tetlock brings up the important and valid point – related to schema, though he doesn’t mention it explicitly here (he sort of does later) – that words like “serious possibility” can mean totally opposite things to different people.  Therefore, forecasts should be made in unambiguous terms.

Page 57: Tetlock notes that a risk of expressing a probability estimate with an exact number is that “[it] may imply to the reader that it is an objective fact, not the subjective judgment that it is.”  

Precision vs. accuracy.  Jordan Ellenberg made a similar (mild) criticism of Nate Silver for reporting his estimates down to the decimal point when, in Ellenberg’s view, the data didn’t warrant that level of confidence and decimal points were meaningless.

Page 58: Great note here on process vs. outcome – as discussed elsewhere, in, for example, Michael Mauboussin’s “The Success Equation” (TSE review + notes) and Howard Marks’ “The Most Important Thing (Illuminated)”, one of the challenges with inherently probabilistic outcomes is that getting a good outcome doesn’t tell you whether or not you made a good decision (or, in Tetlock’s context, a good forecast).

Tetlock (like Nate Silver, and to some extent Jordan Ellenberg) notes that you (for better or worse) can’t run 100 simulations of the next 100 years to see how many end in nuclear war (the only way to win is not to play).  

Pages 64 – 65: Tetlock brings up the concept of “Brier scores,” which measure “the distance between [the] forecast and what actually happened.”

They can range from 0 to 2; 0 is best; 0.5 is random guessing, and 2.0 is being wrong all the time.  He also notes, on relative vs. absolute skill, that you need to benchmark: correctly predicting that the weather in Phoenix on any given August day will be “hot and sunny” does not make you a skillful forecaster.

This “overly general” approach to forecasting shows up elsewhere: see, for example, Dr. Matthew Walker on why Freudian dream analysis is bullshit in the phenomenally important “Why We Sleep” (Sleep review + notes).
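For concreteness, here’s a minimal sketch of the two-outcome Brier scoring described above – squared distance between the forecast and what happened, summed over outcomes, so 0 is perfect, 0.5 is random guessing, and 2 is maximally wrong.  (The function name and sample numbers are mine, for illustration.)

```python
def brier_score(p_yes: float, occurred: bool) -> float:
    """Two-outcome Brier score: sum of squared errors across both outcomes."""
    outcome = 1.0 if occurred else 0.0
    return (p_yes - outcome) ** 2 + ((1 - p_yes) - (1 - outcome)) ** 2

print(brier_score(0.5, True))   # 0.5  -> random guessing
print(brier_score(1.0, False))  # 2.0  -> maximally wrong
print(brier_score(0.9, True))   # 0.02 -> confident and correct
```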

Page 68: The results showed that expertise was less important than process.

Page 69: hey look, it’s the famous “hedgehog” (knows one big thing) and “fox” (knows a little about a lot).  This analogy has been beaten to death, but the summary is that foxes win… mainly because hedgehogs are man with a hammer types who try to interpret everything through their ideology-biased schema, leading to confirmation bias.

Page 71: Here’s where Tetlock brings up the idea of schema somewhat explicitly, in the context of hedgehogs – he doesn’t use the word, but he does use the analogy of “green tinted glasses” that the hedgehog “never takes off,” causing them to see green all the time, “whether it’s there or not.”

Page 72: Why aren’t there more foxes around?  Because confidence is sexy. Nobody wants a wishy-washy “many-handed economist,” as the joke goes – some people want pundits on Fox News shouting loudly about “liberals this, liberals that,” and other people equally want pundits on MSNBC screaming about “Republicans this, Republicans that.”  

Some calm, mild-mannered, thoughtful person saying “well, the new [XYZ] bill has some positive elements that could result in [ABC], but on the other hand…” just doesn’t draw the same audience.

In other words, there are incentives for overconfidence.

Pages 73 – 74: How does the “wisdom of the crowd” work?  It’s not magic; Tetlock describes it somewhat similarly to control groups described earlier – individual variations (i.e. biases) cancel out, leaving only the “signal” (assuming people have useful information).

Of course, this doesn’t help when most people are wrong, or when most people are biased in one direction rather than the other…
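A toy simulation of that intuition (my sketch, not Tetlock’s – all numbers invented): independent errors mostly cancel when averaged, but a shared bias survives the averaging, which is exactly the caveat above.

```python
import random

random.seed(42)
true_value = 100.0

# Unbiased crowd: individual errors are large but independent.
guesses = [true_value + random.gauss(0, 20) for _ in range(1000)]
# Biased crowd: everyone skews high in the same direction.
biased = [true_value + 15 + random.gauss(0, 20) for _ in range(1000)]

print(abs(sum(guesses) / len(guesses) - true_value))  # small: noise cancels out
print(abs(sum(biased) / len(biased) - true_value))    # ~15: shared bias remains
```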

Pages 75 – 76: Tetlock discusses the “guess two-thirds of the average of everyone else’s guesses” challenge.  See Thaler’s “Misbehaving” (M review + notes).  Classic humans vs. econs.

Pages 78 – 79: Here is the best example of schema in the book, provided through one of my least favorite topics (poker).  Even strong poker players often, according to elite professional poker player Annie Duke, fail to think about how they’d play a hand if they were in the other person’s shoes.

Page 83: Back to process vs. outcome again… basically a repetition of page 58, the challenges of hindsight bias, etc.

Page 87: Tetlock notes that process can’t be evaluated completely independent of accuracy.

Page 91: I thought this “extremizing” bit was interesting… not something I’d heard of before.  
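For the curious, here’s roughly what “extremizing” means in the forecast-aggregation literature – hedging here, since Tetlock doesn’t spell out on this page the exact transform his team used: because averaging many forecasts tends to produce an underconfident aggregate, you push the average away from 0.5.

```python
def extremize(p: float, a: float = 2.0) -> float:
    """Push an aggregated probability toward 0 or 1; a > 1 sets the strength."""
    return p ** a / (p ** a + (1 - p) ** a)

print(extremize(0.7))  # ~0.84 -- a 70% crowd average becomes more emphatic
print(extremize(0.5))  # 0.5   -- an on-the-fence aggregate stays put
```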

Page 93: Again, I’d highlight/caution that the one “caveat” to this book is that it focuses on a fairly narrow use case that isn’t necessarily broadly applicable, and as such it’s important to “translate” its views.

For example, here, Tetlock discusses the accuracy of one prediction: “Will Italy restructure or default on its debt by 31 December 2011?”  This obviously has big implications if, say, your job is to price short-term credit default swaps on Italian debt; for a generalist, though, it’s probably not that important whether Italy defaults by December 31st – or whether it defaults at all.

Moreover, the utility of making such a prediction precisely has to be weighed against the opportunity cost of using that time to make less precise, but more useful predictions about other things – for example, if the yield to maturity on Italian debt is only modestly higher than the yield to maturity on some other debt elsewhere that you feel has a much lower chance of defaulting (without a lot of analysis required), then why bother with the Italian debt at all?

This isn’t to say that Tetlock’s work doesn’t have value: there are many situations in which this sort of forecasting might be useful.  And the takeaways on how to think are, generally, very applicable and positive.  But again with schema, making the most accurate and precise forecasts possible isn’t always the best thing to do; as Jordan Ellenberg notes wittily in “How Not To Be Wrong” (HNW review + notes), it’s not always wrong to be wrong!

Page 97: Tetlock here notes the “improbable things are probable” phenomenon (a subset of sample size); see also Ellenberg and Mauboussin.  Tetlock also cites research from Ellen Langer (yay Langer), who finds that even really smart students have an illusion of control in luck-driven situations.

Pages 98 – 99: Tetlock goes further into the “improbable things are probable” idea by pointing out how some people (or companies) get famous for performance that is more or less luck.  In the endnotes, he cites Phil Rosenzweig’s excellent The Halo Effect (Halo review + notes), which you should go read.

Many value investors would cite famous investors like John Paulson or Kyle Bass in this category, given their big wins on the financial crisis and terrible performance thereafter – lucky, not smart.  

(I’d put Taleb in this category too, with a bit of Mauboussin’s Music Lab flavor – I’ve never found Taleb particularly insightful; I believe he just happened to catch the zeitgeist with some provocatively phrased but intellectually mediocre writings, and social proof did the rest, and now he’s quoted ad nauseam despite not being very interesting, and not being very realistic or useful, as I get to later, as does Tetlock in a less critical manner.)

Page 107: Tetlock here conceives of IQ as a genetic lottery, which is not precisely true, given that it’s not perfectly correlated over our lifespan and that a few fairly empirical interventions (like exercise and continuous learning) can reliably boost it.  See Stuart Ritchie’s Intelligence (review + notes) for more on this.  Also see my more nuanced commentary on this topic in the review of “Zero to One” (Z21 review + notes).

Page 109: As referenced, superforecasters were smarter and more knowledgeable than the general population, but not astonishingly so.

Page 111: Tetlock here brings up the brain-teaser: “How many piano tuners are there in Chicago?”  Questions in the same vein – like “how many tennis balls can you fit in a 747,” or “how many pairs of briefs are sold in the U.S. each year,” or “how many gas stations are in Canada” – are classic interview questions at the strategy consulting firms McKinsey/Bain/BCG.  It’s a great example of disaggregation.

It’s not about arriving at the right answer per se, given that you may not have remotely accurate data, but rather about structuring the problem correctly so that if you had the right data, you could solve it accurately.  I.e. – there are probably fewer gas stations per person in urban areas than in rural areas, and {X, Y, Z} people, roughly, live in {rural, suburban, urban} areas, and the relative intensity of gas stations per capita is {A, B, C}, so the total number should be XA + YB + ZC…
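Spelled out as code – with deliberately made-up placeholder numbers, since the structure rather than the data is the point:

```python
# segment: (population, gas stations per person) -- all figures invented
segments = {
    "rural":    (60_000_000, 1 / 2_500),
    "suburban": (170_000_000, 1 / 4_000),
    "urban":    (100_000_000, 1 / 8_000),
}

# Total = XA + YB + ZC, per the disaggregation above.
total = sum(pop * rate for pop, rate in segments.values())
print(f"Estimated gas stations: {total:,.0f}")  # swap in real data to refine
```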

What’s intriguing is that the “ideal” forecasting process that Tetlock goes into incorporates a lot of elements of what consultants call “MECE” (pronounced meeee seeee) – mutually exclusive, collectively exhaustive – basically building a decision tree to analyze all possible scenarios at some level of granularity.  This sort of rigorous thinking pops up in plenty of non-forecasting arenas; Sam Hinkie seems to have used it quite a bit, from what I can tell (one of his former colleagues at Bain is a friend of mine who thinks in a very similar, structured fashion).

Books like Case in Point and resources like Victor Cheng’s website might be helpful in diving deeper into this sort of thinking; both were helpful for me long ago when I thought I wanted to be an MBB consultant… (thankfully I didn’t go that route!)

Page 116: Here is an example of that above-described process: how likely is it that a research lab will find traces of polonium in a long-dead corpse?  Well, setting aside whether or not the guy was poisoned with polonium, the first question to ask is whether polonium is even still detectable this far after the fact.

Pages 117 – 118!: Tetlock brings up the concept of the base rate and inside view / outside view: while it’s compelling to use storytelling to weave together a compelling bottom-up narrative, top-down realities are often the best starting point.

Page 119: Tetlock brings up the important idea of conditional probabilities here; see also Silver, Ellenberg, etc.   This concept doesn’t get as much airtime as base rates, but is important – as Tetlock points out in this case, the correct base rate for pet ownership for a described family isn’t universal U.S. pet ownership, but rather pet ownership in single-family homes…
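A bare-bones illustration of that conditioning point – the percentages below are mine, invented purely for illustration, not Tetlock’s or actual survey data:

```python
# Hypothetical base rates -- invented numbers for illustration only.
p_pet_any_household = 0.57   # share of all U.S. households with a pet
p_pet_single_family = 0.71   # share among single-family homes specifically

# For a family described as living in a single-family home, the conditioned
# rate is the right starting point, not the population-wide one.
print(f"Start from: {p_pet_single_family:.0%}, not {p_pet_any_household:.0%}")
```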

Page 120: anchoring is discussed here.  It’s real; I notice it all the time.

Page 121: more examples of how MECE works (my term, not Tetlock’s).

Page 123: Tetlock here discusses how, even without a team, one of the “superforecasters” uses the process of writing things down to distance himself from his own thoughts and perspectives.

Page 124: Tetlock notes that superforecasters often use “on the one hand / on the other hand” type discussion, concluding that “superforecasters often have more than two hands.”  Probabilistic thinking.

Page 125: Tetlock hypothesizes that superforecasters have high need for cognition and high “openness to experience.”  This shouldn’t be surprising directionally. But…

Interestingly, and tangentially, I think that “need for cognition” is dose-dependent… more is not always better.  One of the highest-potential aspiring investors I ever met had very high need for cognition, but ended up (last we talked) having little to no hope of a successful future in investing because he viewed the concept of utility as beneath him.

One of the traits that correlates to being a really successful investor – and, probably, business executive as well – is that there are “no points for difficulty.”  Solving hard problems is great for the world, but not necessarily for the solver, to the extent that easier problems have a similar or higher payoff with less opportunity cost.  One thing that sinks a lot of investors is solving problems that have non-financial payoffs – i.e. ego, glory, intellectual challenge/interest, whatever – when the boring “two foot hurdles” are the easiest way to get ahead in both life and investing.

So, is some need for cognition better than none?  Yes, certainly.  Is a lot better than some?  That’s more of an open question.  As noted in various places (Feynman, Ellenberg, etc.), people with a really high need for cognition seem to wander off into solving pointless problems that are very intellectually stimulating but often not terribly useful.

Page 126: Tetlock provides a great discussion of “active open-mindedness,” which is pretty much the opposite of walking around with confirmation bias staining your schema.

Page 127: I will take Tetlock’s slogan and run with it.  On probabilistic thinking, overconfidence, ideology, etc.:

For superforecasters, beliefs are hypotheses to be tested, not treasures to be guarded. - Philip Tetlock

I will make sure to quote it in a few places to spread the word…!

Page 129: Tetlock notes that most superforecasters are comfortable with numbers, but very rarely do advanced math.

Pages 133 – 134: comparing the movie Zero Dark Thirty to the real world, Tetlock notes that what makes good TV/entertainment (a lot of confidence!) doesn’t make good decisions.  There’s an incentives issue here too that’s discussed elsewhere in the book: most organizations tend not to value many-handed superforecasters!

Pages 138 – 139: Here is the punchline about utility, many-handed economists, etc.  Tetlock notes that:

“people equate confidence and competence […]

one study noted, ‘people took such judgments as indications that the forecasters were either generally incompetent, ignorant of the facts in a given case, or lazy, unwilling to expend the effort required to gather information that would justify greater confidence.’”

Ouch.  Sad, but true.  Utility, incentives, overconfidence, probabilistic thinking, and product vs. packaging all interact here.

Page 143B: Tetlock here differentiates between “epistemic” and “aleatory” uncertainty.  I am pretty certain I am not going to remember the names of them, but “epistemic” means something is knowable but unknown, while “aleatory” is the famous “unknown unknown.”  Kind of.

Pages 144 – 146???: Okay, so in the interest of being “actively open-minded,” here is the first (and only!) piece of disconfirming evidence I’ve seen to my general thesis on precision vs. accuracy.  Tetlock notes that

“ordinary forecasters were not usually that precise.  Instead, they tended to stick to the tens […] 30% likely, or 40%, but not 35%, much less 37%.

Superforecasters were much more granular […] the tournament data [… shows…] that granularity predicts accuracy.”

As I said, this is disconfirming evidence of my generally anti-precision belief, and contrasts with, for example, Jordan Ellenberg’s aforementioned criticism of Nate Silver’s .1% precision on his data.

It’s difficult to know how to interpret this data point.  My inclination (which, of course, risks confirmation bias) is to call it a case of correlation vs. causation.  Here is my compelling, coherent narrative (which may of course be nonsense).  Tetlock has noted that “superforecasters” tend to take the BCG “MECE” approach and build a decision tree, coming up with a lot of additive, conditional-probability, scenario-weighted answers; i.e., here’s an example:

The probability that the man was actually poisoned is 70%, and the probability that he wasn’t is 30%.

IF he was poisoned, the probability that polonium was the agent used is 60%, and the probability that it wasn’t is 40%.

IF polonium was used, the probability that it is still detectable this long after death is 85%.

So the total probability of the lab finding polonium is 0.7 x 0.6 x 0.85 = 35.7% (call it 36%).

Those are dumb fake numbers for illustration.  The point is that the process of building a decision tree sort of necessarily leads you to single-digit, or even decimal-point, precision.  
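Written out as code (same dumb fake numbers), the decision-tree math is just the product of the conditional probabilities down the branch you care about:

```python
p_poisoned = 0.70              # fake number: was he poisoned at all?
p_polonium_if_poisoned = 0.60  # fake number: if poisoned, was polonium the agent?
p_detectable_if_used = 0.85    # fake number: if used, is it still detectable?

p_lab_finds_polonium = p_poisoned * p_polonium_if_poisoned * p_detectable_if_used
print(f"{p_lab_finds_polonium:.1%}")  # 35.7% -- call it 36%
```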

That doesn’t mean, of course, that shooting for decimal-point precision – for example, in a valuation model, by trying to forecast every variable precisely over a long time horizon – is going to get you any closer to the right result.  It’s just that in this case, precision was a function of the right thinking process, so it ends up being correlated with the right answer, but it’s not the cause.

I will clean this discussion up a bit and put it in the precision vs. accuracy or correlation vs. causation mental model.

Here is what I can say without equivocating: Tetlock’s citation of Charlie Munger on page 146, with the famous quote about innumeracy making you into a one-legged man in an ass-kicking contest, is absurd and misleading.  Tetlock makes it out to sound like Munger supports precision, leading the paragraph with

“Most people never attempt to be as precise as Brian […] that is a serious mistake.  As the legendary investor Charlie Munger sagely observed…”  

Uh.  What?  No, silly, Munger and Buffett have made a career out of napkin math and you can just go google quotes from both of them if you want to know what they think about precision vs. accuracy.

Page 150B: oooh here’s a fun one: Tetlock goes after philosophy here.  

“A probabilistic thinker will be less distracted by ‘why’ questions and focus on ‘how.’  This is no semantic quibble.  ‘Why?’ directs us to metaphysics; ‘How?’ sticks with physics.”

I’m not really sure I agree with Tetlock that “why” questions aren’t useful.  For example, Benjamin Franklin is quoted in the (terrible) “Benjamin Franklin: An American Life” (BFaAL review + notes) as taking the following perspective on scientific endeavors:

“science should be pursued initially for pure fascination and curiosity, and then practical uses would eventually flow from what was discovered.”  

[…]

“pave the way to some discoveries in natural philosophy of which at present we have no conception… important consequences that no one can foresee.”  

[…]

“It does not seem to me a good reason to decline prosecuting a new experiment which apparently increases the power of man over matter until we can see to what use that power may be applied.  

When we have learned to manage it, we may hope some time or other to find uses for it, as men have done for magnetism or electricity, of which the first experiments were mere matters of amusement.”

But, with that caveat, I generally agree with Tetlock’s general idea that some kinds of knowledge are more useful than others.  Back to what I was saying about need for cognition above – more is not always better (dose-dependency); it really doesn’t matter if we’re all in a simulation.  Nobody cares.  Either way we still have to live in it.  Utility.

Page 152: Believing in fate makes you a bad forecaster.  And bad at other things, too (see the agency mental model, which discusses learned helplessness research going all the way back to Martin Seligman).  Briefly, I’ll re-quote the earlier Peter Thiel quote:

If you expect an indefinite future ruled by randomness, you’ll give up on trying to master it.

Pages 154 – 155???: Again, here, I’ll critique Tetlock a bit by pointing out how what he says doesn’t really translate to the real world.  Tetlock notes that many superforecasters use Google Alerts to follow new data points closely; one political scientist who was invited to participate dismissively called participants “unemployed newsjunkies.”

That guy’s attitude is, in some senses, appropriate – not to take anything away from the superforecasters, but there’s a reasonable utility point to be made here.  Taking the melting sea ice question, for example, it’s probably less relevant what the exact measurement of sea ice on any given day is (unless you’re trying to navigate a boat through the Arctic, I suppose) and more relevant whether or not sea ice is melting over the long term, and what the consequences of that for society might be.  (I haven’t researched climate science whatsoever and am thus unqualified to have an opinion.)

Tetlock does provide some caveats at the bottom of the page, but he doesn’t really discuss the utility or opportunity cost angle with regard to precision or frequent updating here: the superforecasters were optimizing for the lowest Brier score (i.e., the most accurate set of decisions) over a given set of problems.

But of course, that’s an artificial set of constraints.  On page 128 of The Signal and the Noise (SigN review + notes), Nate Silver quotes Dr. Bruce Rose, the principal scientist at The Weather Channel:

“the models typically aren’t measured on how well they predict practical weather elements.  It’s really important if […] you get an inch of rain rather than ten inches of snow. That’s a huge [distinction] for the average consumer, but scientists just aren’t interested in that.”

As Tetlock says elsewhere in the book, you get what you measure – and if you measure for Brier scores, you get Brier scores.  In the real world, many of the GJP questions are useless and trivial, whereas some are probably very important, and many important questions weren’t asked at all.  So, for example, outside of the game, the right answer to the Arctic sea ice question is to use the long term average, call it a day, and move on to some bigger/better question – it’s totally irrelevant what your Brier score on that one is.

It’s obvious when you put it like this because most readers are likely not going to have any trouble not caring about the exact measurement of some sea ice somewhere – but in a business context, altogether too often, business leaders focus on precisely tracking lagging rather than leading indicators.  Does it really matter, on a week-to-week basis, what your market share is of projects that are bid?  Or should you rather be focused on providing great customer service, doing R&D to deliver top-notch products, and making sure your cost structure is low so you can bid aggressively but still deliver solid margins?  

Put in this context, given that time in a day is limited, there’s a huge opportunity cost to having overly-precise, frequently-updated forecasts for stuff that just doesn’t matter in the long run, and you should always keep in mind what the utility – or lack thereof – of any given forecast/decision is, so you can allocate time appropriately.

In my line of work as a value investor, trying to track every leading indicator for, say, comps next quarter for some restaurant, would be a total waste of my time.  It doesn’t matter whether they’re down 3.2% or 1.3% or up 1.2%.  It’s just totally irrelevant. If the difference between success and failure is 100 bps of comps, it’s not a two-foot hurdle and I’m doing it wrong.

I use Google Alerts sometimes, but usually only for “big” stuff – for example, one investment I had was fairly heavily reliant on the availability of FHA mortgages, so if the government made any meaningfully restrictive moves on credit availability via the FHA, it would’ve been a clear negative, and I would have wanted to know that.  So I set up weekly Google Alerts for a variety of relevant search terms. Nothing ever came of it, but I don’t regret doing it.

Pages 158 – 159: Here’s the dumb sea ice question (nobody cares).  The point, though, is that you can either underreact or overreact to new information.

The interesting thing, in my view, is that most people tend to do both.  There’s strong recency bias in the way we process new information, whipsawing our emotions and judgments around, but there’s also strong confirmation bias and (as Tetlock discusses here) schema problems that, in the long term, cause us to underreact to the cumulative weight of new information that contradicts our existing beliefs.  See, among other books, “Mistakes were Made (but not by me)” (MwM review + notes).

Pages 161 – 162: Here’s the ego/identity deal.  Not clear whether it’s Tetlock’s metaphor or someone else’s (it seems familiar), but he visualizes beliefs as Jenga blocks that our identity is built on… we’re willing to swap out the top ones no problem (oh, this is my new favorite restaurant, that old one isn’t as good!) but we’re unwilling to change the ones that are “core” to our belief system because it can be very disorienting.  A lot of my friends who grew up religious went through this.

See Dr. Judith Beck’s “Cognitive Behavior Therapy” (CBT review + notes) for a phenomenal overview of “collaborative empiricist” approaches for modifying beliefs, including how to get to those gnarly core beliefs.

Page 164: I’ll throw Kahneman a bone here.  Tetlock should also have mentioned the “Linda the feminist bank teller” example…

Page 165: here’s what I was talking about re: utility.  To some extent.

Pages 170 – 172: One of the things Tetlock does particularly well is articulate something I’ve had in my head but not put down on paper – the idea of qualitative Bayesian reasoning.  Yay priors.
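Since “qualitative Bayesian reasoning” can sound mystical, here’s the underlying arithmetic in a few lines – generic Bayes’ rule, with example numbers that are mine rather than the book’s:

```python
def bayes_update(prior: float, p_evidence_if_true: float,
                 p_evidence_if_false: float) -> float:
    """Posterior probability of a hypothesis after seeing the evidence."""
    numerator = prior * p_evidence_if_true
    return numerator / (numerator + (1 - prior) * p_evidence_if_false)

# Prior of 30%; the evidence is 3x as likely if the hypothesis is true.
print(f"{bayes_update(0.30, 0.60, 0.20):.0%}")  # 56% -- a meaningful, not wild, update
```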

Pages 174 – 176: Tetlock gets major brownie points by bringing up the growth mindset.  This is one of those few “free lunches” out there…

Pages 181 – 182, 185 – 186: Tetlock notes that clear and timely feedback is the only way we get better.  I discussed this earlier so I won’t do it here.  Decision journaling is a great way to do this; I’ve set up my research process so that I’m almost forced (forcing functions, an example of structural problem solving) to evaluate past mistakes.

Pages 191 – 192: these two pages are a great summary that should be returned to repeatedly until they’re part of your schema.

Page 196: Tetlock notes the deleterious effects of social proof here as it relates to the Bay of Pigs disaster.

Page 200: Here’s an “alternate history” of a more functional group… Socratic questioning.

Page 207: In favor of soliciting outside feedback: teams of both ordinary forecasters and superforecasters outperformed individuals.

Page 210: some more explanation of the extremizing algorithm.

Pages 219, 221: on central planning vs. distributed decision-making.

Page 228!: this is a really important concept that is hard to grasp (it took me a really long time).  How do you bridge the gap between appropriate intellectual humility and believing that you’re good at what you do, without falling into self-doubt?  You can believe you’re better than other people and still be humble.

Page 236B: here is Tetlock’s response to Kahneman’s quasi-nihilism:

“My sense is that some superforecasters are so well practiced in System 2 corrections – such as stepping back to take the outside view – that these techniques have become habitual.  In effect, they are now part of their System 1. 

[…] No matter how physically or cognitively demanding a task may be – cooking, sailing, surgery, operatic singing, flying fighter jets – deliberative practice can make it second nature.

Ever watch a child struggling to sound out words and grasp the meaning of a sentence?  That was you once.  Fortunately, reading this sentence isn’t nearly so demanding for you now.”  

Yes, obviously: habit / conditioning extends beyond just physical actions to thought processes as well.  Dr. Judith Beck’s “Cognitive Behavior Therapy” (CBT review + notes), referenced earlier, notes this as well: once patients are trained in CBT, they automatically do a reality check and “spontaneously (i.e. without conscious awareness) respond to the [automatic] thought in a productive way.”

Vs. Kahneman, at the end of Thinking, Fast and Slow, basically telling the reader (inaccurately) that reading the book was a waste of time because individuals don’t have a very good chance of improving.  Deplorable.

See also Laurence Gonzales’s “Deep Survival” (DpSv review + notes), which touches on habit pretty hard and – among other things – starts out with a fascinating in-depth exploration of the training of fighter pilots (which I think Tetlock was just throwing out there).

Or, see Sunstein/Thaler’s “Nudge” – on page 20, S/T note that Americans use what looks like System 1 to deal with Fahrenheit, but System 2 to deal with Celsius; Europeans, obviously, do the opposite.  This suggests that System 1 is trainable via conditioning.

Pages 237 – 242, pages 244 – 245: and now, Tetlock’s response to Taleb’s brand of nihilism:

“History is not just about black swans […] slow, incremental change can be profoundly important […] a point often overlooked is that […] antifragility is costly […] why not prepare for an alien invasion? […] the answers hinge on probabilities […] judgments like these are unavoidable […] to be sure, in the big scheme of things, human foresight is puny, but it is nothing to sniff at when you live on that puny human scale.”

Opportunity costs, utility, compounding, probabilistic thinking, etc.  Contrast this to Taleb’s unrealistic and utterly unhelpful dogma of, as Tetlock calls it, “what matters can’t be forecast” and “forecasting is bunk.”

Tetlock does believe that counterfactuals are helpful.

Page 254T: Tetlock notes that accuracy is usually only one of the goals of forecasting; he cites an analyst in Brazil who said something that the politicians didn’t like.  See also Mike Mayo’s Exile on Wall Street, which sums up how this affects the sell-side…

Page 257: football coaches used to be sadists!  There is a reference here to idiots like Amy Chua and sensible people like Kim Wong Keltner…

Pages 260 – 261: Here, Tetlock acknowledges utility by stating that “not everything that counts can be counted.”  He points out that numbers are not “sacred totems offering divine insights” – instead, they’re “tools, nothing more.”  It’s a similar conclusion to that reached by Jordan Ellenberg and Nate Silver.  As Silver puts it in “The Signal and the Noise” (SigN review + notes):

“Numbers have no way of speaking for themselves.  We speak for them.  We imbue them with meaning. […] It is when we deny our role in the process that the odds of failure rise.  Before we demand more of our data, we need to demand more of ourselves.”

Tetlock goes a bit further, noting that Brier scores:

“treat false alarms the same as misses.  But when it comes to things like terrorist attacks, people are far more concerned about misses than false alarms.”

He notes, reasonably enough, that you can adjust the scoring to account for this by weighting false positives differently from false negatives.  Marginal utility, margin of safety, precision vs. accuracy.
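One way such an adjustment could look – a hedged sketch, since the weighting scheme here is my illustration rather than the book’s formula: score errors on events that actually occurred more heavily than false alarms.

```python
def weighted_brier(p_yes: float, occurred: bool, miss_weight: float = 3.0) -> float:
    """Brier-style score where errors on actual events count miss_weight times."""
    outcome = 1.0 if occurred else 0.0
    error = 2 * (p_yes - outcome) ** 2  # standard two-outcome Brier score
    return error * (miss_weight if occurred else 1.0)

print(weighted_brier(0.1, True))   # 4.86 -- missing a real attack hurts badly
print(weighted_brier(0.9, False))  # 1.62 -- a false alarm costs less
```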

Page 262: Here, he also acknowledges that the bigger, more fundamental questions sometimes “can’t be scored,” and that the IARPA tournament, in a certain sense, included “little” questions that don’t matter but can be scored.

Page 263: He brings up the intriguing concept of “Bayesian question clustering” and posits that the big, interesting questions are additive versions of the smaller time-dependent ones.

I’m not so sure this is the answer – again, in business/investing, it’s possible to be completely right directionally over the long term, but completely wrong over the short term.

For example, it seems pretty clear that autonomous driving and electric vehicles have a very high probability of being important trends in the automotive sector, and more broadly.  Does it really matter how big or how soon?  No – the cost of missing out, if you’re a business that plays in these areas, is fatal.

In some sense, yes, the probability of X autonomous cars on the road by Y date is literally additive, but that forecast misses the point.  A reasonable expectation for a stock price in 3 years may be the sum of the interim movements, but that doesn’t mean that your forecast for the stock price in 3 years should be based on forecasting the monthly movements…

Page 271: here’s a good example of intellectual humility.

Page 279: more MECE… and also, sort of, Parkinson’s Law.  Not quite though.

Page 292: Note 6 is fun.

Page 299: Tetlock notes that career pressures can drive extreme views… again, nobody likes a wishy-washy many-handed economist.

Page 302: Rosenzweig sighting!  Wish he’d referenced The Halo Effect (Halo review + notes) in the text.

Page 308: See also Mauboussin here – “The Success Equation” (TSE review + notes).

 

First Read: spring 2018

Last Read: spring 2018

Number of Times Read: 1

Planning to Read Again?: nope

 

Review Date: spring 2018

Notes Date: spring 2018