The Book of Why



UCLA computer scientist Judea Pearl has made noteworthy contributions to artificial intelligence, Bayesian networks, and causal analysis. These achievements notwithstanding, Pearl holds some views many statisticians may find odd or exaggerated. Here are a few examples from his latest book, The Book of Why: The New Science of Cause and Effect, co-authored with mathematician Dana Mackenzie. 


“Causality has undergone a major transformation…from a concept shrouded in mystery into a mathematical object with well-defined semantics and well-founded logic. Paradoxes and controversies have been resolved, slippery concepts have been explicated, and practical problems relying on causal information that long were regarded as either metaphysical or unmanageable can now be solved using elementary mathematics. Put simply, causality has been mathematized.” 


“Despite heroic efforts by the geneticist Sewall Wright (1889–1988), causal vocabulary was virtually prohibited for more than half a century…Because of this prohibition, mathematical tools to manage causal questions were deemed unnecessary, and statistics focused exclusively on how to summarize data, not on how to interpret it.” 


“…some statisticians to this day find it extremely hard to understand why some knowledge lies outside the province of statistics and why data alone cannot make up for lack of scientific knowledge.” 


“Even if we choose them at random, there is always some chance that the proportions measured in the sample are not representative of the proportions in the population at large. Fortunately, the discipline of statistics, empowered by advanced techniques of machine learning, gives us many, many ways to manage this uncertainty—maximum likelihood estimators, propensity scores, confidence intervals, significance tests, and so forth.” 


“…we collect data only after we posit the causal model, after we state the scientific query we wish to answer, and after we derive the estimand. This contrasts with the traditional statistical approach...which does not even have a causal model.” 


“If [Karl] Pearson were alive today, living in the era of Big Data, he would say exactly this: the answers are all in the data.” 


“Statisticians have been immensely confused about what variables should and should not be controlled for, so the default practice has been to control for everything one can measure. The vast majority of studies conducted in this day and age subscribe to this practice. It is a convenient, simple procedure to follow, but it is both wasteful and ridden with errors. A key achievement of the Causal Revolution has been to bring an end to this confusion. At the same time, statisticians greatly underrate controlling in the sense that they are loath to talk about causality at all, even if the controlling has been done correctly. This too stands contrary to the message of this chapter: if you have identified a sufficient set of deconfounders in your diagram, gathered data on them, and properly adjusted for them, then you have every right to say that you have computed the causal effect X->Y (provided, of course, that you can defend your causal diagram on scientific grounds).” 


“…until recently the generations of statisticians who followed Fisher could not prove that what they got from the RCT [randomized controlled trial] was indeed what they sought to obtain. They did not have a language to write down what they were looking for—namely, the causal effect of X on Y.” 


“The very people who should care the most about ‘Why?’ questions—namely, scientists—were laboring under a statistical culture that denied them the right to ask those questions.” 


“…statistical estimation is not trivial when the number of variables is large, and only big-data and modern machine-learning techniques can help us to overcome the curse of dimensionality.” 

Pearl also has some harsh words for data science:


“We live in an era that presumes Big Data to be the solution to all our problems. Courses in ‘data science’ are proliferating in our universities, and jobs for ‘data scientists’ are lucrative in the companies that participate in the ‘data economy.’ But I hope with this book to convince you that data are profoundly dumb. Data can tell you that the people who took a medicine recovered faster than those who did not take it, but they can’t tell you why.” 


“…many researchers in artificial intelligence would like to skip the hard step of constructing or acquiring a causal model and rely solely on data for all cognitive tasks. The hope—and at present, it is usually a silent one—is that the data themselves will guide us to the right answers whenever causal questions come up.” 


“Another advantage causal models have that data mining and deep learning lack is adaptability.” 


“Like the prisoners in Plato’s famous cave, deep-learning systems explore the shadows on the cave wall and learn to accurately predict their movements. They lack the understanding that the observed shadows are mere projections of three-dimensional objects moving in a three-dimensional space. Strong AI requires this understanding.” 


“…the subjective component in causal information does not necessarily diminish over time, even as the amount of data increases. Two people who believe in two different causal diagrams can analyze the same data and may never come to the same conclusion, regardless of how ‘big’ the data are.” 

Though he can come across as an Ivory Tower academic whose arguments at times are muddled and contradictory, I suspect few seriously interested in the subject of causation consider Judea Pearl or his work bland or irrelevant. He is always thought-provoking and has much to say that should be heeded, and has provided statisticians and researchers with another set of useful tools for causal analysis. He should also be read because of his influence. In particular, I would recommend his Causality: Models, Reasoning and Inference to researchers and statisticians, though The Book of Why is a gentler introduction to his thinking.

He has not revolutionized the analysis of causation, though, and as noted, many statisticians will probably find at least some of his opinions out of sync with their own perceptions of statistics and their professional brethren, as well as the history of their discipline. He makes many generalizations regarding what “all or most statisticians” would do in a given situation, and then shows us that this would be wrong. He offers no evidence in support of these generalizations, many of which strike me as what a competent statistician would not do. Likewise, some of what he claims is new or even radical thinking may cause some statisticians to scratch their heads since it’s what they've done for years, though perhaps under a different name or no name at all.

His charge that statisticians are focused on summarizing data, ignore the data generating process, and are uninterested in theory and causal analysis is particularly amusing in light of the at times acrimonious discussions between statisticians and data scientists from other backgrounds. He also disregards the myriad complex experimental designs and analyses of data obtained through these designs via ANOVA and MANOVA – which also can become quite complex – that explicitly consider a causal framework. These designs as well as ANOVA and MANOVA have been in use for many decades. Related to this, historically, statisticians have specialized in particular disciplines, such as agriculture, economics, pharmacology and psychology, because subject matter expertise is necessary for them to be effective - statistics is not just mathematics.

More fundamentally, not all statisticians are equally competent, nor do they even define competence in the same way. There are also academic statisticians and applied statisticians, and often wide gulfs between the two. All of this is true of just about any profession. Not all lawyers are ambulance chasers, and practicing attorneys are not clones of their former law school professors.

There is also the sensitive matter that the advice of statisticians has often been ignored by researchers in many fields, and that this frequently is the reason for dubious practices in these disciplines, not deficiencies of statistics itself. For example, statisticians have frequently advised researchers against drawing causal implications based on correlations alone in the absence of sound theory and a causal model based on this theory, often to no avail. Note that this is very different from claiming correlation is irrelevant or that it means no causation is present. It differs starkly from believing causation is immaterial and should not be explored. Statisticians also contribute to the design of research and lengthy discussions about causation are not unusual. Most researchers are interested in the why and statisticians who aren’t are an endangered species.

Furthermore, practitioners, who enormously outnumber Ivory Tower statisticians, are sometimes given data with little background information and asked to find something “interesting.” In effect, they are ordered to data dredge. In these circumstances, any number of ad hoc causal models created manually or with automated software may fit the data about equally well but suggest very different courses of action for decision-makers. This can turn into a tightrope walk. More commonly, they may be given a set of cross tabulations and asked for their interpretation. It could be that something in the data does not make sense to the researcher or that s/he wants a second opinion.
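A minimal sketch of why fit alone cannot settle such questions, using simulated data (the variable names, coefficients and sample size are assumptions chosen for illustration, not taken from any real study): a straight-line model in which X drives Y and one in which Y drives X explain exactly the same share of variance, yet they point to opposite interventions.

```python
import random

random.seed(0)

# Simulated data; in this toy world X really does drive Y.
n = 500
x = [random.gauss(0, 1) for _ in range(n)]
y = [0.8 * xi + random.gauss(0, 0.6) for xi in x]


def mean(values):
    return sum(values) / len(values)


def r_squared(predictor, target):
    """R^2 of the least-squares line target ~ intercept + slope * predictor."""
    mp, mt = mean(predictor), mean(target)
    sxy = sum((p - mp) * (t - mt) for p, t in zip(predictor, target))
    sxx = sum((p - mp) ** 2 for p in predictor)
    syy = sum((t - mt) ** 2 for t in target)
    slope = sxy / sxx
    intercept = mt - slope * mp
    ss_res = sum((t - (intercept + slope * p)) ** 2
                 for p, t in zip(predictor, target))
    return 1 - ss_res / syy


# Model A: "X causes Y" -> regress Y on X.
fit_x_causes_y = r_squared(x, y)
# Model B: "Y causes X" -> regress X on Y.
fit_y_causes_x = r_squared(y, x)

print(f"R^2 assuming X -> Y: {fit_x_causes_y:.4f}")
print(f"R^2 assuming Y -> X: {fit_y_causes_x:.4f}")
```

Both values equal the squared correlation, so the data cannot rank the two stories. Choosing between them takes subject-matter knowledge or an experiment.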

This is a snapshot of the real world of a statistician as I see it, and it is a very different world from the one Pearl sees.

My comments should not be interpreted as suggesting that none of Pearl’s criticisms of statistics and statisticians have merit, though I do find many bizarre. (To be clear, he does hold some statisticians in high regard.) Like Pearl himself, statisticians shouldn’t be deified, and surely a few decades from now some of what is widely accepted as good statistical practice will be widely regarded as boneheaded. Like any field, statistics evolves, often slowly and unsteadily.

The Great Depression, WWII, the Korean War and the Cold War surely had some impact on its historical development. Let’s also not forget that it was not that long ago when calculations were done manually, statisticians had limited empirical data to work with and were unable to conduct Monte Carlo studies, now essential for many statisticians. Still, I agree with Pearl that statistics and causal analysis would have progressed more rapidly if Sewall Wright’s contributions, especially path analysis, had received the attention they merited.

To sum up, there is much real wisdom in Pearl’s writings and, for what my opinion is worth, I would urge statisticians and researchers to read him. In The Book of Why, especially, he provides many vivid examples of how to do research wrong and how to do it better. Apart from his strange opinions about statistics and statisticians, my main criticism of Pearl is that he overstates his case. For example, in my experience path diagrams and DAGs are aids to causal analysis, but not essential. Some people find them confusing. He also appears to have little interest in sampling and data quality, or in the possibility that dissimilar causal models may operate for latent classes.

There are many other books on or related to the analysis of causation. Experimental and Quasi-Experimental Designs (Shadish et al.), Explanation in Causal Inference (VanderWeele), Counterfactuals and Causal Inference (Morgan and Winship), Causal Inference (Imbens and Rubin) and Methods of Meta-Analysis (Schmidt and Hunter) are some others I can recommend. All are quite technical. For the philosophically inclined, The Oxford Handbook of Causation (Beebee et al.) may be the first port of call. Statisticians and others with a good background in statistics may also be interested in this debate which appeared in The American Statistician in 2014.

In closing, a general observation I would offer about the analysis of causation is really a reminder that theories are often constructed (or taken apart) in small bits and pieces over many years through the hard work of many independent researchers. Frequently, these small bits and pieces can only be tested through experimentation since the requisite observational data does not exist.

Most of these experiments are only reported in academic journals and go unnoticed except to specialists working in that area. Many are quite simple and do not require sophisticated mathematics and fancy software. They do not test grand theories that have direct and sweeping implications for public health and welfare. They are the regular guys of research and the unsung heroes of causal analysis. 

Bio: Kevin Gray is President of Cannon Gray, a marketing science and analytics consultancy. He has more than 30 years’ experience in marketing research with Nielsen, Kantar, McCann and TIAA-CREF. Kevin also co-hosts the audio podcast series MR Realities.

Original. Reposted with permission.


Editor: as pointed out by our reader Carlos Cinelli, Judea Pearl has replied to Kevin Gray's article; his reply is reproduced below.

Judea Pearl:
Kevin’s prediction that many statisticians may find my views “odd or exaggerated” is accurate. This is exactly what I have found in numerous conversations I have had with statisticians in the past 30 years. However, if you examine my views more closely, you will find that they are not as whimsical or thoughtless as they may appear at first sight.

Of course many statisticians will scratch their heads and ask: “Isn’t this what we have been doing for years, though perhaps under a different name or no name at all?” And here lies the essence of my views. Doing it informally, under various names, while refraining from doing it mathematically under uniform notation has had a devastating effect on progress in causal inference, both in statistics and in the many disciplines that look to statistics for guidance. The best evidence for this lack of progress is the fact that, even today, only a small percentage of practicing statisticians can solve any of the causal toy problems presented in the Book of Why.

Take for example:

  • Selecting a sufficient set of covariates to control for confounding
  • Articulating assumptions that would enable consistent estimates of causal effects
  • Finding if those assumptions are testable
  • Estimating causes of effects (as opposed to effects of causes)
  • And many more.

Every chapter of The Book of Why brings with it a set of problems that statisticians were deeply concerned about, and have been struggling with for years, albeit under the wrong name (e.g., ANOVA or MANOVA) “or no name at all.” The result is many deep concerns but no solution.

A valid question to be asked at this point is what gives humble me the audacity to state so sweepingly that no statistician (in fact no scientist) was able to properly solve those toy problems prior to the 1980s. How can one be so sure that some bright statistician or philosopher did not come up with the correct resolution of Simpson’s paradox or a correct way to distinguish direct from indirect effects? The answer is simple: we can see it in the syntax of the equations that scientists used in the 20th century. To properly define causal problems, let alone solve them, requires a vocabulary that resides outside the language of probability theory. This means that all the smart and brilliant statisticians who used joint density functions, correlation analysis, contingency tables, ANOVA, etc., and did not enrich them with either diagrams or counterfactual symbols have been laboring in vain, orthogonally to the question: you can’t answer a question if you have no words to ask it. (Book of Why, page 10)

It is this notational litmus test that gives me the confidence to stand behind each one of the statements that you were kind enough to cite from the Book of Why. Moreover, if you look closely at this litmus test, you will find that it is not just notational but conceptual and practical as well. For example, Fisher’s blunder of using ANOVA to estimate direct effects is still haunting the practices of present-day mediation analysts. Numerous other examples are described in the Book of Why and I hope you weigh seriously the lesson that each of them conveys.

Yes, many of your friends and colleagues will be scratching their heads saying: “Hmmm... Isn’t this what we have been doing for years, though perhaps under a different name or no name at all?” What I hope you will be able to do after reading “The Book of Why” is to catch some of the head-scratchers and tell them: “Hey, before you scratch further, can you solve any of the toy problems in the Book of Why?” You will be surprised by the results; I was!

For me, solving problems is the test of understanding, not head scratching. That is why I wrote this Book.

Editor: Kevin Gray has noted Judea Pearl's reply and has provided the follow-up response below.

I appreciate Judea Pearl taking the time to read and respond to my blog post. I’m also flattered, as I have been an admirer and follower of Pearl for many years. For some reason - this has happened before - I am having difficulty with Disqus and have asked Matt Mayo to post this (admittedly hastily written) response on my behalf.
In short, I do not find Pearl’s comments regarding my post substantive, as he essentially restates views I have questioned in the piece.

The suggestion of his opening comment is that I am only superficially acquainted with his views. Obviously, I do not feel this is the case. Also, nowhere in the article do I describe any of Pearl’s views as whimsical or thoughtless. I take them very seriously, or I wouldn’t have bothered writing my blog article in the first place.

Like many other statisticians, including academics, I feel his characterizations of statisticians are inaccurate, however, and that his approach to causation is overly simplistic. That is the essence of my views (and, as I indicated, I am not alone). There are many possible reasons for these discrepancies; I will not speculate here as to Why. 😊

That “you can’t answer a question if you have no words to ask it” is certainly true. Practicing statisticians - as opposed to theoretically focused academics - work closely with their clients, who are specialists in a specific area. The language of that field largely defines the language of causation used in a particular context. That is why there has been no universal causal framework, and why statisticians in psychology, healthcare and economics, for instance, have different approaches and often use different language and mathematics. I myself draw upon all three, as well as Pearl’s. The utility of each, in my 30+ years’ experience, is case-by-case. There are also ad hoc approaches (some questionable).

He claims that “…even today, only a small percentage of practicing statisticians can solve any of the causal toy problems presented in the Book of Why” yet provides no evidence of this. They are not difficult, and claims are not evidence. Furthermore, the examples he gives in “Every chapter of The Book of Why” of the failure of statistics are, in general, not compelling and were part of the motivation for the article in the first place.

Simpson’s paradox in its various forms is something that generations of researchers and statisticians have been trained to look out for. And we do. There is nothing mysterious about it. (This debate regarding Simpson’s paradox, which appeared in The American Statistician in 2014, and which I link in the article, hopefully will be visible to readers who are not ASA members.)
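For readers who have not met the paradox, a minimal sketch in Python shows the reversal. The counts, the stratum labels and the arm names are illustrative assumptions, not data from the article or the debate: the treated arm does better within each severity stratum, yet worse when the strata are pooled.

```python
# Illustrative counts (assumed): (recovered, total) for each arm
# within each severity stratum.
counts = {
    "mild":   {"treated": (81, 87),   "control": (234, 270)},
    "severe": {"treated": (192, 263), "control": (55, 80)},
}


def stratum_rate(stratum, arm):
    """Recovery rate for one arm within one stratum."""
    recovered, total = counts[stratum][arm]
    return recovered / total


def pooled_rate(arm):
    """Recovery rate for one arm, ignoring severity altogether."""
    recovered = sum(counts[s][arm][0] for s in counts)
    total = sum(counts[s][arm][1] for s in counts)
    return recovered / total


def adjusted_rate(arm):
    """Stratum rates re-weighted by each stratum's share of all patients.

    This standardization is the right summary only if severity is a
    confounder (it influences both treatment choice and recovery).
    """
    n_all = sum(counts[s][a][1] for s in counts for a in counts[s])
    return sum(
        stratum_rate(s, arm) * sum(counts[s][a][1] for a in counts[s]) / n_all
        for s in counts
    )


for s in counts:
    print(s, round(stratum_rate(s, "treated"), 3),
          round(stratum_rate(s, "control"), 3))
print("pooled", round(pooled_rate("treated"), 3),
      round(pooled_rate("control"), 3))
print("adjusted", round(adjusted_rate("treated"), 3),
      round(adjusted_rate("control"), 3))
```

Which summary to trust is not something the table can say. If severity were a consequence of treatment rather than a cause of treatment choice, the pooled comparison would be the appropriate one, and settling that takes knowledge of how the data arose.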

Mediation, often confused with moderation, can be a tough nut to crack. Simple path diagrams or DAGs with only a few variables will frequently be inadequate and may badly mislead us. In the real world of a statistician, there are frequently a vast number of potentially relevant variables, including those we are unable to observe. There are also critical variables that, for many reasons, are not included in the data we have to work with and cannot be obtained. Measurement error may be substantial - this is not rare - and different causal mechanisms for different classes of subjects (e.g., consumers) may be in operation. These classes are often unobserved and unknown to us.
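The point about unobserved classes can be sketched with a small simulation. Everything here is assumed for illustration (two equally sized classes, slopes of +1 and -1, the noise level): when the class label is missing from the data, a single pooled regression reports almost no effect even though the effect is strong, in opposite directions, within each class.

```python
import random

random.seed(1)


def simulate_class(n, slope):
    """Simulate y = slope * x + noise for one hypothetical consumer class."""
    xs = [random.gauss(0, 1) for _ in range(n)]
    ys = [slope * x + random.gauss(0, 0.5) for x in xs]
    return xs, ys


def ols_slope(xs, ys):
    """Least-squares slope of y on x."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


# Two latent classes with opposite responses to x.
xa, ya = simulate_class(1000, +1.0)
xb, yb = simulate_class(1000, -1.0)

slope_a = ols_slope(xa, ya)
slope_b = ols_slope(xb, yb)
# What the analyst sees when the class label is not in the data.
slope_pooled = ols_slope(xa + xb, ya + yb)

print(f"class A slope: {slope_a:.2f}")
print(f"class B slope: {slope_b:.2f}")
print(f"pooled slope:  {slope_pooled:.2f}")
```

A diagram drawn from the pooled data alone would show a weak or absent arrow from x to y, which is wrong for every subject in the sample.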

I should make clear that I greatly appreciate Pearl’s efforts at bringing the analysis of causation into the spotlight. I would urge statisticians to read him but also consult (as I do) with other veteran practitioners and academics for their views on his writings and on the analysis of causation. There are many ways to approach causation and Pearl’s is but one. As Pearl himself will be aware, to this day, philosophers disagree about what causation is, thus to suggest he has found the answer to it is not plausible in my view. Real statisticians know better than to look for a Silver Bullet.