On Why Sequels Are Bad and Red Light Cameras Aren’t As Effective As They Seem
Regression to the mean is a statistical phenomenon whereby extreme observations tend to move (regress) towards the mean on subsequent readings. Essentially a result of selection bias, it fools us into crediting our interventions for changes that would have happened anyway.
By Thomas Speidel, Suncor Energy.
Daniel Kahneman, in his wonderful book Thinking, Fast and Slow, tells the story of a young Kahneman lecturing Israeli flight instructors on ways to improve training. Kahneman’s proposition was based on evidence that rewarding performance is more effective than punishing mistakes.
Kahneman then recounts how his stance was challenged by a seasoned officer: “On many occasions I have praised flight cadets for clean executions of some aerobatic manoeuvres. The next time they try the same manoeuvre they usually do worse. On the other hand, I have often screamed into a cadet’s earphone for bad execution, and in general he does better on his next try. So please don’t tell us that reward works and punishment does not, because the opposite is the case.”
Daniel Kahneman (http://livestream.com/TheNewSchool/DanielKahneman)
Let’s fast forward a few decades. Red light cameras are a common sight in large metropolitan areas (Calgary, Canada, where I live, has roughly 50 such cameras). Their claimed effectiveness in decreasing accidents is often supported by overly optimistic and faulty analyses. City officials, concerned about safety, select intersections with high accident rates as candidates for red light cameras. The next year accidents go down, and the officials conclude that red light cameras work.
As Kahneman has it, the seasoned air force officer and our city official have “attached causal interpretation to the inevitable fluctuations of a random process.” The conclusions they reached about the effectiveness of screaming into cadets’ ears or installing a red light camera were either wrong or overly optimistic. Both were the result of regression to the mean.
What is Regression to the Mean?
Regression to the mean is a statistical phenomenon whereby extreme observations tend to decrease (regress) towards the mean on subsequent readings. It is essentially a result of selection bias (another statistical phenomenon): a cadet or an intersection is selected for intervention (screaming, a red light camera) because we have observed something extreme.
In this graphical simulation (be patient, as it may take a while to load), the red circles on the left panel represent the extreme observations we selected as a basis for action: the air cadet performing a bad manoeuvre, the intersections experiencing the highest accident rates. On the right panel we see the pre- vs. post-intervention difference. Notice how the line is always decreasing, that is, regressing to the mean.
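The same selection effect can be reproduced in a few lines of code. The sketch below (all numbers are invented for illustration) simulates many intersections with stable underlying accident rates, selects the "worst" ones on a single noisy yearly reading, and shows that the selected group improves the following year with no intervention at all:

```python
import random

random.seed(42)

# Each "intersection" has a true long-run accident rate, but any one
# year's count is that rate plus noise. (Made-up numbers.)
n = 1000
true_rates = [random.gauss(50, 5) for _ in range(n)]

def observe(rate):
    """One noisy yearly observation around the true rate."""
    return rate + random.gauss(0, 10)

year1 = [observe(r) for r in true_rates]
year2 = [observe(r) for r in true_rates]

# Select the worst intersections based on year 1 alone,
# as a city official choosing camera sites would.
worst = sorted(range(n), key=lambda i: year1[i], reverse=True)[:50]

mean_y1 = sum(year1[i] for i in worst) / len(worst)
mean_y2 = sum(year2[i] for i in worst) / len(worst)

print(f"Selected group, year 1: {mean_y1:.1f}")
print(f"Selected group, year 2: {mean_y2:.1f} (regressed toward the mean)")
```

The selected group looks markedly better in year two purely because part of its extreme year-one reading was noise, not because anything about the intersections changed.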
Regression to the mean is an incredibly widespread phenomenon in all spheres of life. Martin Bland (2004) provides us with some everyday examples.
- Hollywood sequels tend to regress to the mean because a sequel is generally commissioned after a highly successful first release (an extreme observation), so the quality of the sequel will tend to decrease anyway.
- A superstitious person may believe that appearing on the cover of a famous magazine jinxes that person’s success. But this, as we have seen, is a result of regression to the mean: a person is unlikely to make it onto the cover of a famous magazine unless they are an extreme observation, and so their performance will naturally regress towards the mean.
“Poor performance was typically followed by improvement and good performance by deterioration, without any help from either praise or punishment.”
What can we do about it?
One of the best ways to tackle the problem at its source is by designing a robust experiment or analysis.
- In a past post, I described the problems with attributing too much causal importance to time. If we introduce some change at a known point in time and then compare before and after, it is very hard to conclude whether the change was effective. Part of that difficulty comes from regression to the mean. We can often do better by having a control group to compare against.
- Bland suggests taking duplicate baseline measurements. Suppose in our red light camera example we compared accident rates between 2010 (no red light camera) and 2013 (red light camera). Taking two sets of measurements at baseline, say 2010 and 2012, can help (it can also increase the accuracy of the effect estimates).
- Numerous statistical methodologies exist to either measure or circumvent regression to the mean, including many Bayesian methods.
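Bland’s duplicate-baseline suggestion can be sketched in code as well. In the simulation below (again with invented numbers, and with no actual intervention applied), averaging two baseline years dilutes the noise that drove the selection, shrinking the spurious “improvement” that a single-baseline comparison would report:

```python
import random

random.seed(1)

# Many "intersections" with stable true accident rates; each year's
# count is the true rate plus noise. (Made-up numbers.)
n = 1000
true_rates = [random.gauss(50, 5) for _ in range(n)]
noise = lambda: random.gauss(0, 10)

y2010 = [r + noise() for r in true_rates]   # baseline 1
y2012 = [r + noise() for r in true_rates]   # baseline 2
y2013 = [r + noise() for r in true_rates]   # "post-intervention" year

# Naive analysis: select camera sites on the single 2010 baseline.
worst = sorted(range(n), key=lambda i: y2010[i], reverse=True)[:50]

# Apparent accident drop, two ways of measuring the baseline.
single = sum(y2010[i] - y2013[i] for i in worst) / len(worst)
dual = sum((y2010[i] + y2012[i]) / 2 - y2013[i] for i in worst) / len(worst)

print(f"Apparent drop vs single baseline:    {single:.1f}")
print(f"Apparent drop vs averaged baselines: {dual:.1f}")
```

Since nothing was actually done to these intersections, both “drops” are pure regression to the mean; the averaged baseline roughly halves the bias because only one of the two baseline years carries the selection noise.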
“The feedback to which life exposes us is perverse. Because we tend to be nice to other people when they please us and nasty when they do not, we are statistically punished for being nice and rewarded for being nasty.”
Why is it a problem?
Regression to the mean is a problem because it, and not something we did, may be responsible for a change. A business initiative, an intervention, a new program, policy, or tool aimed at addressing a problem may be wasteful and ineffective at worst, or overly optimistic at best. Either way, the illusion that what we did worked quickly disappears. In many organizations we see a trail of short-lived initiatives that seem to work until they don’t, at which point a new one is created: a vicious cycle.
Bio: Thomas Speidel, P.Stat., is a Statistician working as Data Scientist for Suncor Energy in Calgary, Alberta. He spent nearly ten years working in cancer research before moving to the “sandbox” of the energy industry.