Why Germany did not defeat Brazil in the final, or Data Science lessons from the World Cup

We review World Cup predictions (all failed), examine what makes such events difficult to predict, and suggest 3 golden rules to determine when you can trust the predictions.



(This post was written jointly with Gregory Piatetsky.)

The 2018 World Cup is over, with France defeating Croatia 4-2 in the final. It was a great match, to end a brilliant tournament, with the French deserved winners.

Now is the time for reflection, as we look back on the predictions we made from the beginning of the tournament.

2018 FIFA World Cup Final, France vs Croatia.

Review

So how did it go? Will we be switching careers to become full-time betting experts? Or will we be cursing the fact that things put out on the Internet stay there forever?

Fig. 1: Expected World Cup 2018 Brackets, with Germany vs Brazil in the final, as predicted by KDnuggets before the tournament started.

Well, let's just say it was mixed. 13 of the last 16 (81.25%) were correctly predicted, with only Poland, Germany and Egypt missing out and Japan, Sweden and hosts Russia taking their places.

At the quarter-final stage, 4 of the 8 teams were correctly predicted (50%), but from that point on it was a bit of a horror show. France were the only semi-finalists correctly predicted and in the predicted Brazil-Germany final, one of the sides went out at the quarter-final stage and the other didn’t even make it out of the group!

It wasn’t just us that got it wrong though; the FiveThirtyEight predictions had Brazil (19%), Spain (17%) and Germany (13%) all ahead of France (8%) as the winners. Gracenote’s predictions had the same three sides and even Argentina ahead of France. It turns out predicting something with as many variables as the World Cup can be difficult.
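Those percentages are worth reading carefully. A minimal sketch, using only the four FiveThirtyEight figures quoted above and the fact that only one team can win the tournament, shows how much probability such a forecast leaves for everyone other than the favourites. The function name is ours, chosen for illustration.

```python
# Win probabilities as quoted above from the FiveThirtyEight forecast.
win_prob = {"Brazil": 0.19, "Spain": 0.17, "Germany": 0.13, "France": 0.08}


def prob_none_win(probs: dict, teams: list) -> float:
    """Probability that none of the listed teams wins the tournament.

    Only one team can win, so the events are mutually exclusive
    and their probabilities can simply be added.
    """
    return 1 - sum(probs[team] for team in teams)


if __name__ == "__main__":
    favourites = ["Brazil", "Spain", "Germany"]
    print(f"None of {favourites} wins: {prob_none_win(win_prob, favourites):.0%}")
    print(f"France does not win: {prob_none_win(win_prob, ['France']):.0%}")
```

On these numbers, the three favourites combined had slightly less than an even chance of producing the winner, so even the forecast itself treated a title for any single favourite as unlikely.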

Lessons

So why did everyone get it so wrong? Here are some lessons we’ve learned:

Human aspect

Human behaviour has a lot of randomness, so using data science to predict it is difficult and offers limited accuracy. One example from the World Cup is the mistake by Germany goalkeeper Manuel Neuer that led to a goal in their 2-0 defeat to South Korea.

(Another is French goalkeeper Lloris's mistake, which led to the second Croatia goal in the final.) Events like these are impossible to predict, as are own goals and other errors; they simply come down to human behaviour.

External factors

Sport in general involves a lot of external factors that can affect results. In football (soccer), for example, the result may be affected by an unfair referee, adverse weather conditions, the climate, the players' personal lives and much more. It's very tricky to factor in these features, as they can be difficult to measure and collect.

Individual Events

Predicting the results of the entire tournament requires predicting all the separate matches, and randomness tends to compound across them. The knockout nature of the World Cup makes it harder to predict, as one defeat can send a team home.
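The compounding effect can be made concrete with a rough sketch. It assumes, purely for illustration, that every knockout match is called correctly with the same probability and that matches are independent of each other. Neither assumption holds in a real tournament (a wrong early pick also breaks later rounds of the bracket), and the accuracy values below are hypothetical.

```python
# Illustrative only: assumes each match is predicted independently
# with the same per-match accuracy (a simplification).

# Round of 16, quarter-finals, semi-finals, final (third-place play-off excluded).
KNOCKOUT_MATCHES = 8 + 4 + 2 + 1


def prob_all_correct(per_match_accuracy: float, n_matches: int) -> float:
    """Probability of calling every one of n_matches correctly."""
    return per_match_accuracy ** n_matches


if __name__ == "__main__":
    for accuracy in (0.6, 0.7, 0.8, 0.9):  # hypothetical per-match accuracies
        p = prob_all_correct(accuracy, KNOCKOUT_MATCHES)
        print(f"per-match accuracy {accuracy:.0%} -> perfect bracket {p:.2%}")
```

Under these assumptions, even a forecaster who is right 80% of the time on each match has less than a 4% chance of a perfect knockout bracket.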

Group behaviour

Predicting sports with individual competition, like baseball or chess, is easier than predicting team events.

Data science has limited accuracy when predicting group behaviour. Because team composition changes all the time in soccer, we cannot draw many conclusions about Germany's performance today from Germany's performance 12 years ago.

Uncertainty range

Many predictions are presented without a range of uncertainty. This range measures the lack of certainty: a state of limited knowledge in which it is impossible to describe exactly the existing state or a future outcome, or in which more than one outcome is possible.
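As one illustration of what such a range can look like, the sketch below attaches a 95% Wilson score interval to the "13 of the last 16" figure from the review above. Treating the 16 qualifiers as independent trials is our simplifying assumption, and the Wilson interval is just one common choice; it is not how any of the forecasts discussed here were produced.

```python
import math


def wilson_interval(successes: int, trials: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (z = 1.96 is about 95%)."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    p_hat = successes / trials
    denom = 1 + z**2 / trials
    centre = (p_hat + z**2 / (2 * trials)) / denom
    half_width = (
        z * math.sqrt(p_hat * (1 - p_hat) / trials + z**2 / (4 * trials**2)) / denom
    )
    return centre - half_width, centre + half_width


if __name__ == "__main__":
    low, high = wilson_interval(13, 16)  # 13 of the last 16 predicted correctly
    print(f"point estimate: {13 / 16:.1%}, 95% interval: {low:.1%} to {high:.1%}")
```

With only 16 observations the interval is wide, which is the point: the single figure of 81.25% says much less about underlying accuracy than it appears to.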

Rules

With all that in mind, here are our three golden rules for knowing when to trust predictions:

  • If there are mathematical laws (e.g., for games of chance like a fair coin or dice) or physical laws (for example in astronomy, where the positions of planets can be predicted very precisely).
  • If there is a lot of data on the same type of entity. Note that the Brazil team of 2010 isn't going to be the same as the Brazil team of 2018.
  • If the predictions include a range of uncertainty, which usually indicates good work with solid statistical foundations. When only a single number is provided without a standard deviation, it is probably more for entertainment, and you shouldn't trust it.
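To illustrate the first rule, here is a small sketch of a case where mathematical laws do apply: the sum of two fair six-sided dice. Assuming the dice are fair and independent, the full distribution can be computed exactly by enumeration, with no historical data needed. The function name is ours, chosen for illustration.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def two_dice_distribution() -> dict[int, Fraction]:
    """Exact distribution of the sum of two fair six-sided dice."""
    # Enumerate all 36 equally likely (first die, second die) outcomes.
    counts = Counter(a + b for a, b in product(range(1, 7), repeat=2))
    return {total: Fraction(n, 36) for total, n in sorted(counts.items())}


if __name__ == "__main__":
    for total, prob in two_dice_distribution().items():
        print(f"sum {total:2d}: {prob} ({float(prob):.1%})")
```

Nothing comparable exists for a football match, which is why the second and third rules matter so much more there.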

Conclusion

This experience highlights how limited data science can be when predicting something controlled by human behaviour. It’s abundantly clear that these predictions are more for entertainment purposes, or to give a rough estimate, rather than an exact science.
