Basketball Predictive Analytics: Will he take the shot?
Sports analytics has reached a new level: researchers can now predict whether and from where a basketball player will take a shot. Check out a fun online app that lets you play with the predictions.
At the core of any machine learning method is the question of how to abstract useful knowledge from individual examples. With spatial data in continuous space, these examples are not only individual but often unique: you will almost certainly never see two identical examples, because the space is continuous.
The brief – Predicting NBA players’ behaviours
Yisong Yue and his co-authors tackle an extremely challenging version of this problem in their latest paper, presented at the 2014 IEEE International Conference on Data Mining: given the locations of all the players on the court, how can we predict an NBA player's next move? Will he shoot? Will he pass? To whom?
This is a hard model to build from data, because no single player will ever shoot twice from the exact same location; no single player will play two games in the same way; no two games will have the same team strategies or progression; etc.
And even if you were to divide the court into a grid, and to slice time into consistent segments, you would still end up confronted by the reality that no player is ever going to be in the same situation twice: because the score is different, because his teammates and opponents are in different positions, because he received the ball from a different angle or because of any other factor.
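To make the gridding idea concrete, here is a minimal sketch of binning shot locations into court cells. It is not taken from the paper: the coordinate convention (feet, origin at a baseline corner of a half court), the function names and the five shot locations are all assumptions made for illustration.

```python
from collections import Counter

# An NBA court is 94 x 50 feet, so a half court is 47 x 50 feet.
HALF_COURT_LENGTH_FT = 47.0
COURT_WIDTH_FT = 50.0


def grid_cell(x, y, cell_ft):
    """Map a continuous location (feet, 0 <= x <= 50, 0 <= y <= 47) to a (column, row) cell."""
    last_col = int(COURT_WIDTH_FT // cell_ft) - 1
    last_row = int(HALF_COURT_LENGTH_FT // cell_ft) - 1
    # Clamp so that points lying exactly on the far boundary fall in the last cell.
    return min(int(x // cell_ft), last_col), min(int(y // cell_ft), last_row)


def shot_counts(shots, cell_ft):
    """Count how many shots fall into each grid cell."""
    return Counter(grid_cell(x, y, cell_ft) for x, y in shots)


# Hypothetical shot locations for one player (made up for this example).
shots = [(25.0, 5.2), (26.4, 6.9), (3.1, 4.0), (4.8, 3.6), (24.9, 19.5)]

fine = shot_counts(shots, cell_ft=1.0)    # every shot lands in its own cell
coarse = shot_counts(shots, cell_ft=5.0)  # nearby shots start to share cells

print(len(fine), len(coarse))  # 5 3
```

A coarser grid pools nearby shots, but it still says nothing about the score, the defenders or the pass that led to the shot, which is why gridding alone does not solve the problem.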
Does that mean that we cannot predict whether Tim Duncan is about to take a 3-point shot? No, it just means that we have to be clever and particularly careful in developing learning methods.
The core – Matrix factorization
The ideas developed by this team from Disney Research (yes, Disney is interested in basketball) rely on a method called matrix factorization, which you may have heard about in association with ‘recommender systems’. Matrix factorization is the keystone of systems that predict the rating that you would give to, say, a movie. In that context, most movie databases contain hundreds of thousands of titles, but we can only expect each user to rate a few dozen of them.
The idea underlying such recommender systems is that, even if no two people will ever have watched exactly the same movies and given them identical ratings, we can assume that there is a set of underlying movie preferences, tastes, or 'latent variables' that can be identified from previous ratings in order to predict how each individual might rate other movies later.
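The latent-variable idea can be sketched in a few lines. The example below is a generic matrix factorization trained with stochastic gradient descent on a tiny, made-up ratings table; it is an illustration of the recommender-system technique, not the model from the paper. The function names, hyperparameters and ratings are all assumptions.

```python
import random


def factorize(ratings, n_users, n_items, n_factors=2, lr=0.02, reg=0.02,
              epochs=2000, seed=0):
    """Fit user and item latent factors to the observed ratings with SGD."""
    rng = random.Random(seed)
    users = [[rng.uniform(0.1, 1.0) for _ in range(n_factors)] for _ in range(n_users)]
    items = [[rng.uniform(0.1, 1.0) for _ in range(n_factors)] for _ in range(n_items)]
    observed = list(ratings.items())
    for _ in range(epochs):
        rng.shuffle(observed)
        for (u, i), rating in observed:
            err = rating - predict(users, items, u, i)
            for k in range(n_factors):
                pu, qi = users[u][k], items[i][k]
                # Gradient step on the squared error with L2 regularization.
                users[u][k] += lr * (err * qi - reg * pu)
                items[i][k] += lr * (err * pu - reg * qi)
    return users, items


def predict(users, items, u, i):
    """Predicted rating: dot product of the user's and the item's latent factors."""
    return sum(pu * qi for pu, qi in zip(users[u], items[i]))


# Hypothetical ratings, keyed by (user, movie). Users 0-1 prefer movies 0-1,
# users 2-3 prefer movies 2-3. The pairs (0, 1) and (3, 3) are left unrated.
ratings = {
    (0, 0): 5, (0, 2): 1, (0, 3): 1,
    (1, 0): 5, (1, 1): 5, (1, 2): 1, (1, 3): 1,
    (2, 0): 1, (2, 1): 1, (2, 2): 5, (2, 3): 5,
    (3, 0): 1, (3, 1): 1, (3, 2): 5,
}

users, items = factorize(ratings, n_users=4, n_items=4)

squared_errors = [(r - predict(users, items, u, i)) ** 2 for (u, i), r in ratings.items()]
train_rmse = (sum(squared_errors) / len(squared_errors)) ** 0.5

# Ratings the model never saw, filled in from the shared latent tastes.
unseen_0_1 = predict(users, items, 0, 1)
unseen_3_3 = predict(users, items, 3, 3)
print(round(train_rmse, 3), round(unseen_0_1, 2), round(unseen_3_3, 2))
```

User 0 never rated movie 1, yet the model can still score it, because user 0 agrees with user 1 on every movie they both rated. Swap "users" for players and "movies" for court locations and you have the intuition behind the basketball version.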
Just as disparate individuals share movie preferences, NBA players may share similar shooting preferences and other on-court behaviours. We all know that many shots are taken around the three-point line, and many others right under the basket. Yue et al. make the assumption that the singular behaviour of each NBA player is a composition of some tactics that are common among the whole pool of NBA players. Three of the tactics that they automatically learn from data are depicted below; basketball fans will immediately recognize three standard shooting positions: Corner 3’s, High paint and Straight-up 2’s.
For example, Tim Duncan customarily occupies a central place either in the “High paint” (middle picture above) or in the “Straight-up 2’s” (right picture above).
Jeremy Lin is known to either shoot from the “Corner 3’s” (left picture above), or from the “High paint”. The pictures below depict how these preferences might be composed to explain individual shooting behaviour.
In the paper, the authors show how to infer the set of standard shooting preferences that will then consistently explain individual players' shooting behaviours. After methodically decomposing the different elements that might influence a player's behaviour, including his shooting preferences given his own position and the positions of his teammates and opponents, they are able to predict whether the player holding the ball is going to shoot or pass (and to whom). All of these predictions are available in real time as the game is in motion.
The authors have even developed an online app that lets us play around with this prediction system on a Spurs vs Lakers game with the 2012-2013 rosters. The very first picture in this article is a screenshot of the app; the thickness of each edge represents the likelihood of shooting (or passing) as you move the different players. Very cool.
Starting from data with the positions of the players and information about shots and passes alone, I would say that they have managed to abstract some very interesting knowledge. The next step is adding team strategies to the mix, and I think that they are already on it. This could well prove useful for the next March Madness!
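As a rough illustration of how shared shooting preferences might be inferred, the sketch below applies a textbook non-negative matrix factorization (multiplicative updates for squared error) to a small table of shot counts. This is a stand-in technique chosen for illustration; the paper's actual model is richer and may differ. The players, court zones and counts are all hypothetical.

```python
import random


def matmul(A, B):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


def squared_error(V, W, H):
    """Sum of squared differences between V and its reconstruction W x H."""
    WH = matmul(W, H)
    return sum((v - wh) ** 2 for rv, rwh in zip(V, WH) for v, wh in zip(rv, rwh))


def nmf(V, n_factors, iters=200, seed=0, eps=1e-9):
    """Factor V (players x zones) into non-negative W (players x factors) and H (factors x zones)."""
    rng = random.Random(seed)
    W = [[rng.uniform(0.1, 1.0) for _ in range(n_factors)] for _ in range(len(V))]
    H = [[rng.uniform(0.1, 1.0) for _ in range(len(V[0]))] for _ in range(n_factors)]
    for _ in range(iters):
        # H <- H * (W^T V) / (W^T W H), element-wise
        Wt = transpose(W)
        num, den = matmul(Wt, V), matmul(matmul(Wt, W), H)
        H = [[h * a / (b + eps) for h, a, b in zip(rh, ra, rb)]
             for rh, ra, rb in zip(H, num, den)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        Ht = transpose(H)
        num, den = matmul(V, Ht), matmul(W, matmul(H, Ht))
        W = [[w * a / (b + eps) for w, a, b in zip(rw, ra, rb)]
             for rw, ra, rb in zip(W, num, den)]
    return W, H


# Hypothetical shot counts: one row per (made-up) player, one column per zone.
zones = ["left corner 3", "right corner 3", "high paint", "straight-up 2", "under basket"]
shot_table = [
    [30, 28, 2, 1, 5],     # player_a: mostly corner threes
    [1, 2, 40, 35, 20],    # player_b: mostly inside the arc
    [15, 14, 20, 18, 12],  # player_c: a mix of both
    [28, 30, 22, 3, 4],    # player_d: corners plus high paint
]

player_weights, patterns = nmf(shot_table, n_factors=2)
error = squared_error(shot_table, player_weights, patterns)

for k, pattern in enumerate(patterns):
    top_zone = zones[max(range(len(zones)), key=lambda j: pattern[j])]
    print(f"pattern {k}: strongest zone = {top_zone}")
```

Each row of `patterns` plays the role of a shared shooting preference over court zones, and each row of `player_weights` says how much a given player draws on each one. Because every entry is non-negative, a player's map is a purely additive blend of the shared patterns, which is what makes them easy to read as pictures.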
Bio: François Petitjean completed his PhD working for the French Space Agency in 2012, and is now a Research Fellow in Geoff Webb’s team at Monash University’s Centre for Data Science. He tweets at @LeDataMiner.