# Predicting the London Olympics Medal Count and the Math behind It

Is it possible to predict how many medals each nation will win? Some surprising factors emerge as predictors of the number of medals.

By Dan Graettinger, DiscoveryCorps, July 2012

Will the United States retain its position as the top medal-winning nation at this year's Olympic Games in London, or will up-and-coming China capture the crown? Is it possible to predict how many medals each nation will win? Why do some countries take home a bundle of medals while others take home none at all? And what is it about a nation that allows it to produce Olympic medal-winning athletes?

It was these latter two questions that intrigued me the most. If we look at the medal counts for the two most recent Olympic Games (see Table 1), we see that the top two nations are the U.S. and China, which happen to be the 3rd and 1st most populous nations in the world. So population seems to be important. But where is India, the world's second most populous nation? Maybe wealth is the key factor. That seems to fit. A lot of the nations at the top of the list are the wealthier nations of the world. But how did Cuba and Belarus rank so high? The more we think about it, the clearer it becomes that the whys behind the medal counts at the Olympics are complex. Fortunately, I'm a data miner, and my job is to find patterns in data and use those patterns to predict future events. And trying to predict the 2012 Olympic medal counts using data mining methods was too tempting to pass up!

Since the puzzle I wanted to solve focused on the characteristics of nations that lead to their success at the Olympics, I took a top-down approach -- looking purely at national measures. However, there are other ways to project the medal counts. A bottom-up approach would look at the top athletes in each event, assess their recent results, and assign individual odds of winning a medal. Then you can sum those individual odds across all 29 sports to get national totals. Since the nation-focused perspective would give more explanatory power and insight into the "why" questions that captured my imagination, I chose that approach.
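To make the bottom-up alternative concrete, here is a minimal Python sketch of the "sum the individual odds" step. It is not the method used in this article. The nation names, events, and probabilities are all invented for illustration, and `expected_medals` is a hypothetical helper name.

```python
from collections import defaultdict

# Hypothetical (nation, event, probability of winning a medal) entries.
# Every value here is invented purely to illustrate the arithmetic.
athlete_odds = [
    ("Nation A", "event 1", 0.60),
    ("Nation A", "event 2", 0.75),
    ("Nation B", "event 1", 0.25),
    ("Nation B", "event 3", 0.80),
    ("Nation C", "event 4", 0.90),
]


def expected_medals(odds):
    """Sum per-athlete medal probabilities into an expected total per nation."""
    totals = defaultdict(float)
    for nation, _event, probability in odds:
        totals[nation] += probability
    return dict(totals)


projection = expected_medals(athlete_odds)

# Print nations from highest to lowest expected medal total.
for nation, total in sorted(projection.items(), key=lambda kv: -kv[1]):
    print(f"{nation}: {total:.2f} expected medals")
```

The sum of individual probabilities is an expected count, so a nation with many modest chances can project higher than a nation with one near-certain medalist.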

To project the medal counts using the top-down method, I first needed to compile data on the nations of the world that might shed some light on what makes a difference in the medal count. On the one hand, I wanted to collect data that my intuition said was important, like population, wealth, and development level. On the other hand, I wanted to hold the door open for other categories of data that could have an impact, like geography, history, religion, political organization, and personal freedoms. By linking each nation's data with its Olympic outcomes, perhaps patterns would emerge that would allow for a mathematical model to be created that would be predictive, while simultaneously giving insights that would answer my questions. (See Table 2 at the end of this article for the full list of variables and their sources that went into the dataset.)

For statistical reasons, we decided to try to predict which nations would win two or more medals. This would help eliminate some statistical "noise" in the data where a nation might win a medal due to a single outstanding individual. After that, we compared each of the variables against the outcome of winning two or more medals. This allowed us to find those characteristics of a nation that do and do not connect strongly with their medal count. So let's take a look at some of the expected, the sensible, and the downright head-scratching characteristics of a nation that relate to its ability to produce world champion athletes.
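The screening step described above can be sketched in Python as follows. The article does not say which statistic was used to compare each variable against the outcome, so the Pearson correlation with a 0/1 target is an assumption here, and the records are toy values invented for illustration, not the article's dataset.

```python
import math

# Hypothetical toy records. All values are invented for illustration.
nations = [
    {"name": "A", "medals": 40, "population_m": 300.0, "gdp_per_capita_k": 45.0},
    {"name": "B", "medals": 12, "population_m": 60.0, "gdp_per_capita_k": 35.0},
    {"name": "C", "medals": 2, "population_m": 10.0, "gdp_per_capita_k": 6.0},
    {"name": "D", "medals": 1, "population_m": 5.0, "gdp_per_capita_k": 50.0},
    {"name": "E", "medals": 0, "population_m": 2.0, "gdp_per_capita_k": 20.0},
    {"name": "F", "medals": 0, "population_m": 25.0, "gdp_per_capita_k": 1.5},
]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Binary outcome from the article: did the nation win two or more medals?
target = [1 if row["medals"] >= 2 else 0 for row in nations]

# Score each candidate variable by its correlation with the binary outcome
# (an assumed screening statistic, not necessarily the one the author used).
features = ("population_m", "gdp_per_capita_k")
scores = {f: pearson([row[f] for row in nations], target) for f in features}

for feature, score in sorted(scores.items(), key=lambda kv: -abs(kv[1])):
    print(f"{feature}: r = {score:+.2f}")
```

Ranking variables by the absolute value of the score gives a first-pass list of candidates. As noted elsewhere in this article, a high score shows association, not causation.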

## What Does Matter

• The single characteristic most closely associated with winning Olympic medals is ... number of internet users. My initial reaction was, "What the heck??!!" This is a good time to point out that good predictors may not actually cause the outcome, but rather go together with (correlate with) the outcome. After further thought, I realized that the number of internet users does tell us a lot about a country. The people are wealthy enough to afford computers and internet access. The population of the country is relatively large (since this piece of data measured the total number of users, not users per capita). Finally, the people have enough free time on their hands to engage in non-subsistence-related activities, like participating in sports or surfing the net!
• Total Gross Domestic Product - Here again we see an indication that a nation's wealth helps it to produce elite athletes. What's intriguing, though, is that the total GDP for the nation was far more predictive than GDP per capita. For example, in 2008, China had the second highest national GDP in the world, as well as the second most medals at the Olympics. Yet China's GDP per capita ranked it 134th in the world, behind nations like Thailand, Tunisia, and El Salvador. One possible explanation is that China's communist government, having access to the great combined wealth of the nation, diverted enough funds to its government-sponsored athlete development program to overwhelm the relative poverty of that nation's individuals.
• Total Population - Now that makes sense! With all else being equal, the more individuals a nation has, the more outstanding individuals there ought to be.
• Overall Economic Freedom - Each year the Heritage Foundation publishes a chart ranking nations on various aspects of freedom.

Isaac Nelson
I think each country has its own way of winning medals, and that the two most commonly cited reasons for the US's Olympic success (population and GDP) are important, but only to a point. I'll use swimming as my main example because I'm very familiar with the sport.

Population:
The main benefit of a larger population is that it gives you a larger number of people who are suited, genetically and dispositionally, to a sport. Michael Phelps is a collection of swimming-beneficial genetic abnormalities that couple with his hypercompetitive nature to create the perfect Olympic swimmer. The physical factors are genetic, and often so is the personality. Having a larger gene pool does give a country a better chance of producing such a specimen.

This advantage of a large population is lessened by two factors. The first is that only two competitors can compete for a country in any individual event, no matter how many athletes from that country can meet the qualifying standard. Most countries cannot muster enough athletes who can compete at the Olympic level in a wide range of sports. If you look at this year's US swim team trial times, on the other hand, you will see that in most events, there were 5, 6, even 7 swimmers who passed the Olympic standard time, yet were left out of the events because they were not in the top 2. Often, the difference between making the team and going home was a matter of hundredths of a second, which means that American athletes with legitimate medal potential don't even make the team, except in relays as the B team, and even then they only swim in the preliminaries to let the stars rest, and sit out during the medal races. That's really where the population advantage kicks in, in relays and team sports, because of the depth population gives. But even then, the medal count advantage is limited, because a gold in basketball or baseball only counts as one gold for a country, not five or nine golds. In individual races, the advantages are far less pronounced. A single athlete from a small nation has just as much chance as an athlete from a large nation.
Saturday, July 28, 2012, 3:02:33 AM

Isaac Nelson
Second: Because a limited number of athletes can make the team, this has the effect of tightening up the elite programs with superior coaches and training methods. There is less reason to take chances on a larger pool of athletes with less proven talents, which has had the effect of making Olympic sports into niche sports. The US elite swim programs are the same size as the Australian elite programs, even though the US is ten times the size of Aus in population terms. The US might have a larger gene pool to draw from, but that raises the question: how many people does a country need to field a team of fifteen men with the genes and drive to be Olympians? 300 million, like the US? 30 million, like in Aus?

Final note on population: to demonstrate the absurdity of comparing population to medals on a straight scale, consider the following scenario: Let's say that Barbados, with its population of 284,000, wins one gold medal in sprints. They don't win any more medals in that Olympiad. That's a 284,000:1 ratio of people to medals. Now, in a tremendous feat of athletic powerhousity, China goes berserk and wins EVERY SINGLE OTHER MEDAL, in every event in that Olympiad. That would give them 301 medals. You'd think that would close the deal and make China the undisputed Olympics giant. But wait, there's that per capita measure. China has 1,338,612,968 people. That's a 4,447,219:1 people to medals ratio. So even if China wins every single medal but the one that Barbados won, Barbados has a people to medals ratio that is nearly sixteen times better than China's. If my math or my logic are wrong, please tell me, but I just don't see a per capita comparison making much sense there.
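For anyone who wants to check the arithmetic in that scenario, here is a short Python sketch. The populations and medal counts are the ones quoted in the paragraph above, and `people_per_medal` is a hypothetical helper name.

```python
# Figures quoted in the scenario above (a hypothetical Olympiad).
barbados_population = 284_000
barbados_medals = 1
china_population = 1_338_612_968
china_medals = 301


def people_per_medal(population, medals):
    """Number of residents per medal won (lower means 'better' per capita)."""
    return population / medals


barbados_ratio = people_per_medal(barbados_population, barbados_medals)
china_ratio = people_per_medal(china_population, china_medals)

# How many times larger China's people-per-medal figure is than Barbados's.
advantage = china_ratio / barbados_ratio

print(f"Barbados: {barbados_ratio:,.0f} people per medal")
print(f"China:    {china_ratio:,.0f} people per medal")
print(f"Barbados's per capita figure is {advantage:.1f} times better")
```

With these inputs the gap works out to roughly 15.7 to 1, even though China wins 301 medals to Barbados's one.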

The US and China's sheer size is an advantage, but to me it's an advantage with diminishing returns the farther you go. If anything, I think the US's diversity of population is at least as much of an advantage. Certain kinds of people dominate sprints, certain kinds of people dominate distance running, or swimming, or weightlifting. If you look at the different athletic disciplines, you will see a pattern of domination in many of them pertaining to areas of the world with similar genetic or cultural backgrounds. We have the widest variety of people in the US, which gives us individuals who bring with them advantages in heredity or cultural background in specific sports. China's advantage is in its massive state-sponsored sports machine, but that's something different. The USA, by the way, doesn't fund its Olympic hopefuls. They are on their own, seeking sponsors where they can get them.
Saturday, July 28, 2012, 3:07:30 AM

Isaac Nelson
One more:

GDP: This does make a large difference in creating free time for training, material support, healthiness of the population, etc. Some limiting factors in giving an advantage: US GDP has allowed universities and other research institutions to create a large body of research in sports and exercise physiology. This research is available to any coach, anywhere in the world. Along those lines, small GDP nations have benefitted from coaches who learned their craft in the large GDP nations, then went to small GDP nations and built powerhouse programs. This happened in Jamaica, when a group of coaches from California university track programs organized the country's sprinters, introduced training methods from the US, which at the time was the undisputed king of track and field, and built them into the best group of sprinters in the world. From a physiological aspect, the raw athletic material was there for Jamaica, and the knowledge built in large GDP nations gave huge benefits to a small GDP nation from a medal count perspective.

There is also a large number of Olympians who train in countries with large GDPs, but compete for other countries. Milorad (Michael) Cavic, whom Phelps famously edged out by 0.01 seconds in Beijing, was born in Anaheim, swam at Tustin High and the University of California, Berkeley, trained at American facilities with American coaches, and swims this week in London for the country of Serbia, where his father was born. Other athletes go to the US on athletic scholarships (dozens from the Caribbean every year), or come from wealthy families who send them to the US to study in colleges and compete on sports teams.

There are other factors, like climate and the development of a sporting culture, but I've written enough already.

I know I've written a lot, but I appreciate you taking the time to read.
Saturday, July 28, 2012, 3:11:51 AM

Dony
Wait. The table looks weird to me. For me, it is China that was the top medal-winning nation in Beijing. You simply count the medals and ignore whether they are gold, silver, or bronze. Anyway, it is up to you to stick with your definition. But for me, China is the top medal-winning nation in Beijing.
Saturday, July 28, 2012, 3:20:00 AM

George Stevens
The Olympics listed for each athlete only include games when they won medals. The top medal winners in the Olympics are from the United States, and it's the top medal-winning team at this year's Olympic Games in London.
Monday, July 30, 2012, 6:02:54 AM

