KDnuggets Home » News » 2012 » May » Publications » Data Science Global Hackathon Report: Incompetence borne of excessive cleverness  ( < Prev | 12:n11 | Next > )

Data Science Global Hackathon Report:
Incompetence borne of excessive cleverness


 
  
Derek Jones reports on the Data Science Global Hackathon: what his team did with the air-quality training dataset and what they learned from their mistakes.


April 30th, 2012, Derek Jones

I have just got back from the 24-hour Data Science Global Hackathon; I was an on-site participant at Hub Westminster in London (thanks to Carlos and his team for doing such a great job looking after us all; around 50 turned up from the 100 who registered, and the percentage was similar in other cities around the world). Participants had to be registered by 11:00 UTC and self-form into 3-5 person teams ready for the start at 12:00 UTC, finishing 24 hours later. The worldwide event had been organized by our London hosts, who told us they expected the winning team to come from those in the room; Team Outliers (Wang, Jonny, Kannappan, Bob, Simon, yours truly and Fran for the afternoon) started in an optimistic mood.

At 12:00 an air-quality training dataset and a set of test points were made available, and teams were given the opportunity to submit eight predictions in each of the two 12-hour time periods. The online submissions were evaluated by Kaggle (one of the sponsors, along with EMC) to produce a mean estimated error that was used to rank teams.
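
For readers unfamiliar with this style of scoring, here is a minimal sketch of how ranking by a mean error can work. It assumes the error is the mean absolute difference between predicted and actual values; the report does not spell out the exact metric. The team names, readings and the helper functions `mean_absolute_error` and `rank_teams` are all made up for illustration and are not Kaggle's code.

```python
from statistics import mean


def mean_absolute_error(actual, predicted):
    """Average absolute difference between paired actual and predicted values."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    return mean(abs(a - p) for a, p in zip(actual, predicted))


def rank_teams(actual, submissions):
    """Return (team, error) pairs sorted from lowest error to highest."""
    scores = {
        team: mean_absolute_error(actual, predictions)
        for team, predictions in submissions.items()
    }
    return sorted(scores.items(), key=lambda item: item[1])


# Hypothetical held-back readings and two hypothetical submissions.
actual = [10.0, 12.0, 9.0, 15.0]
submissions = {
    "team_a": [11.0, 12.0, 8.0, 14.0],
    "team_b": [10.0, 16.0, 9.0, 11.0],
}

leaderboard = rank_teams(actual, submissions)
for position, (team, error) in enumerate(leaderboard, start=1):
    print(position, team, error)
```

With only a handful of submissions allowed per 12-hour period, each submission is a scarce chance to see how a model performs on the held-back test points.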

The day before the event I had seen a press release saying that the task would involve air quality, and a quick trawl of the Internet threw up just the R package I needed, OpenAir; I also read a couple of Wikipedia articles on air pollution.

Team Outliers individually spent the first hour becoming familiar with the data and then had a get-together to discuss ideas. Since I had a marker pen, was sitting next to a whiteboard and was the only person with some gray hair, I attempted to manage the herding of the data science cats. I later went on to plot the pollution monitor sites on Google Maps and to produce some visually impressive wind roses (these did not contribute anything towards producing a better solution, but if we had had a client they could have been used to give the impression we were doing something useful).


