ACM Data Mining Camp
November 16, 2010, By Joseph Rickert
I was very happy to be a part of the ACM Data Mining Camp held last Saturday (November 13th) at eBay. It was a big day for discussing hot topics in data mining (Mahout, parallel SVMs, etc.), and also a pretty big day for R. Because Revolution Analytics was a sponsor for the camp, I got to give a three-minute company pitch and was very pleased to have people applaud my "I 'heart' R" slide.
In addition to my brief presentation, I led a session on manipulating large data sets in R that was attended by maybe 100 people. I was expecting to run some code in real time for a group of about twenty or so, but found myself instead up at the podium in the large conference room. Some adjustment was necessary for the larger audience, but I did show Revolution's RevoScaleR package running cubes (crosstabs) and regressions, plotting histograms, etc., while working directly with the 123 million row airlines data set that was used in the 2009 ASA competition.