(By Joseph Rickert, Revolution Analytics) In San Jose, topics like big data, map reduce, predictive models, mobile analytics and crowdsourcing draw a crowd even on a Saturday. So it turned out that the ACM Data Mining Camp and "un-conference" was a very "happening" way to spend a Saturday. Over 500 people attended the event at the eBay "Town Hall" on North First Street, and a good number stayed for the entire eleven-hour day (the food was pretty good).
The day started off with Mike Bowles delivering a very accessible two-hour lecture on using map reduce for machine learning that he and Patricia Hoffman abstracted from classes they teach at the Hacker Dojo and elsewhere. This was the only part of the day for which there was a fee ($35 that went to the ACM); the rest of the day was free (including the food).
In his lecture, Mike went through a number of popular data mining algorithms (Canopy Clustering, k-means, OLS, support vector machines, etc.), all examples of a class of models called the Statistical Query Model, and showed how they may be implemented as map reduce algorithms for Hadoop.
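To give a flavor of why models in this class fit map reduce so well, here is a minimal sketch of OLS written as a map step and a reduce step. It is not taken from Mike's lecture: it assumes a single predictor plus an intercept, runs in-process in plain Python rather than on Hadoop, and the names `map_shard`, `combine` and `fit_line` are made up for illustration. The idea is that each mapper boils its shard of data down to a handful of sums, and because those sums simply add, the reducer can merge them in any order before the final fit.

```python
from functools import reduce


def map_shard(shard):
    """Map step (hypothetical helper): summarize one shard of (x, y) pairs
    as the sums that OLS needs: n, sum x, sum y, sum x^2, sum x*y."""
    n = len(shard)
    sx = sum(x for x, _ in shard)
    sy = sum(y for _, y in shard)
    sxx = sum(x * x for x, _ in shard)
    sxy = sum(x * y for x, y in shard)
    return (n, sx, sy, sxx, sxy)


def combine(a, b):
    """Reduce step (hypothetical helper): the per-shard sums add element-wise."""
    return tuple(u + v for u, v in zip(a, b))


def fit_line(shards):
    """Fit y = slope * x + intercept from the combined sums."""
    n, sx, sy, sxx, sxy = reduce(combine, map(map_shard, shards))
    slope = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    intercept = (sy - slope * sx) / n
    return slope, intercept


# Toy data lying exactly on y = 2x + 1, split across two "nodes".
shards = [
    [(0.0, 1.0), (1.0, 3.0)],
    [(2.0, 5.0), (3.0, 7.0)],
]
slope, intercept = fit_line(shards)
print(slope, intercept)
```

On a real cluster the map step would run where each shard lives and only the five numbers per shard would travel over the network, which is what makes this kind of model cheap to distribute.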
Michael Reece, VP of Modeling and Optimization at Quantcast, gave the keynote address: Machine Learning on Big Data for personalized Internet Advertising. This was a dynamic, high-energy talk in which Michael moved seamlessly between the business of Internet advertising, quantitative techniques and insider observations like:
- "Most ads are being shown to the wrong person ... The good news is that the glass is 1% full"
- "When you set up a wish list, you will have someone advertising it to you until you buy"
- "When you get a free credit rating, your credit score will be stapled to your cookie. Delete your cookie immediately!"