Personalized search for all at Google, by Greg Linden
As has been widely reported, Google is now personalizing web search results for everyone who uses Google, whether logged in or not.
Business Analytics vs. Business Intelligence, by Dean Abbott
I used to be one who thought the term "data mining" would stay as the description of the kind of analytic work I do. To a large degree it has, but there are always new spins on things, and it seems that quite often in the business world, Predictive Analytics or Business Analytics are the terms of the day.
Interview with Donald Farmer, Microsoft, by Ajay Ohri
AI Safety, by John Langford
Dan Reeves introduced me to Michael Vassar, who ran the Singularity Summit and educated me a bit on the subject of AI safety, for which the Singularity Institute has small grants.
I still believe that interstellar space travel is necessary for long-term civilization survival, and that AI is necessary for interstellar space travel. On these grounds alone, we could judge that developing AI is much safer than not. Nevertheless, there is a basic reasonable fear, as expressed by some commenters, that AI could go bad.
When sharing isn't a good idea, by Tim Manns
Ensemble models seem to be all the buzz at the moment. The Netflix Prize was won by a conglomerate of various models and approaches that each excelled in subsets of the data.
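To make the ensemble idea concrete, here is a minimal sketch of blending two models by weighted averaging. The toy ratings, the two "models", and the equal weights are all made up for illustration. This is not the method used by the Netflix Prize winners, only the simplest form of combining predictors that are each strong on a different subset of the data.

```python
from statistics import mean


def rmse(predictions, targets):
    """Root mean squared error between two equal-length sequences."""
    return mean((p - t) ** 2 for p, t in zip(predictions, targets)) ** 0.5


def blend(prediction_sets, weights):
    """Weighted average of several models' predictions, item by item."""
    total = sum(weights)
    return [
        sum(w * p for w, p in zip(weights, row)) / total
        for row in zip(*prediction_sets)
    ]


# Hypothetical data: true ratings and two models' predictions.
targets = [4.0, 2.0, 5.0, 1.0]
model_a = [4.0, 2.0, 4.0, 2.0]  # exact on the first two items only
model_b = [3.0, 3.0, 5.0, 1.0]  # exact on the last two items only

blended = blend([model_a, model_b], [0.5, 0.5])

print("model A RMSE:", rmse(model_a, targets))
print("model B RMSE:", rmse(model_b, targets))
print("blend RMSE:  ", rmse(blended, targets))
```

In this toy case each model alone has an RMSE of about 0.71, while the equal-weight blend comes in at 0.5, because each model's errors are halved by the other's correct answers.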
More on the Statistical Revolution - a SAS Story (continued), by Steve Miller
Health and Status Monitoring, by Robert Grossman
As anyone who has investigated potential data quality problems knows, identifying root causes of potential problems is not easy. Shewhart also introduced a four-step approach to these types of investigations that became known as the Shewhart Cycle, the Deming Cycle, or the Plan-Do-Check-Act Cycle:
The new tracking snippet loads faster and offers greater accuracy.
Depth and Discovery: Powering Visualizations with the Google Analytics API, by Chris Gemignani
We were approached by the Google Analytics API team to explore new ways of looking at data with the API, and we were excited by the possibilities. We've been working on our own visualization framework, JuiceKit, which integrates the power of the Flare Visualization Library with Adobe Flex.
As Data Miners, we are intimately familiar with data manipulation algorithms and statistics. We enjoy tweaking predictive models to reduce classification error rates by a mere 2%. We understand the principles behind stochastic gradient boosting or the Bayesian information criterion. But the reality is that it's a pretty lonely place.