Analytic / Data Mining Blog Highlights

Interesting observations from recent blogs ... Google is now personalizing web search results for everyone who uses Google, whether logged in or not

Personalized search for all at Google, Greg Linden

As has been widely reported, Google is now personalizing web search results for everyone who uses Google, whether logged in or not.

Business Analytics vs. Business Intelligence, Dean Abbott

I used to be one who thought the term "data mining" would stay as the description of the kind of analytic work I do. To a large degree it has, but there are always new spins on things, and quite often in the business world, Predictive Analytics or Business Analytics are the terms of the day.

Interview with Donald Farmer, Microsoft, by Ajay Ohri

AI Safety, by John Langford

Dan Reeves introduced me to Michael Vassar, who ran the Singularity Summit and educated me a bit on the subject of AI safety, for which the Singularity Institute has small grants.
I still believe that interstellar space travel is necessary for long-term civilization survival, and that AI is necessary for interstellar space travel. On these grounds alone, we could judge that developing AI is much safer than not. Nevertheless, there is a basic reasonable fear, as expressed by some commenters, that AI could go bad.

When sharing isn't a good idea, Tim Manns

Ensemble models seem to be all the buzz at the moment. The Netflix Prize was won by a conglomerate of various models and approaches that each excelled in subsets of the data.
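
To make the ensemble idea concrete, here is a minimal sketch of blending two models by averaging their predictions. Everything in it is an assumption for illustration: the ratings are made up, model_a and model_b are hypothetical, and an equal-weight average is only the simplest possible blend, not the method used by the Netflix Prize winners.

```python
from math import sqrt


def rmse(predictions, targets):
    """Root mean squared error between two equal-length sequences."""
    squared = [(p - t) ** 2 for p, t in zip(predictions, targets)]
    return sqrt(sum(squared) / len(targets))


def blend(prediction_lists, weights):
    """Weighted average of several models' predictions, item by item."""
    total = sum(weights)
    return [
        sum(w * p for w, p in zip(weights, row)) / total
        for row in zip(*prediction_lists)
    ]


# Made-up ratings for four items (illustrative data only).
targets = [4.0, 3.0, 5.0, 2.0]

# Hypothetical model A is accurate on the first two items only.
model_a = [4.1, 3.1, 4.0, 3.0]
# Hypothetical model B is accurate on the last two items only.
model_b = [3.0, 4.0, 4.9, 2.1]

# Equal-weight blend of the two models.
blended = blend([model_a, model_b], [0.5, 0.5])

print("model A RMSE:", round(rmse(model_a, targets), 3))
print("model B RMSE:", round(rmse(model_b, targets), 3))
print("blend RMSE:  ", round(rmse(blended, targets), 3))
```

On this toy data the blend has a lower error than either model alone, because each model's large errors fall on items where the other is close. Real ensembles usually learn the weights on held-out data instead of fixing them at 0.5.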

More on the Statistical Revolution - a SAS Story (continued), by Steve Miller

Health and Status Monitoring, by Robert Grossman

As anyone who has investigated potential data quality problems knows, identifying root causes of potential problems is not easy. Shewhart also introduced a four-step approach to these types of investigations that became known as the Shewhart Cycle, the Deming Cycle or the Plan-Do-Check-Act Cycle.

Google Analytics launches asynchronous tracking

The new tracking snippet loads faster and offers greater accuracy.

News on R Commercial Development - Rattle - R Data Mining Tool, Ajay Ohri

Depth and Discovery: Powering Visualizations with the Google Analytics API, by Chris Gemignani

We were approached by the Google Analytics API team to explore new ways of looking at data with the API, and we were excited by the possibilities. We've been working on our own visualization framework, JuiceKit, that integrates the power of the Flare Visualization Library with Adobe Flex.

Guest Post: Dominic Pouzin from Data Applied

As data miners, we are intimately familiar with data manipulation algorithms and statistics. We enjoy tweaking predictive models to reduce classification error rates by a mere 2%. We understand the principles behind stochastic gradient boosting or the Bayesian information criterion. But the reality is that it's a pretty lonely place.