KDnuggets : News : 2006 : n13 : item4 < PREVIOUS | NEXT >

Features

From: Gregory Piatetsky-Shapiro
Subject: Nuggets from Data mining blogs

Greg Linden writes about Yahoo building a Google FS clone?

The Hadoop open source project is building a clone of the powerful Google cluster tools, Google File System and MapReduce.


I added a chart showing how many blogs on Technorati mention data mining.

See www.kdnuggets.com/websites/blogs.html

The chart shows a peak in mid-May, shortly after USA Today broke a story on NSA data mining on May 11, 2006.


John Taylor offers Top 10 Excuses for not automating decisions


Marcos Campos writes on High Performance Scoring with Oracle Data Mining

A recent white paper at the Oracle Data Mining website describes how Oracle Data Mining can scale to score millions of records with modest off-the-shelf hardware. The paper shows some results that complement those in a paper presented at VLDB last year. This type of capability is what makes real-time scoring possible, as described in this series of posts.


Matt Cutts writes about Bot Obedience: Herding Googlebot

At a site or directory level, I recommend an .htaccess file to add password protection to part of a domain. I wrote a quick example of setting up an .htaccess file about this time last year. I’m not aware of any bot (including Googlebot) that guesses passwords, so this is quite effective at keeping content out of search engines.

At a site or directory level, I also recommend a robots.txt file. Google provides a simple robots.txt checking tool to test out files before putting them live.
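As a rough local counterpart to testing a robots.txt file before putting it live, the sketch below uses Python's standard-library urllib.robotparser to check which URLs a given crawler may fetch. The robots.txt content, the example.com URLs, and the bot name OtherBot are hypothetical, chosen only for illustration. This is not the Google checking tool mentioned above, and a real crawler may interpret some rules differently from this parser.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, for illustration only.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /private/

User-agent: *
Disallow: /tmp/
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text held in memory, without any network access."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# (user agent, URL) pairs to test; all names and URLs are made up.
checks = [
    ("Googlebot", "https://example.com/private/report.html"),
    ("Googlebot", "https://example.com/index.html"),
    ("OtherBot", "https://example.com/tmp/cache.html"),
    ("OtherBot", "https://example.com/index.html"),
]

for agent, url in checks:
    verdict = "allowed" if parser.can_fetch(agent, url) else "blocked"
    print(f"{agent:10s} {url} -> {verdict}")
```

Keep in mind that robots.txt is only a request to well-behaved crawlers. Content that must stay out of search engines is better served by the password protection described above.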



Copyright © 2006 KDnuggets.