KDnuggets : News : 2006 : n18 : item3

Features

From: Gregory Piatetsky-Shapiro
Subject: Blogs on Data Mining: Data Temp, sparklines, KDD-06, and more ...

Here are some interesting entries from Blogs on Data Mining and Analytics.

Dan E. Linstedt writes on
Temperature of Data for RDBMS, and DW 2.0

In the next generation of Database Engines, data can be HOT, Medium, Lukewarm, and Cold. For lack of a better definition, the Database engines are working on the following:

HOT = Accessed all the time, or extremely important data requiring sub-second response times. Hot data must reside in RAM continuously. ...

Medium = Data accessed most of the time, but where response times can be anywhere from 1 second to 10 or 15 seconds. In this category, aggregation analysis, strategic analysis, and some levels of master data can be requested.

Greg Linden writes about
Digg struggles with spam

There have been some interesting posts lately on how Digg is struggling with manipulation and spam. ...
These problems with Digg were predictable. Getting to the top of Digg now guarantees a flood of traffic to the featured link. With that kind of reward on the table, people will fight to win placement by any means necessary. ...

Zach (JuiceAnalytics) writes on
MicroCharts, A Different Take on Excel Charting, showing how to use "Sparklines", defined by Edward Tufte as "data-intense, design-simple, word-sized graphics". Very neat!

Stephen Few also writes about Bullet graphs and sparklines for Excel

Marcos Campos writes in the Oracle Data Mining and Analytics blog
on KDD-2006 conference, view from exhibits. Nice pictures!

Jeff Jonas writes about What is Data Mining? Depends Who You Ask ... (Sep 8, 2006).

Everyone has their own definition of data mining. My favorite is this one I heard at the ACM SIGKDD data mining and knowledge discovery conference a few weeks ago, specifically:

Data Mining, noun 1. Torturing the data until it confesses - and if you torture it enough, you can get it to confess to anything.

Here are some far less humorous definitions:

The Government Accountability Office produced the following definition for data mining:

"The application of database technology and techniques - such as statistical analysis and modeling - to uncover hidden patterns and subtle relationships in data and to infer rules that allow for the prediction of future results." ...

New blog: Sandro Saitta's Data Mining Research blog.

Sandro reviews the book "An Introduction to Support Vector Machines and Other Kernel-based Learning Methods" (Cristianini and Shawe-Taylor, 2000) and writes about "The dark side of data mining", a proposed plan to use microphones to listen to PC users.

Copyright © 2006 KDnuggets.