- New Book: Data Mining for Business Applications - Dec 29, 2010.
This book contains extended versions of workshop papers from 2005 to 2008 on data mining for business applications. Areas covered include methodological issues and research challenges, typical problems, and emerging applications.
- Exploring Twitter Hashtags - Dec 29, 2010.
Using a dataset of 29 million messages, Jan Poeschko explores relations among the hashtags with respect to co-occurrences. He classifies hashtags into five intuitive classes, using a machine-learning approach.
- The Truth Wears Off - Dec 29, 2010.
Many rigorously proved scientific results start shrinking in later studies. What went wrong? (My guess: widespread data overfitting and confirmation bias.)
- Xindong Wu: 10 Years of Data Mining Research (ICDM'10 Keynote) - Dec 23, 2010.
ICDM'10 keynote reviewed past activities, discussed current achievements, and presented research challenges for the future.
- Book: Reactive Business Intelligence - Dec 22, 2010.
Combining data mining, modelling and visualization (based on the authors' Grapheur software), this book would be of interest to analytic professionals.
- Much Faster Bootstraps Using SAS® - Dec 22, 2010.
We compare 7 bootstrap algorithms in SAS; our best one is ~80x faster than the built-in SAS procedure (Proc SurveySelect).
- Wanted: Data Scientists to Turn Information Into Gold - Dec 22, 2010.
There was a 200 percent increase from 2008 to today in searches for executives with sophisticated data mining or data analytics capabilities.
- KDnuggets 10:n30, Top Stories of 2010; $3M Health Data Analysis Prize; Typical Analytics Computer - Dec 21, 2010.
Latest news on data mining & analytics, including Features (8) | Courses (1) | Software (3) | Jobs (9) | Academic (4) | Meetings (1) | AudioVideo (3) | Publications (9) | NewsBriefs (9) | CFP (11)
- The Originative Statistical Regression Models: Are They Too Old and Untenable? - Dec 20, 2010.
Statistical ordinary least squares (OLS) regression and logistic regression (LR) models are popular techniques for prediction (of a continuous dependent variable) and classification (of a categorical dependent variable), respectively.
- Poll Results: Computer Configuration for Analytics - Dec 20, 2010.
The typical analytics machine has 2 cores, 3 GB memory, 500 GB disk, and ...
- 20% off on Chapman/CRC Books on Data Mining and Knowledge Discovery - Dec 17, 2010.
Get 20% off Data Mining with R: Learning with Case Studies, Privacy-Aware Knowledge Discovery: Novel Applications and New Techniques, Handbook of Educational Data Mining, or other Chapman/CRC books purchased on the website.
- Google ngrams: In 500 Billion Words, New Window on Culture - Dec 17, 2010.
Google has made a mammoth database culled from nearly 5.2 million digitized books available to the public for free downloads and online searches, opening a new landscape of possibilities for research and education in the humanities.
- Healthcare Predictive Modeling News - Dec 17, 2010.
A monthly newsletter focusing on predictive modeling in healthcare, featuring articles, news, key data, technology developments, recent studies, and more.
- Report: Ten Database Activities Enterprises Need to Monitor - Dec 16, 2010.
Most enterprises are paying too little attention to the very real security risks associated with their databases. Auditors, security and risk professionals, and data owners need to watch for telltale behaviors that may indicate serious database security problems.
- Interview with VisiStat Analytics CEO: Jim Bennette - Dec 15, 2010.
Most companies only know what basic web analytics tools such as Google Analytics can do, but there is really so much more that can be done. One of the more interesting developments is in the sophistication of identifying and tracking of anonymous visitors and their behavior.
- Advocacy group urges a broader view of government data mining - Dec 13, 2010.
Many federal efforts that use data mining might be flying under the radar because the law requiring agencies to report on such activities applies a very narrow definition of the practice.
- Book: Mining of Massive Datasets (free download) - Dec 11, 2010.
This book was developed over several years of teaching a course on Web Mining at Stanford by A. Rajaraman (Kosmix) and J. Ullman (Stanford).
- Data mining depression - Dec 9, 2010.
Information technology could improve the prevention and treatment of depression.
- KDnuggets 10:n29, Your Analytics Setup? "Do not track" off track; 10 Jobs - Dec 8, 2010.
Latest news on data mining & analytics, including Features (6) | Courses (1) | Webcasts (1) | Software (3) | Jobs (10) | Meetings (2) | AudioVideo (1) | Publications (7) | NewsBriefs (4) | CFP (5)
- How Data Analytics and BI Professionals Used Twitter in November - Dec 7, 2010.
Spotfire blog brings news from experts on how BI "hit the road," a data mining founder and lots of buzz around social BI and mobile BI.
- The Predictive Model: Its Reliability and Validity - Dec 7, 2010.
Bruce Ratner reexamines reliability and validity so that model builders and marketing users of models know whether their predictive models have merit.
- Recent data analytics & visualization contests - Dec 6, 2010.
Popular topics include direct marketing, geo-location, navigation, and traffic prediction. Given the increasing demand for data scientists, companies will do well to reach out to top contestants.
- Opening My Mind to Open Data - Dec 6, 2010.
Open is suddenly cool because of open source. There is an ongoing movement towards open data in the UK, US, Canada, Australia, New Zealand, Ireland, Norway and even Kenya.
- Analyzing Literature by Words and Numbers - Dec 4, 2010.
The titles of every British book published in English in and around the 19th century are analyzed for key words and phrases that might offer fresh insight into the minds of the Victorians.
- Tom Breur on How to build predictive models - Dec 2, 2010.
Much has been written on building predictive models, but this summary by Tom Breur is very clear and concise.
- Life-Time Value Modeling of Big-ticket Items - Dec 1, 2010.
LTV modeling of big-ticket items raises analytic-tactical issues because the sales patterns are "spotty".