
KDnuggets™ News 10:n06, Mar 24


 
  
Features (8) | Webcasts (1) | Software (7) | Jobs (4) | Academic (1) | Publications (4) | NewsBriefs (12) | CFP (19) | Quote

Features

  • New Poll: Is anonymization of large datasets still possible? - Mar 23, 2010.
    Recently, Netflix cancelled the 2nd Netflix Prize due to privacy concerns. With so much personal information online, do you think that it is still possible for companies like Netflix to anonymize and release large datasets (for research and competitions)?
  • Poll Results: Sports Analytics Are Useful - Mar 23, 2010.
    Sports analytics merely increase the likelihood of (yet do not guarantee) making a more educated, more informed decision.
  • KDD Cup 2010: Educational Data Mining - Mar 23, 2010.
    This year's challenge asks you to predict student performance on mathematical problems from logs of student interaction with Intelligent Tutoring Systems.
  • KDD-2010 Workshops - calls for papers - Mar 22, 2010.
    KDD-2010 will have four full-day workshops (Mining and Learning with Graphs; Large-scale Data Mining; Useful Patterns; Social Media Analytics) and five half-day ones (KDD Cup 2010; BIOKDD10; MDMKDD 2010; ADKDD'10; Human Computation).
  • Next Netflix Prize cancelled due to privacy concerns - Mar 22, 2010.
    After the FTC expressed concerns about Netflix members' privacy and a lawsuit was filed pertaining to the sequel, Netflix decided to cancel the Netflix Prize sequel.
  • Most viewed items for week Mar 14-20 - Mar 21, 2010.
    Top news: Clarabridge Self Service Text Analytics; Stanford online graduate education; Poll Results: Data Miner Salary by Region. Top jobs: Data Mining Engineer at eBay; Scientist at ID Analytics.
  • Most viewed items for week Mar 7-13 - Mar 14, 2010.
    News: Stanford University online graduate education; Poll Results: Data Miner Salary by Region; Jobs: Statistician / Data Analyst at LoopNet, San Francisco, CA
  • Predictive Analytics World: Save-the-Date and Call-for-Speakers - Mar 9, 2010.
    Save-the-date for the next PAW: Oct 19-20, 2010 in Washington DC; Speaker proposals deadline: April 16, 2010

Webcasts (see also All Webcasts)

Software (see also All Software)

  • Data Applied new data mining and visualization capabilities - Mar 23, 2010.
    Founded by ex-Microsoft engineers, the company leverages Silverlight technology and a web-based API to bring data mining within reach of any web-enabled user or application
  • 3rd Annual Rexer Analytics Data Miner Survey - Summary - Mar 19, 2010.
    The most commonly used algorithms are regression, decision trees, and cluster analysis. The top challenges facing data miners are dirty data, explaining data mining to others, and difficult access to data. Users of IBM SPSS Modeler, Statistica, and RapidMiner are the most satisfied with their software.
  • Weka User Survey - Win Amazon.com Gift Card - Mar 11, 2010.
    U. of Waikato, the home of Weka, is interested to learn more about the people who use Weka Machine Learning Software. Do this 4-minute survey and you can win a $200 Amazon gift card.
  • Two social-bookmarking studies: recommendation and tag-based ranking - Mar 22, 2010.
    The GiveALink.org project invites you to participate in two online user studies related to tag recommendation and tag-based ranking.
  • Pyriel learns classification rules which maximize AUC (open-source) - Mar 18, 2010.
    I previously published a paper "PRIE: A system for generating rulelists to maximize ROC performance" (Data Mining and Knowledge Discovery, October, 2008). I have just released Pyriel, an open-source implementation of this system.
  • Centrifuge Releases New Data Visualization Software - Mar 18, 2010.
    The Centrifuge approach to data visualization brings together three innovations in analysis: Interactive Data Visualization, Unified Data Views and Collaborative Analysis to identify important insights and hidden patterns in your data.
  • Google Public Data Explorer - Mar 11, 2010.
    The new Google Labs tool offers a visual way to look at and analyze large public data sets on a variety of popular search topics.

Jobs (see also All Jobs)

  • Analytics Engineer at Eventbrite, San Francisco, CA - Mar 23, 2010.
    We are looking for a talented software engineer with a solid background in data mining, and knowledge of Internet commerce and social networks.
  • Scientist at ID Analytics, San Diego, CA - Mar 11, 2010.
    Part of a team responsible for developing the company's advanced technologies, particularly statistical score modeling, score model development and deployment, large-scale database analysis, and statistical algorithms.
  • Senior Data Engineer at ID Analytics, San Diego, CA - Mar 11, 2010.
    Responsible for the extraction, verification, processing, cleansing, analysis, and deployment of client data, third-party data sources, and internal data sources.
  • Data Mining Engineer at eBay, San Jose, CA - Mar 10, 2010.
    Trust and Safety (TnS) applications proactively prevent fraud, catch fraud, and enforce eBay policies, as well as collect and mine data that will help build future Trust and Safety strategies. We build real-time machine learning applications that process hundreds of millions of transactions a day, learning from terabytes of historical data.

Academic/Research positions

Publications

  • Why The Next Big Thing Is, In Fact, A Really Big Thing - Mar 21, 2010.
    In my view, big data is the next big thing. I identified five net new possibilities that big data presents: 1. Answer formerly unanswerable questions; 2. New questions; ...
  • Moving On With Analytics - Mar 16, 2010.
    But about that first book, how did it hold up over time? Many speaking engagements later, Davenport sounded just a bit deflated at the overall progress, but not very much surprised.
  • We're so good at medical studies that most of them are wrong - Mar 16, 2010.
    A survey of the recent medical literature found that 95 percent of the results of observational studies on human health had failed replication when tested using a rigorous, double-blind trial. Given massive data sets and the ability to perform multiple tests, many researchers fall into the trap of finding "significant" results that are due to random chance.
  • Norman Nie: Open Source is Opening Data to Predictive Analytics - Mar 13, 2010.
    Revolutions in science have often been preceded by revolutions in measurement. Just as the microscope transformed biology by exposing germs, and the electron microscope changed physics, all these data are turning the social sciences upside down.

News Briefs

CFP - Calls for Papers (see also All CFP)

Quote

Empirical evidence is that 80-90% of the claims made by epidemiologists are false; their claims do not replicate when retested under rigorous conditions. The net effect of ignoring multiple testing is to exploit randomness.

S. Stanley Young, National Institute of Statistical Sciences

