KDnuggets™ News 13:n15, Jun 19
Features (8) | Software (5) | Webcasts (3) | Courses, Events (3) | Meetings (1) | Jobs (8) | Academic (2) | Competitions (3) | Publications (4) | Tweets (6) | NewsBriefs (6) | CFP (11) | Quote
Features
- Reaction to NSA Prism Data Mining: Outrage and Shrug - Jun 17, 2013.
The KDnuggets Poll on the NSA PRISM program - collecting and data mining huge amounts of internet data - produced an intense debate and strong reaction, with 66% of European and 56% of US/Canada voters expressing outrage. The rest of the world was mostly not surprised.
- Exclusive: Interview with Dan Steinberg, President of Salford Systems, Data Mining Pioneer - Jun 15, 2013.
My exclusive interview with Dan Steinberg, on CART, MARS, RandomForests, working with Leo Breiman, the origin of Salford Systems, CART vs C4.5, winning competitions, and more. If you miss one revolution, you only need to wait a bit to catch the next one!
- Exclusive: Part 2 of my Interview with Dan Steinberg, President of Salford Systems, Data Mining pioneer - Jun 15, 2013.
Part 2 of my exclusive interview with Dan Steinberg, on CART, MARS, RandomForests, working with Leo Breiman, winning competitions, Big Data, and advice to aspiring data scientists. If you missed one revolution, wait a bit to catch the next one!
- PAW Boston, Join Analytics Experts, Sep 29 - Oct 3 - Jun 18, 2013.
Join Predictive Analytics experts from leading organizations at PAW Boston, Sep 29 - Oct 3. Keynotes from hot analytics speakers, case studies from top organizations, and workshops for all levels. Best rates thru June 22 and special KDnuggets discount.
- Import.io easy visual download and import web data - Jun 9, 2013.
The Web was designed for documents, not for data, and Import.io wants to remedy this. I spoke to the Import.io founder about what they do, and how Import.io lets you download web data in an easy and visual way.
- Lavastorm and Forrester: Build an Agile BI Organization - Jun 11, 2013.
Learn how to Build an Agile BI Organization with this report from Forrester Research, brought to you by Lavastorm, and see 3 key points about Agile BI.
- Top news for Jun 9-15: RapidMiner and R vie for first place; Simple Decision Tree Excel Add-in; Import.io for web data - Jun 16, 2013.
KDnuggets Annual Software Poll: RapidMiner and R vie for first place; Simple Decision Tree Excel Add-in; Import.io for easy visual web data download
Top jobs: PhD position on Machine Learning for Gait Analysis in Geneva; Data Scientist at Civis Analytics, Chicago
- Top news for Jun 2-8: KDnuggets Poll: RapidMiner and R vie for first place; Very Fast Sampling for Big Data - Jun 9, 2013.
KDnuggets Annual Software Poll: RapidMiner and R vie for first place; Very Fast Sampling Algorithms for Big Data; 42 Big Data Startups
Top jobs: Credit Risk Manager at Axcess Financial; Doctoral scholarship in machine learning at TU-Darmstadt-DIPF
Software
- The Risks of Using Spreadsheets for Statistical Analysis - Jun 13, 2013.
Are spreadsheets more hindrance than help in data analysis? Download "The Risks of Using Spreadsheets for Statistical Analysis" white paper and discover how predictive analytics can produce faster, more accurate results.
- AbsolutData, a midsize competitor in Analytics Services market - Jun 12, 2013.
I talked to AbsolutData CEO Dr. Anil Kaul about their Analytics Services, what makes them different from the competition, and current trends in the analytics marketplace.
- Oracle BigDataLite for getting started with Big Data - Jun 11, 2013.
Oracle BigDataLite Virtual Machine provides an integrated environment to help you get started with the Oracle Big Data platform. This appliance is for testing and educational purposes only.
- Simple Decision Tree Excel Add-in - Jun 9, 2013.
Simple Decision Tree is an open-source Excel add-in, which has been extensively used to teach Decision Analysis at Stanford University.
- Very Fast Sampling Algorithms for Big Data - Jun 5, 2013.
New sampling, bootstrap, and permutation test algorithms which are orders of magnitude faster than built-in SAS Procs, Stata, and MATLAB.
Webcasts
- Upcoming Webcasts on Analytics, Big Data, Data Mining - Jun 18, 2013.
Coming soon: Webcasts on Social Data, Productionizing Hadoop, High-Performance Data Mining, Big Data Analytics, Customer Centricity, and more.
- Webinar, June 20: High-Performance Data Mining Using SAS Enterprise Miner - Jun 13, 2013.
SAS Enterprise Miner streamlines the data mining process to create highly accurate predictive and descriptive models. We explore the new high-performance data mining functionality added to SAS Enterprise Miner 12.3.
- Webinar: Data Mining: Failure to Launch [July 18] - Jun 16, 2013.
Learn how to get started with predictive modeling and overcome strategic and tactical limitations that cause data mining projects to fall short of their potential. Next webinar is July 18.
Courses, Events
- Web course: Advanced Analytics for the Modern Business Analyst - Jun 18, 2013.
Taught over a five-week period, this course combines scheduled, instructor-led Live Web sessions with assignments, hands-on exercises, and online discussion forums. Next Live web session starts July 15.
- TMA Courses in Data Analytics [Aug: San Jose, Sep: Washington, DC] - Jun 16, 2013.
Get up to speed in data mining faster and more effectively than with any other training program available. Next courses in San Jose, CA and Washington, DC.
- Brazilian Summer School on Machine Learning and Knowledge Discovery in Databases (MLKDD) - Jun 14, 2013.
The summer school will provide short courses in current research issues in machine learning and data mining from leading researchers. Register early - enrollment is limited!
Meetings
- KNIME User Day, London, UK, June 25 - Jun 11, 2013.
KNIME is the open-source data analytics platform, with a modern data-flow oriented UI which supports data preparation, analytics, data mining, sentiment and network analysis, and more. First UK user day is June 25 in London. Attendance is free but limited.
Jobs
- Data Scientists (all levels), Amazon Consumer Analytics at Amazon, Seattle, WA - Jun 18, 2013.
This is one of the most exciting machine learning job opportunities today. If you have deep machine learning knowledge and know how to deliver highly innovative solutions to challenging problems, we want to talk to you.
- Data Analyst at AllOut, New York, NY - Jun 17, 2013.
All Out is mobilizing millions of people and their social networks to build a powerful global movement for love and equality. The Data Analyst will work on improving campaigning using scientific measurement and the right metrics.
- Data Scientist at Civis Analytics, Chicago, IL - Jun 13, 2013.
Join the team that built Obama for America Analytics. Work closely and collaboratively with analysts and engineers to identify, quantify and solve big, meaningful problems.
- Data Scientist at Pioneer, Johnston, IA - Jun 8, 2013.
Expert on machine learning and data mining to join a new team using big data analytic methods for creation and delivery of innovative, customer-focused agronomic services.
- Sr Contract Monitoring Analyst-Medicare Part D at Blue Cross Blue Shield of Rhode Island, Providence, RI - Jun 6, 2013.
Conduct pharmacy claim and data analyses to monitor Medicare Part D obligations, claim adjudication accuracy and ensure CMS requirements are being met, extract and analyze pharmacy claims.
- Sr. Capability Implementation Manager, Machine Learning at Educational Testing Service (ETS), Princeton, NJ - Jun 6, 2013.
Update frameworks for ETS capabilities based on recent developments in machine learning, lead design and implementation processes for transition of research to operations.
- Staff Data Scientist, Forecasting and Demand Analysis at Chegg, Santa Clara, CA - Jun 5, 2013.
Your goal - to improve the education process and better the lives of students, through data and Analytics.
- Who dominates analytics job market? - Jun 10, 2013.
Peter Bruce reexamines who dominates the analytics job market and comes up with a different answer. His analysis shows 1.92 SAS jobs for every R job, but SQL and Java are the skills most in demand.
Academic/Research positions
- Postdoc position in data-mining / chemoinformatics at GREYC, University of Caen, France, Caen, Normandy, France - Jun 16, 2013.
A 12-month post-doctoral position is open for studying computational methods for chemoinformatics, as part of our larger activity on designing efficient SAR methodologies.
- PhD position on Machine Learning for Gait Analysis at U. Geneva and HUG-GE, Geneva, Switzerland - Jun 12, 2013.
Looking for an excellent PhD candidate to work on the development of new machine learning methods for the modelling of pathological gait to better understand gait deviations and contribute to improve treatment strategies. Apply by July 10.
Competitions
- VAST Challenge 2013: Predicting movie box office openings and movie ratings - Jun 14, 2013.
This challenge focuses on predicting movie box-office openings and movie ratings. Submission deadline is July 8, 2013.
- Academics Data Mining Video Contest - Jun 12, 2013.
Help us educate future data scientists - submit a video (10 mins or less) showing how you are using the Salford SPM software suite on real-world data. Registrants have free access to software for 6 months.
- NineSigma/Philips: Algorithms to Recognize Faces, Simulate Facial Hair - Jun 9, 2013.
Requesting proposals for algorithms that isolate an image of a person and digitally simulate different facial hairstyles on the face. Submissions due July 3.
Publications
- CRCPress: New Books on Business Analytics, Graph Mining, Data Science - Jun 7, 2013.
Latest books from CRC Press on important and relevant topics - Business Analytics, Intelligent Data Analysis, Graph Mining with R, and Data-Intensive Science. Special discount for KDnuggets readers.
- LIONBook: free online book on Machine Learning and Intelligent Optimization - Jun 18, 2013.
This book, from the developers of LionSolver, will be freely available on the web, chapter by chapter. Learning and Intelligent Optimization (LION) is the combination of learning from data and optimization applied to solve complex and dynamic problems.
- Big Data Journal: Broad Data, Apache Drill, and more Fast Track Articles - Jun 17, 2013.
Big Data is the first peer-reviewed journal to address challenges and opportunities in making substantial amounts of information work to benefit society, industry, academia, and government. Check these new articles on Fast Track.
- Security Informatics most viewed articles - open access - Jun 10, 2013.
Knowledge encapsulation framework for technosocial predictive modeling, Predicting sentencing outcomes, Detecting unknown malicious code, and more open access articles from Security informatics.
Top Tweets
- Top KDnuggets tweets, Jun 14-16: NSA Announces Free Data Storage and Backup Service :); Top 100 R packages for 2013 - Jun 17, 2013.
NSA Announces Free Data Storage and Backup Service for Americans; Top 100 R packages for 2013, their family trees; Most popular R packages; Ex-Facebook employees launch data visualization startup Polychart
- Top KDnuggets tweets, Jun 12-13: Practical Data Science with R ebook; Rmagic interface bridging Python and R - Jun 14, 2013.
Practical Data Science with R ebook, from basic principles to real cases; Rmagic, a handy interface bridging Python and R - ipython extension; Deep Learning comes of age - an overview of the Machine Learning breakthrough; Baidu catches up to Google with new Visual search powered by deep learning
- Top KDnuggets tweets, Jun 10-11: Tutorial on Deep Learning, a key breakthrough in ML; Random Forests in Python - Jun 12, 2013.
Tutorial on Deep Learning, a key breakthrough in Machine Learning; Random Forests in Python; Walmart Labs buys Inkiru, a Data Analytics, Predictive Intelligence Startup; Great for learning! Oracle BigDataLite Virtual Machine
- Top KDnuggets tweets, June 7-9: Statisticians are the modern explorers; Import.io, an easy and visual way to import data from the web - Jun 10, 2013.
Statisticians are the modern explorers, says Prof. David J. Hand; Import.io gives you an easy and visual way to download and import web data; Simple Decision Tree is an open-source Excel Add-in; The Right Skillset for #BigData Jobs - top terms in job ads
- Top KDnuggets tweets, June 5-6: Very fast sampling algorithms for Big Data; Do You Have What It Takes To be a Data scientist? - Jun 7, 2013.
Very fast sampling algorithms for Big Data; Jigsaw Academy 10 question test: Do You Have What It Takes To Be a Data Scientist?; The CIA In-Q-Tel arm invests in Narrative Science for its AI, analytics technology; NSA gets ALL Verizon call logs (who calls whom), but NO content.
- Top KDnuggets tweets, June 3-4: 2 Great Machine Learning ideas combined; Upcoming Meetings, Conferences on Analytics, Big Data, Data Mining - Jun 5, 2013.
2 great machine learning ideas combined! Deep Learning can be improved using SVM; Upcoming Meetings, Conferences on Analytics, Big Data, Data Mining, and KDD; Twitter analysis quantifies Geek vs Nerd; Upcoming Webinars and Courses on Analytics, Data Mining, and Data Science
News Briefs
- Heritage Health 500K Prize awarded; HHP Prize 2 will be $3M "masters" competition with real, not anonymized data - Jun 14, 2013.
POWERDOT, a team of former rivals, won $500K for 1st place in Heritage Health Prize. As a followup, HPN is launching a $3 million private "masters" competition, open to the top finishers from the first prize. The second prize will use actual health care data, with little or no anonymization.
- Angoss acquired by Peterson Partners for $8.4 million - Jun 12, 2013.
Long-time analytics and data mining software developer Angoss Software acquired by a Utah-based private equity firm.
- Hitachi Global Center for Innovative Analytics (HGC-IA) - Jun 8, 2013.
Hitachi Global Center for Innovative Analytics (HGC-IA) will link Hitachi Labs and Business Units in the US, Europe, and Asia and will focus on Big Data and Analytics solutions in Health Care, Communications and Media, Energy, Transportation, and Mining.
- PRISM: NSA data mining 9 leading Internet firms - Jun 7, 2013.
The Washington Post and the Guardian revealed a huge NSA data mining program that gets information directly from the servers of nine leading US internet companies: Microsoft, Yahoo, Google, Facebook, PalTalk, AOL, Skype, YouTube, and Apple.
- Data Tamer startup from Michael Stonebraker, Still in Stealth Mode - Jun 6, 2013.
Data-Tamer is the latest(?) startup founded by legendary DB researcher and serial entrepreneur Michael Stonebraker. It is in stealth mode and says only that it plans to "enable organizations to broadly integrate and curate many existing and future data sources efficiently at scale".
- Automated Data Scientist? IBM SPSS Analytic Catalyst - Jun 6, 2013.
IBM SPSS Analytic Catalyst is designed to allow non-statisticians to discover statistically interesting results in big data. It is a potentially significant milestone in making data analysis more widely available.
CFP - Calls for Papers
- ICDM'13: IEEE International Conference on Data Mining, due Jun 21
- IEEE BigData 2013: 2013 IEEE Int. Conf. on Big Data, due Jun 23
- ACM SIGSPATIAL 2013: ACM SIGSPATIAL Int. Conf. on Advances in Geographic Information Systems, due Jun 24
- DMoLD'13: Data Mining on Linked Data workshop, challenge, due Jun 28
- CI 2013: Climate Informatics, due Jul 8
- RepSys2013: Reproducibility and Replication in Recommender Systems Evaluation, due Jul 22
- DaMNet 2013: Data Mining in Networks, due Aug 3
- DMPE: Data Mining in Pervasive Environments, due Sep 15
- HEALTHINF 2014: Health Informatics, due Sep 19
- SDM'14: SIAM International Conference On Data Mining, due Oct 6
- AISTATS 2014: 17th Int. Conf. on Artificial Intelligence and Statistics, due Nov 1
Quote
"I can't express how infuriated I am that my credit history, phone activity, and online browsing habits are being systematically collected and archived without my knowledge by undisclosed organizations that aren't trying to sell me products," said the visibly disturbed man,