KDnuggets™ News 13:n24, Oct 8
Features (10) | Software (3) | Webcasts (4) | Courses, Events (4) | Meetings (2) | Jobs (7) | Academic (4) | Competitions (1) | Publications (4) | Tweets (5) | NewsBriefs (8) | CFP (6) | Quote
Features
- Big Data does not mean we have ALL the data - Oct 7, 2013.
Does Big Data imply "You have collected all there is - all the data there is about a phenomenon"? I strongly disagree with this quote from Viktor Mayer-Schoenberger and Kenneth Cukier's book on Big Data - here is my letter to the editor.
- Statistical Modeling: The Two Cultures, by Leo Breiman - Oct 1, 2013.
There are two cultures in the use of statistical modeling to reach conclusions from data. One assumes that the data are generated by a given stochastic data model. The other uses algorithmic models and treats the data mechanism as unknown - read the full paper.
- SDSC: Supercomputer Data Mining in San Diego - Sep 24, 2013.
I talk with Natasha Balac, Director of Predictive Analytics at San Diego Supercomputer Center, about supercomputer data mining, Gordon, Hadoop, Data Mining Boot Camps, the distinction between Data Science and Data Mining, Big Data hype, and more.
- Data Mining and Analysis: Fundamental Concepts and Algorithms, free PDF download (draft) - Sep 27, 2013.
New book by Mohammed Zaki and Wagner Meira Jr is a great option for teaching a course in data mining or data science. It covers both fundamental and advanced data mining topics, emphasizing the mathematical foundations and the algorithms, includes exercises for each chapter, and provides data, slides and other supplementary material on the companion website.
- To Hadoop or Not to Hadoop? - Oct 4, 2013.
Hadoop is very popular, but is not a solution for all Big Data cases. Here are the questions to ask to determine if Hadoop is right for your problem.
- U. Chicago MS in Analytics - Information Sessions, Oct 9 and 16. - Oct 1, 2013.
U. of Chicago Graham School Master of Science in Analytics - join us for an Information Session, Oct 9 online, Oct 16 at Gleacher Center.
- Top news for Sep 29 - Oct 5: Fundamental Concepts and Algorithms, free book (draft) download; Statistical modeling: 2 cultures - Oct 6, 2013.
Data Mining and Analysis: Fundamental Concepts and Algorithms, free PDF download (draft); Statistical Modeling: The Two Cultures, by Leo Breiman; To Hadoop or Not to Hadoop?
Top jobs: Sr. Data Mining Analyst at Genworth Financial, Richmond, VA; Data Mining Scientist at Apple, Austin, TX
- Top news, jobs in September: Cartoon: Next Trend after Big Data; Poll results: Has Big Data Reached the Hype Peak? - Oct 1, 2013.
KDnuggets Cartoon: Next Trend after Big Data; New Poll: Has Big Data Reached the Hype Peak and is due for Decline and Disillusionment?; edX: Learning from Data, free online course
Top jobs: Data Mining Scientist at Apple, Austin, TX; Machine Learning Scientists at Amazon, Bangalore, India
- Additions to KDnuggets Directory in September - Oct 1, 2013.
Luminoso, Quadbase, Skyttle, Vitria, Zoomdata, and more companies, datasets, education, Big Data and Analytics meetings, software, and solutions added to KDnuggets.
- Top news for Sep 22-28: Cartoon: Next Trend after Big Data; Data Mining and Analysis, free book (draft) download - Sep 29, 2013.
KDnuggets Cartoon: Next Trend after Big Data; Data Mining and Analysis: Fundamental Concepts and Algorithms, free PDF download (draft); edX: Learning from Data, free online course
Top jobs: Data Analytics and Optimization at ExxonMobil; Data Scientist at dunnhumby
Software
- Rexer Analytics 2013 Data Miner Survey Highlights - Oct 5, 2013.
The top 5 most used tools were R (used by 70% of data miners), IBM SPSS Statistics, RapidMiner, SAS, and Weka, while STATISTICA, KNIME, SAS JMP, IBM SPSS Modeler, and RapidMiner had the highest satisfaction. Big Data is actually used in only a small fraction of projects.
- DataDealer: Serious Gaming with Data Privacy - Oct 7, 2013.
Can we educate people about privacy via gaming? DataDealer is an award-winning new game, where a consumer manages the privacy of other people and organizations.
- IMARS: IBM Multimedia Analysis and Retrieval System - Oct 1, 2013.
IMARS is a desktop system for automatic indexing, classification, and searching of large collections of digital images and videos.
Webcasts
- Upcoming October Webcasts on Analytics, Big Data, Data Mining, Data Science - Oct 2, 2013.
Check out A Primer on Predictive Analytics for Business, Data Discovery Platforms, Data Mining: Failure to Launch, and Data Science: Not Just for Big Data - with Gregory Piatetsky and David Smith.
- Webcast, a Primer on Predictive Analytics for Business, Oct 8 - Oct 7, 2013.
Watch the webinar A Primer on Predictive Analytics for Business (Oct 8) and learn how the affordable CUNY Online MS in Data Analytics can help you excel in data science.
- Modeling Analytics Dream Teams, on-demand webinar - Sep 30, 2013.
Watch this on-demand webinar with top data scientists John Elder and Dean Abbott to learn how an organization can use analytics talent successfully to become an analytics competitor.
- Webcast - Analytically Speaking featuring John Pullinger - Sep 30, 2013.
We can't imagine a more esteemed thought leader to talk about statistics than John Pullinger, the President of the Royal Statistical Society. Listen to his webcast on Nov 13.
Courses, Events
- Q4 2013 Courses on Predictive Analytics, Big Data, Data Mining, and Data Science - Oct 2, 2013.
Many great courses, including Text Analytics and Sentiment Mining, Data Mining: Principles and Best Practices, Supercomputer Data Mining Boot Camp, Survival Analysis, Net lift (Uplift) models, Machine Learning, and Predictive Analytics and Data Mining Model Development and Strategic Implementation.
- edX: Learning from Data, free online course - Sep 27, 2013.
Introductory Machine Learning course covering theory, algorithms and applications, with a focus on real understanding, taught by top Caltech Professor Yaser Abu-Mostafa, starts Sep 30.
- Analytics training you wish you got in Biz School - Sep 24, 2013.
Advanced Analytics for the Modern Business Analyst is a 5-week dynamic, instructor-led Live Web class that teaches how to solve problems with data analysis - like that extra semester of graduate training you wish you had.
- Seeking experts, instructors for Online Data Analytics Program - Oct 6, 2013.
Davenport University seeks subject matter experts and instructors for the Online Data Analytics Program. Instructors should be familiar with WEKA and willing to teach Online.
Meetings
- KDD 2013 videolectures: watch the top researchers in Data Mining and Knowledge Discovery - Oct 1, 2013.
A treasure trove of the latest Data Mining and Data Science research is now available, with videolectures of KDD-2013, the ACM SIGKDD Conference on Knowledge Discovery and Data Mining, held recently in Chicago.
- Q4 Meetings in Analytics, Big Data, Data Mining, and Data Science - Oct 3, 2013.
Many interesting upcoming meetings in Q4 2013, including Discovery Science, IEEE Big Data, ACM Mining Big Data Camp, Big Data Techcon, SAS Analytics 2013, PAW London, Strata + Hadoop World NYC, AusDM, Big Data Festival, Text Analytics Summit West, ICDM 2013, Toronto Data Marketing Conference, and many more.
Jobs
- Comprehensive Health Planner II at State of Maine, Department of Health and Human Services, Augusta, ME - Oct 3, 2013.
Complete data analysis of program information, create standardized and ad hoc reports and queries from multiple data sources, and develop and maintain the Division's website. Apply by Oct 18.
- Software Engineer, financial applications at Thasos, Cambridge, MA - Oct 2, 2013.
Thasos, founded by top scientists from MIT Media Lab and Sense Networks, combines and analyzes non-financial big data sources in order to measure real-time company fundamentals and macro-economic developments.
- Big Data Architect at Thasos, Cambridge, MA - Oct 2, 2013.
Thasos, founded by top scientists from MIT Media Lab and Sense Networks, combines and analyzes non-financial big data sources in order to measure real-time company fundamentals and macro-economic developments.
- Data Mining Scientist at Apple, Austin, TX - Oct 2, 2013.
Apple Data Mining Lab looks for an outstanding data scientist to design, develop, and field data mining solutions with direct and measurable impact.
- Senior Data Scientist at ISO Innovative Analytics, a unit of Verisk, San Francisco, CA - Sep 30, 2013.
ISO Innovative Analytics develops advanced analytical solutions to challenging problems related to risk and decision-making for P&C, healthcare, and finance. Join us and help develop our next generation solutions.
- Sr. Data Mining Analyst at Genworth Financial, Richmond, VA - Sep 29, 2013.
Applying advanced statistical techniques to build predictive models that model customer behavior to improve the effectiveness of both direct mail and digital marketing campaigns.
- Data Analytics and Optimization at ExxonMobil, Clinton, NJ - Sep 25, 2013.
Join a dynamic group of scientists performing breakthrough research in machine learning, data-mining and mathematical optimization to solve our most challenging problems.
Academic/Research positions
- Open Rank Faculty of Business Practice at University of Miami, School of Business Administration, Coral Gables, FL - Oct 4, 2013.
We are particularly interested in individuals who have extensive experience in business and marketing analytics, data visualization, data mining and machine learning.
- Faculty, Data Analytics at UTexas, Austin, TX - Oct 3, 2013.
The School of Information at U. of Texas at Austin looks for full-time, tenure-track junior and senior faculty, especially in the areas of data analytics, human-computer interaction, and archival studies.
- Operations and Information Management Faculty at The Wharton School, U. of Pennsylvania, Philadelphia, PA - Sep 26, 2013.
Seeking applicants for a full-time, tenure-track faculty position at any level, with a focus on decision-making, information technology, information-based strategy, operations management, and operations research.
- Postdoc in machine learning/data mining/personalization at U. California, Riverside, Riverside, CA - Sep 24, 2013.
In collaboration with Dr. Michael Pazzani, the postdoc will be conducting research on recommendation systems & personalization in the mobile context and/or data mining of spatial temporal, textual, or multi-media data.
Competitions
- LSHTC4: Large Scale Hierarchical Text classification - Oct 7, 2013.
This challenge has three tracks and is based on two very large, multi-class, multi-label and hierarchical datasets created from the ODP web directory (DMOZ) and Wikipedia.
Publications
- Circle of Trust and Google Plus - Oct 6, 2013.
Circle of Trust measures how asymmetric your Google+ relationship network is, and we give you ways to visualize it with D3.
- CIO Review 20 Most Promising Big Data Companies - Oct 5, 2013.
CIO Review selects 20 most promising Big Data companies, from Actian to Zementis, that have achieved significant momentum and will rise above the rest.
- Birthdate of Predictive Analytics - Sep 26, 2013.
When is the Birthdate of Predictive Analytics? Is it Jan 15, 2003 in SPSS Presentation or earlier?
- Social Media Analytics, free e-book - Oct 3, 2013.
This book gives an overview of Social Media Analytics, covering techniques, theory and applications, and examines the economic impact of social networks and appropriate analytics methods.
Top Tweets
- Top KDnuggets tweets, Oct 4-6: Data science source code snippets; To Hadoop or Not to Hadoop? - Oct 7, 2013.
Sample source code for various data science tasks and projects; To Hadoop or Not to Hadoop? Questions to determine if you need Hadoop; Big Data experts get big salaries - $115K on average; Data Mining reveals the emotional differences in emails written by Men and Women
- Top KDnuggets tweets, Oct 1-3: Data Science with R: Rattle; KDD 2013 videolectures - Oct 4, 2013.
Data Science with R: Getting Started with Rattle - a survival guide; KDD 2013 videolectures: the top researchers in Data Mining, Data Science; Statistical Modeling: The Two Cultures, by Leo Breiman; Social Media Analytics, free e-book, an overview of theory, applications, and economics
- Top KDnuggets tweets, Sep 27-30: Data Mining and Analysis, free book (draft) download; Random Forests Algorithm - why it works - Oct 1, 2013.
New Book: Data Mining and Analysis: Fundamental Concepts and Algorithms, free PDF dow; Random Forests Algorithm - what is it, why does it work so well; Penn researchers use Facebook data to predict users age, gender, personality; Google Hummingbird is a completely new search algorithm and incredibly no one noticed
- Top KDnuggets tweets, Sep 25-26: ganalytics: Google Analytics with R; Getting a Free Data Science Education - Sep 27, 2013.
ganalytics: Google Analytics with R and tips to newcomers to help succeed in using R; Getting a Free Data Science Education, from Coursera, Caltech and MIT, plus free textbooks; Who says #BigData is not useful? Stanford researchers map user experience, beer preference; Nate Silver on Finding a Mentor, Not Settling in Your Career
- Top KDnuggets tweets, Sep 23-24: Machine Learning to create video summaries; Google replaces MySQL with MariaDB - Sep 25, 2013.
Wow! New research uses Machine Learning to provide brief video digest; Google moves to MariaDB to replace MySQL (still open-source, but Oracle-controlled); Yottamine sets #BigData Machine Learning speed record with 640 cores; Facebook chases Google Deep Learning with new research group
News Briefs
- September Analytics, Big Data, Data Mining companies and startups - Oct 5, 2013.
The September 2013 acquisitions, startups, and company activity in Analytics, Big Data, Data Mining, and Data Science: SAP buys KXEN, Rocket Fuel IPO, Clarabridge Indisys, Narrative Science, Practice Fusion and more.
- WibiData releases Kiji Chirashi framework for Big Data Applications - Oct 4, 2013.
Kiji is an open source framework for building big data apps with Apache HBase, launched by WibiData to fill the gap between a key-value store functionality and the needs of a predictive modeling application.
- Colombia OHCHR RFP: Pilot Text Mining Project - Oct 3, 2013.
Colombian UN Office for Human Rights is looking for proposals for a pilot project using text-mining software to effectively explore and analyze scanned text documents.
- Predixion Launches OEM Predictive Analytics Program - Sep 30, 2013.
Predixion's enterprise-class, extensible predictive analytics platform was designed to support the needs of an embedded customer, from an Excel-based modeling environment to one-button Insight Now predictions.
- Datameer Smart Analytics for Hadoop - Sep 30, 2013.
New self-service data mining functionality lets business users find patterns and relationships in their data without a data scientist.
- Explore Analytics offers Data Visualization in the cloud - Sep 27, 2013.
Explore Analytics is a self-service BI tool that lets you access, explore, analyze, and instantly create interactive visualizations and dashboards to share with your team.
- BigML machine learning platform Fall 2013 release - Sep 26, 2013.
This release is a substantial update to our hosted machine learning platform, with advanced text analysis, Microsoft Excel Export, Multi-label Classifications, BigML PredictServer, better UI and workflow, and many other improvements.
- Skytree Launches Second Opinion Predictive Analytics Program - Sep 24, 2013.
Now in beta, the Second Opinion program pairs data scientists with Skytree experts to ensure maximum accuracy and performance, and gives data scientists the means to test their own predictive analytics programs using unique and cutting-edge machine learning methods.
CFP - Calls for Papers
- HI 2013: 1st Workshop on Histoinformatics, due Oct 6
- SBP14: Social Computing, Behavioral-Cultural Modeling, & Prediction, due Nov 8
- DBKDA 2014: Advances in Databases, Knowledge, and Data Applications, due Nov 28
- EJCS-DM: European J of Cultural Studies special issue on Data Mining / Analytics, due Dec 9
- ECSM 2014: European Conf. on Social Media, due Dec 19
- PILOT: Privacy and Security in Location-based Social Networks, special issue of Transactions on Data Privacy, due Dec 31
Quote
"You have collected all there is - all the data there is about a phenomenon", write Viktor Mayer-Schoenberger and Kenneth Cukier in their book on Big Data. I strongly disagree. Gregory Piatetsky