KDnuggets™ News 14:n01, Jan 8
Features (13) | Software (2) | Courses, Events (1) | Meetings (4) | Jobs (5) | Academic (1) | Competitions (1) | Publications (7) | Tweets (6) | NewsBriefs (3) | CFP (14) | Quote
- New Poll: Data Science Skills - Individual vs Team Approach - Jan 7, 2014.
Data Science positions and tasks need a rare combination of Statistics, Hacking, Database, Business, and other skills. The new KDnuggets Poll asks which approach is better for filling Data Science positions - Individual or Team?
- Top Datasets on Reddit - Dec 28, 2013.
Most popular datasets on Reddit include NFL Game Metadata, Reddit top 2.5 Million posts, Zillow housing prices, and, of course, a database of cat pictures.
- "Data Scientist" catches "Statistician", surpasses "Data Miner" - Dec 22, 2013.
The rapidly rising term "Data Scientist" caught up with "Statistician" and surpassed "Data Miner" on Google Trends. However, "Statistics" remains a lot more popular than "Data Science", which raises the question: What do Data Scientists do? Clearly, it is not Data Science.
- PAW: Predictive Analytics World for Manufacturing, Chicago, June 17-18 - Jan 7, 2014.
The Predictive Analytics World for Manufacturing conference will focus on several different application areas of manufacturing where predictive analytics and big data have proven to be game changers.
- Unicorn Data Scientists vs Data Science Teams - Dec 30, 2013.
A recent post has generated an intense discussion about whether to look for "unicorn" data scientists who combine all the needed skills, or whether that skillset is best filled by a team. Here are the highlights, including a proposal for how to train well-rounded data scientists.
- Biernbaum: Data Science is going 99% too fast - Jan 3, 2014.
Currently Data Science (and Big Data) is a train that is going 99% faster than the rails can support. It can derail and become a failed fad. To succeed, Data Science and Big Data need to specialize.
- What is Wrong with the Definition of Data Science - Dec 24, 2013.
A veteran statistician argues that 3 different areas usually included in "Data Science" require dramatically different skills, education, and training, with very little overlap.
- DMCS 2013 Data Mining Case Studies Practice Prize Winners - Dec 22, 2013.
The DMCS (Data Mining Case Studies) 2013 Practice Prize was awarded at the ICDM 2013 conference for work on a novel and successful credit card fraud detection system, implemented in a Turkish bank. The Prize was partially sponsored by KDnuggets.
- Top stories for Dec 29 - Jan 4: Unicorn Data Scientists vs Data Science Teams; Top Datasets on Reddit - Jan 5, 2014.
Unicorn Data Scientists vs Data Science Teams; Top Datasets on Reddit; A Programmer Guide to Data Mining - Free Download
- Top stories in December: A Programmer Guide to Data Mining - Free Download; 3 Stages of Big Data - Jan 2, 2014.
New Book: A Programmer Guide to Data Mining - Free Download; Open Source Data Science MS Curriculum; 3 Stages of Big Data; Top 2013 LinkedIn Groups; R leading, but Python is gaining
- Additions to KDnuggets in December - Jan 1, 2014.
MicroStrategy Analytics Express and Data Mining Services, Skytree Machine Learning platform, Ubiq Analytics, and more education, meetings, and software.
- Top stories for Dec 22-29: Data Mining Applications with R; "Data Scientist" catches up with "Statistician" - Dec 29, 2013.
Data Mining Applications with R; "Data Scientist" catches up with "Statistician", surpasses "Data Miner"; What is Wrong with the Definition of Data Science
- Top stories for Dec 15-21: R leading, Python gaining; Top LinkedIn Groups reanalyzed - Dec 22, 2013.
Poll Results: R has a big lead, but Python is gaining; Top 2013 LinkedIn Groups for Analytics, Big Data; Predictive Analytics in 2014: Monetizing, Not Managing Big Data
- MADlib: Big Data Machine Learning in SQL for Data Scientists - Jan 6, 2014.
MADlib is open source with a commercially usable BSD license; it supports Postgres and Pivotal Greenplum DBMS, and provides classification, regression, clustering, topic modeling and other analytics for Big Data.
- BigML 2014 Winter Release: Faster, Easier, and more Programmatic Machine Learning - Jan 4, 2014.
BigML was used to create over 600,000 predictive models in 2013; the Winter 2014 release makes big advances in speed and programmability, and a new development mode allows you to run unlimited tasks of up to 16 MB for FREE.
- CUNY Online MS in Data Analytics - Jan 7, 2014.
CUNY SPS offers a fully online and affordable MS in Data Analytics that prepares graduates to manage and analyze Big Data. Learn more at the Jan 14 info session.
- INFORMS Business Analytics and Operations Research, Boston, Mar 30 - Apr 1 - Jan 7, 2014.
INFORMS is a top analytics conference featuring over 100 practical talks that provide pointed "how we did it" information from hand-picked industry speakers. Keynote speakers include industry thought leader Tom Davenport. Sign up now at the super saver rates.
- BigData TechCon: Learn HOW TO Master Big Data, Mar 31-Apr 2, Boston - Jan 6, 2014.
Big Data TechCon, Mar 31-Apr 2, Boston, is the HOW-TO big data event. Use code BIGDATA for $200 discount. Save even more with early bird by Jan 24.
- KNIME User Group Meeting, Zurich, Feb 12-14 - Jan 4, 2014.
This annual event serves as an international forum for discussion of KNIME and how it is used in various fields such as business and customer intelligence, analytics, and the life sciences. Early bird rates before Jan 27, 2014.
- Jan-Apr Meetings in Analytics, Big Data, Data Mining, and Data Science - Jan 3, 2014.
Upcoming meetings include Oracle BIWA Summit, EGC, Sentiment Analysis Symposium, PAW San Francisco, GigaOM Structure Data, INFORMS, SBP, and SIAM Data Mining.
- Data Scientist for Oil and Gas at Ayasdi, Palo Alto, CA - Jan 4, 2014.
Are you a data ninja? Do you dream about solving puzzles at night? Join us and help us solve some of the most vexing problems.
- Health Sys Analyst Programmer II 1311219 at Vanderbilt U. & Medical Center, Nashville, TN - Jan 3, 2014.
Analyze operational processes and information generation or utilization. Develop, test, implement and support computer-based information systems. Assist lower level analyst programmers with technical problems.
- Tower Project Developer Consultant at UNICEF, New York, NY - Dec 21, 2013.
Help with the real-time prototyping of The Tower Project, which monitors and aims to predict natural and man-made disasters by looking at data such as volume of calls and SMS.
- Analytics Senior Manager at Charles Schwab, Englewood, CO or San Francisco, CA - Dec 20, 2013.
Partner with business teams to understand objectives and scope analytical projects that deliver insights and results; work in a cross-functional manner with other consultants, analysts, statisticians, data engineers, and external vendors to deliver insights and solutions.
- Software Engineer - Machine Learning, Data Science at WhitePages, New York, NY - Dec 19, 2013.
Be instrumental in defining, driving and extending the vision for WhitePages data and help identify new ways to improve the value of our data through freshness, accuracy, breadth, and depth.
- Postdoc positions in Natural Language Processing at KU Leuven, Leuven, Belgium - Dec 28, 2013.
KU Leuven has a postdoc fellowship position, Knowledge Acquisition for Automated Natural Language Understanding, and a PhD position, Intelligent Aids for Multilingual Information Processing.
- UMich Competition for graduate students / postdocs using SEARCH - Dec 19, 2013.
SEARCH is a statistical technique for understanding complex interactions among explanatory variables in describing a wide variety of phenomena. Awards for US grad students/postdocs trying to understand complex interactions in large databases.
- IEEE Rock Stars of Big Data Presentations - Jan 7, 2014.
This event, held at the Computer History Museum in Oct 2013, attracted a sold-out crowd who listened to 9 excellent speakers and leaders in the field - here are the presentations.
- Alpine Data Labs 2014 Predictions - Dec 27, 2013.
Data science is permeating every facet of our daily lives - from our culture to our classrooms. Look for data science to make an even greater impact in 2014.
- Highlights of Data Marketing 2013 Conference in Toronto - Dec 26, 2013.
Key themes were: Customer Obsessed Marketer, Segment of One, SoLoMo (Social, Local and Mobile), and Big Data - actionable insights and decision making.
- AnalyticsWeek 200 Thought Leaders in Big Data and Analytics - Dec 24, 2013.
AnalyticsWeek produces a list of 200 Thought Leaders on Twitter in Big Data and Analytics, which includes the usual suspects but also new names.
- New book: Data Mining Applications with R - Dec 23, 2013.
Covers 15 real-world applications of data mining with R, including R code and data, covering business background and problems, data extraction and exploration, data preprocessing, modeling, model evaluation, findings and model deployment.
- Vasant Dhar on "Data Science and Prediction" - Dec 21, 2013.
What do "Data Science" and #BigData mean? Is there something unique about them? What skills do "data scientists" need to be productive in a world deluged by data? What are the implications for scientific inquiry?
- FICO Lessons in Developing, Applying Decision Modelling Methods - Dec 21, 2013.
Analytically sophisticated businesses combine predictive analytics and decision models with optimization to solve complex problems and achieve good results. Top FICO expert explains.
- Top KDnuggets tweets, Jan 3-5: How to build a successful Data Science team; Netflix reverse engineered Hollywood - Jan 6, 2014.
How to build a successful #DataScience team; Netflix reverse engineered Hollywood to understand how people look for movies; 6000 Companies Hiring Data Scientists; An academic paper on how to search the Internet for evidence of time travel
- Top KDnuggets tweets, Dec 27 - Jan 2: Oryx: Simple large-scale machine learning; Sad state of sentiment analysis - Jan 3, 2014.
Oryx on GitHub: Simple real-time large-scale machine learning infrastructure; Sad state of sentiment analysis: LESS accurate than a coin toss? Practical Tools for Exploring Data and Models, by R wizard @hadleywickham; Top stories in December
- Top KDnuggets tweets, Dec 25-26: The emergence of Apache Spark; 5 Free Excel add-Ins for #BigData - Dec 27, 2013.
The emergence of Apache Spark is a key development for Big Analytics; 5 Free Excel add-Ins to help Marketers analyze #BigData; Key Skills of Top @kaggle Competitors: R (90%), Random Forests (60%); Netflix open sources Suro: data traffic "cop" which directs #BigData to destination
- Top KDnuggets tweets, Dec 23-24: New book: Data Mining Applications with R; Data Scientist catches up with Statistician - Dec 25, 2013.
New book: Data Mining Applications with R; Data Scientist catches up with Statistician; What is Wrong with the Definition of Data Science; Making sense of #BigData : mining Twitter names
- Top KDnuggets tweets, Dec 20-22: Data Mining Book Review: "Visualize This"; Top NYU Prof. on Data Science and Prediction - Dec 23, 2013.
Data Mining Book Review: "Visualize This" from @flowingdata; Top NYU Professor Vasant Dhar on Data Science and Prediction - what do they mean; Analysis reveals #MOOC problems: student participation drops dramatically.
- Top KDnuggets tweets, Dec 18-19: Poll Results: R has a big lead, Python is gaining; Who are Data Scientists? - Dec 20, 2013.
Poll Results: R has a big lead, but Python is gaining; Who are Data Scientists and why they are or are not unicorns; 2014 Predictions: Machine-generated data will grow; #BigData + Big Pharma = Big Privacy Catastrophe
- December Analytics, Big Data, Data Mining companies and startups activity - Jan 7, 2014.
December 2013 acquisitions, startups, and company activity in Analytics, Big Data, Data Mining, and Data Science: Talend, Palantir, KPMG, Datameer, Dell, Takadu.
- LinkedIn Hottest Skills of 2013 - Dec 20, 2013.
LinkedIn Hottest Skills of 2013 include Statistical Analysis and Data Mining, Perl/Python/Ruby, Business Intelligence, and several related ones.
- Big Data In 2014: 6 Bold Predictions - Dec 25, 2013.
New bold predictions include: More Hadoop projects will fail than succeed, The need for automated tools will become critical, and #BigData will fly to the cloud.
CFP - Calls for Papers
- WSDM-WS: Web-Scale Classification: Classifying Big Data from the Web, due Jan 6
- BGM 2014: WWW 2014 workshop on Big Graph Mining, due Jan 7
- HL 2014: Heterogeneous Learning, due Jan 10
- WebSci 14 W/T: ACM Web Science Conf. Workshop and Tutorial Proposals, due Jan 17
- ESaaSA: An international workshop on Emerging Software as a Service and Analytics on Cloud Computing, due Jan 21
- MLDAS: Machine Learning and Data Analytics Symposium, due Jan 24
- BDIFORS: Business Analytics Optimization and Big Data, due Jan 31
- CaRR: 4th Workshop on Context-awareness in Retrieval and Recommendation, due Feb 10
- WebSci 14: ACM Web Science Conf., due Feb 23
- IMMM 2014: Advances in Information Mining and Management, due Feb 28
- WI'14: Web Intelligence, due Mar 2
- VAST 2014: IEEE Conf. on Visual Analytics Science and Technology, due Mar 21
- DA 2014: DATA ANALYTICS 2014, The Third Int. Conf. on Data Analytics, due Mar 28
- PGM 2014: The Seventh European Workshop on Probabilistic Graphical Models, due May 12
"My current visualization of Data Science is a train that is going 99% faster than the rails can support." Mark Biernbaum, in a post on KDnuggets.