The most popular languages continue to be R (used by 61% of KDnuggets readers), Python (39%), and SQL (37%). SAS is stable at around 20%. The highest growth was for Pig/Hive/Hadoop-based languages, R, and SQL, while Perl, C/C++, and Unix tools declined. We also find a small affinity between R and Python users.
Stanford Data Mining and Statistics Online Courses; Data Scientists Guide to Making Money from Start-ups; 2013 Acquisitions in Analytics and Big Data
Top jobs: Research Scientist: Data Mining at Bethesda company, Bethesda, MD; Data Mining Programmer at Real Time Data Solution, Toronto, Canada.
Could a modestly funded group deliver nation-state type effects using only public data? This DARPA SBIR call seeks to investigate the US national security threat posed by public data and to develop tools to characterize and assess the nature, persistence, and quality of the data. Opens Aug 26, closes Sep 25, 2013.
We review 2013 acquisitions in Analytics and Big Data by Actian, EMC, Facebook, Google, IBM, Twitter, WalmartLabs, and more. What is an engineer worth in an acqui-hire?
The focus of this competition is on applying knowledge discovery techniques to protect personal computer information through detection, prevention, and response to various attacks.
Read the discussion on the half-life of a buzzword and on whether "Data Science" is replacing "Business Analytics" as the popular degree title for people interested in data and analytics.
Nate Silver at JSM: 11 statistics principles for journalists; Mining a Data Mining Conference: Analytics on KDD-2013; Coursera Andrew Ng: Education for Everyone
Top jobs: Software Developer, Machine Learning at SGI; Data Mining Programmer at Real Time Data Solution
My report on the KDD-2013 keynote talk by Coursera co-founder Andrew Ng on Coursera's far-reaching experiment in education, which collected more educational data in one year than all the universities in the history of mankind. Andrew Ng believes that great education should not be only for the privileged but should be a fundamental human right.
How should data scientists think about starting or joining a start-up? We summarize the advice from a high-powered KDD-2013 panel of leading data scientists/entrepreneurs who share their start-up experience.
We look at interesting analytics and statistics from the KDD-2013 Conference on Knowledge Discovery and Data Mining. Which topics are hot, and which are most likely to be accepted?
REEF (Retainable Evaluator Execution Framework) is a big data framework that sits on top of Hadoop's new YARN resource manager and is especially well suited for building machine learning jobs.
This 4-minute survey aims to measure how satisfied you are with your data systems, reporting, and analytics tools. Please take part - answers will be published on KDnuggets.
The DMA annual analytics challenge, open to academia and industry and sponsored by Cleveland Clinic, will require participants to solve a patient re-activation problem.
MarineExplore.org is an Open Data spatio-temporal data platform designed for secure data management and analytics on distributed sources, without ever relocating the data.
The Age of Big Data - BBC Documentary; 10 Enterprise Predictive Analytics Platforms Compared; RapidMiner and Big Data - In-Memory, In-Database, and In-Hadoop
Top jobs: Data Mining, Research SDE at Bing; Analyst - Web Commerce/Marketing at UFC.
This competition goes beyond predictive modeling and delves into the optimization of the flight patterns that participants were asked to predict in the first contest.
Develop novel thinking for fusion of background radiation measurements, GPS, high-resolution video, and LIDAR and propose algorithms for detection, localization, and identification of radiation anomalies. Submissions due Aug 26.
IKANOW Infinit.e is a scalable framework for collecting, storing, processing, retrieving, analyzing, and visualizing unstructured documents and structured records, available as a free community edition and an enterprise edition, with a developer API.
RapidMiner offers flexible approaches to remove any limitations in data set size. This paper compares 3 RapidMiner engines: In-Memory, In-Database, and In-Hadoop.
The Age of Big Data - BBC Documentary; McKinsey eBook (free): Big Data, Analytics, and the Future of Marketing and Sales; Data: Portals, Government, State, City, Local, and Public
Top jobs: Data Scientist, Strategic at Groupon, Palo Alto, CA; Data Scientist at Groupon, Seattle, WA.
CHEMDNER is a community challenge on named entity recognition of chemical compounds, intended to promote systems that can detect mentions of chemical compounds and drugs in text.
Are you a grad student or postdoc with an interesting research proposal on NLP, machine learning, or data mining in healthcare? An industry partner wants to talk.
HPCC Systems 4.0 is an open-source, enterprise-proven platform for 24/7 Big Data analysis. New features include an Eclipse plugin, improved machine learning, and support for Java, Python, and R.
Lavastorm Analytics Engine breaks down data silos, giving business users the ability to acquire, integrate, and analyze data 10 times faster than traditional tools. As a first step, read "Breaking Through the Analytics Limitations of Access and SQL" and try our Lavastorm Free for Life software yourself.
KDnuggets Big Data Science Summer Reading List; DataMind: FREE Online Interactive Learning Platform for R; 5 Roles You Need on Your Big Data Team
Top jobs: Statisticians at AIG; PhD Student, Mixing Meta-Modeling and Data-Mining