KDnuggets™ News 14:n10, Apr 30

Features (9) | Opinions (5) | Software (3) | News (6) | Webcasts (1) | Courses (3) | Meetings (4) | Jobs (10) | Academic (2) | Publications (4) | Tweets (6) | CFP (9) | Quote


Opinions and Interviews

Software


  • MLTK: Machine Learning Toolkit in Java - free download - Apr 27, 2014.
    MLTK is a collection of machine learning algorithms in Java, supporting Generalized Linear Models (Ridge, Lasso, Elastic Net), Regression Trees, Random Forests, and more. Free download under BSD license.
  • Apache Spark, the hot new trend in Big Data - Apr 18, 2014.
    Spark solves problems similar to those Hadoop MapReduce addresses, but with a fast in-memory approach and a clean functional-style API. Leveraging Hadoop YARN, Alpine has made it very simple to get started with Spark.
  • Examining GoodData Open Analytics Platform - Apr 16, 2014.
    KDnuggets examines the main features of GoodData Open Analytics Platform, its users, how it compares to the competition, and its future plans.

News


  • SIGKDD Data Science/Data Mining PhD Dissertation Award - Nominations Due Apr 30 - Apr 23, 2014.
    This annual award by ACM SIGKDD recognizes outstanding research by doctoral candidates in the fields of data mining, data science, and knowledge discovery. Nominations due Apr 30.
  • Big Data Leads Top Paying Skills - Apr 29, 2014.
    Big Data related skills led the list of top paying technical skills (six-figure salaries) in 2013. Several other useful insights are available in the Dice Tech Survey Report, available for free download.
  • Microsoft Expands Big Data Platform - Apr 21, 2014.
    Microsoft expands its data platform with 3 major features: SQL Server 2014 with in-memory technology, Azure Intelligent Systems Service, and Analytics Platform System (SQL Server + Hadoop). New CEO Satya Nadella gives a low-key but impressive presentation.
  • Big Data TechCon - Great How-To Conference - Apr 17, 2014.
    The recent BigData TechCon conference in Boston featured practical, how-to classes and tutorials for IT and Big Data professionals. It is the how-to training conference for professionals implementing and analyzing Big Data.
  • Top stories for Apr 20-26 - Apr 27, 2014.
    Elusive Data Scientists Driving High Salaries; Data Workflows for Machine Learning; New Book: Social Media Mining - free PDF download; Microsoft Expands Big Data Platform.
  • Top stories for Apr 13-19 - Apr 20, 2014.
    Top LinkedIn Groups in 2014 for Analytics, Big Data, Data Science; Data Analytics Handbook, free download; Apache Spark, the hot new trend in Big Data; GoodData Open Analytics Platform.

Webcasts and Webinars

Courses


  • USC Marshall MS in Business Analytics - Apr 24, 2014.
    USC Marshall's new MS in Business Analytics will give you the tools to leverage big and unstructured data for effective decision-making. Study full or part-time, and customize your degree to your career goals.
  • TESC Online, Affordable MBA in Data Analytics - Recharge Your Career - Apr 22, 2014.
    This online program enables graduates to use advanced data analytics to drive continuous improvement in business and organizations, and lets you earn credit for professional certifications/expertise.
  • UC Berkeley Master of Information and Data Science, Online - Apr 17, 2014.
    This online degree is for professionals who want to become leaders in the field of data science. Students benefit from UC Berkeley's strong ties to Silicon Valley and a multidisciplinary approach that teaches the entire data life cycle.

Jobs



  • Microsoft: Applied Researcher - Apr 26, 2014.
    Be at the forefront of Big Data, command many thousands of machines, process petabytes of data, and not just answer the question given but define what the right question to answer is.
  • Paychex: Manager, Risk Modeling & Review - Apr 25, 2014.
    Lead efforts to plan, design strategy, build, deploy and monitor predictive models to leverage revenue opportunities, and mitigate risks.
  • Great West Casualty Company: Predictive Modeler - Data Analyst - Apr 22, 2014.
    Research, analyze, and develop predictive models in support of the company mission: to be the premier provider of insurance products and services for truckers.
  • Apple: Data Scientist - iTunes - Apr 20, 2014.
    Apple has a tremendous amount of data, and we have just scratched the surface in pattern detection, anomaly detection, predictive modeling, and optimization. We encourage scientists to stay abreast of research by attending conferences and working with academia.
  • Apple: iAd - Senior Software Engineer - Apr 20, 2014.
    Apple advertising is redefining the advertising experience on mobile devices. Be part of a dynamic team building high performance and scalable applications.
  • Videology: Data Mining Scientist - Apr 17, 2014.
    Videology is a leading technology company in the digital advertising industry. Support data mining and machine learning efforts, both for research and for ad optimization products.
  • Amazon: Business Intelligence Engineer, Mobile Business Development - Apr 17, 2014.
    Talented BI Engineer who is passionate about using data to drive crucial business decisions regarding our activity in the mobile ecosystem.
  • Bosch: Data Mining Engineer - Big Data Infrastructure - Apr 17, 2014.
    Bring together disparate technologies and use data mining and analytics to solve business problems in Predictive Maintenance, Health Informatics, Vehicle Diagnostics, Manufacturing, and other domains.
  • Best Practice Partners: Principal Consultant, Clinical Innovations and Physician Engagement - Apr 16, 2014.
    Lead the development and delivery of the client company's clinical quality improvement and provider engagement services to assist clients in validating and ensuring improved outcomes.
  • Apple: BI Applications Developer - Apr 16, 2014.
    Have a startup mentality rather than an IT shop mentality, and develop solutions that create a variety of business analytics reports from vast amounts of data.

Academic/Research positions

  • NDSU: Informatics Postdocs - Apr 26, 2014.
    Join a dynamic team performing groundbreaking research in the area of combinatorial cheminformatics, and propose independent strategic research themes in energy-related computational chemistry.
  • Idiap/EPFL: PhD and Internship Positions in Social Computing - Apr 24, 2014.
    The Social Computing Research Group at Idiap/EPFL has 3 PhD positions and several internship positions to help research in social media, ubiquitous computing, and computational social science.

Publications


  • 9 Free Books for Learning Data Mining and Data Analysis - Apr 29, 2014.
    Whether you are learning data science for the first time, refreshing your memory, or catching up on the latest trends, these free books will help you excel through self-study.
  • Where are your users? Geo-localization with KNIME - Apr 28, 2014.
    Learn how KNIME can help you better understand your users through geo-localization of IP addresses and dynamic visualization. Access the free white paper for more details.
  • New Book: Social Media Mining - free PDF download - Apr 22, 2014.
    Social Media Mining integrates social media, social network analysis, and data mining to enable students, practitioners, researchers, and managers to understand the basics and potentials of this field.
  • Data Workflows for Machine Learning - Apr 20, 2014.
    Paco Nathan compares several open source frameworks for Machine Learning workflows, including KNIME, IPython Notebook and related libraries, Cascading, Cascalog, and Spark/MLbase, and proposes 9 criteria to evaluate the best alternatives.

Top Tweets

  • Top KDnuggets tweets, Apr 25-27 - Apr 28, 2014.
    Recommended Tutorials for Data Scientists from PyCon 2014; How One Woman Hid Her Pregnancy from #BigData; MLTK: Machine Learning Toolkit in Java - free download; Deep Learning for Natural Language Processing.
  • Top KDnuggets tweets, Apr 23-24 - Apr 25, 2014.
    #BigData Cartoon: "It does look similar - but this one is powered by Hadoop"; Great list: 9 Python Machine Learning Books; Why people are bad at technology predictions; Too busy recommending things to experience them.
  • Top KDnuggets tweets, Apr 21-22 - Apr 23, 2014.
    Sweet! Chocolate Consumption strongly correlated to Nobel Prizes; Cheat Sheets for Data Scientists; New Book: Social Media Mining - free PDF download; Elusive Data Scientists Driving High Salaries.
  • Top KDnuggets tweets, Apr 18-20 - Apr 22, 2014.
    Cross-validation pitfalls for regression/classification and how to avoid them; Data Workflows for Machine Learning; Apache Spark, the hot new trend in Big Data; Visual Analysis Best Practices - download a free guidebook from Tableau.
  • Top KDnuggets tweets, Apr 16-17 - Apr 19, 2014.
    Scikit-Learn: a great Python library for machine learning; A map of where nobody lives in the US; Apache Spark, the hot new trend in Big Data; NYU @aghose on Est. Demand for Mobile Apps - Learn more: NYU Stern MS in Biz Analytics.
  • Top KDnuggets tweets, Apr 14-15 - Apr 16, 2014.
    9 Free Books for Learning Data Mining and Data Science; Coursera #DataScience Specialization: 10 courses from JHU; Top LinkedIn Groups in 2014 for Analytics, Big Data, Data Mining, and Data Science; EMC Data Science and Big Data Analytics Offer.

CFP - Calls for Papers


"Lesson: If you are a CIO, clean up your goddamn room; you're not going out until you do!" - Michael Brodie, in KDnuggets Interview on Data Curation, Cloud Computing, Startup Quality, Verizon (part 2)