KDnuggets™ News 14:n27, Oct 22




  • Overcoming Text Analytics Barriers - Oct 17, 2014.
    Getting value from a company's text assets can be both time-consuming and expensive. Learn how to overcome these barriers with the "Overcoming Text Analytics Barriers" whitepaper, and at Text Analytics Summit West in San Francisco, Nov 4-5. KDnuggets discount.
  • DataLadder outperforms IBM and SAS in Record Linkage - Oct 16, 2014.
    Data scientists from the Centre for Data Linkage at Curtin U. found that Connecticut-based firm Data Ladder has outperformed several major companies on record linkage.
  • ADW, free software to measure semantic similarity - Oct 13, 2014.
    ADW is software for measuring the semantic similarity of arbitrary pairs of lexical items, from word senses to texts, based on "Align, Disambiguate, and Walk", a WordNet-based state-of-the-art semantic similarity approach. Get it on GitHub.
  • Interactive Network and Graph Data Repository - Oct 17, 2014.
    The network repository currently hosts more than 500 graphs/networks that span 19 collections of graphs from social science, machine learning, scientific computing, and many others.
  • Develve statistical software, free for non-commercial use - Oct 10, 2014.
    Check out Develve 2.0, a Six Sigma tool. The new version features new utilities for measurement system analysis and the design of sophisticated experiments.





Webcasts and Webinars




  • Apple: Data Analyst - Oct 20, 2014.
    The Maps team is looking for self-motivated team players who are fascinated by data, are curious about patterns and anomalies, and want to derive insights from data to improve our products.
  • Adobe: Sr. Machine Learning Engineer (C++ / Big Data) - Oct 17, 2014.
    Be part of the development team that produces a complex, high-performance analytics product called Adobe Data Workbench. Have superior problem-solving skills and a knack for developing large-scale, high-performance, complex systems.
  • Catalytic DS: Biomedical Text Mining Developer - Oct 16, 2014.
    Help develop cloud-based text analytics solutions that enable researchers to use biomedical information locked in vast repositories of 'read only' scientific publications.
  • Pacific Life: Sr. Data Scientist - Oct 15, 2014.
    Analyze big data to gain a better understanding of our customers - how they interact with our company and our products and what they desire moving forward.
  • ArrowStreet Capital: Research Associate - Oct 14, 2014.
    Join the investment research team and support coding new ideas into signals, testing them, and producing return and risk forecasts to drive trading decisions.
  • Analatom: Artificial Intelligence / Data Mining Engineer (US Permanent Residency or Citizenship Required) - Oct 12, 2014.
    Work with large data sets, complex algorithms, and team members from a variety of specialties to solve hard problems. Support a range of clients, including front-line analysts, researchers, and senior leadership.
  • GuideOne: Lead Predictive Modeler - Oct 11, 2014.
    Responsible for advancing the use of predictive models and analytics, project management, planning and delivering predictive models and statistical analysis.
  • Booking: Product Owner - Data Science - Oct 11, 2014.
    Use our data to create products and features improving our customers' experience, with a strong focus on driving conversion and customer loyalty.
  • Alibaba: Senior Data Scientist - Oct 9, 2014.
    Develop rich insight into consumer behaviors, preferences, and experiences from vast resources to improve the customer experience across a broad range of areas.

Academic and Research positions


Top Tweets

  • Top KDnuggets tweets, Oct 17-19 - Oct 20, 2014.
    Air traffic data analyzed to predict Ebola spread;
    Some cool public data sources you can use for your next data science project;
    Data science can't be point and click! Finding random correlations is too easy;
    Bayes Rule in an animated gif.
  • Top KDnuggets tweets, Oct 15-16 - Oct 17, 2014.
    STOP and THINK, sometimes the simplest caption is the best;
    This model tracks Ebola outbreak well so far, predicts Ebola to burn out in December;
    BAH launches online course "Explore Data Science";
    Watch: R wizard Hadley Wickham dplyr tutorial at useR! 2014 conf.
  • Top KDnuggets tweets, Oct 13-14: Data mining classics - Oct 15, 2014.
    Also - The Open Source Data Science MS Curriculum: UW/Coursera + Harvard;
    Statistical Modeling vs Machine Learning - mapping the terms and concepts;
    Very useful! Python 2.7 Quick Reference Sheet.
  • Top KDnuggets tweets, Oct 10-12 - Oct 13, 2014.
    7 Most Data Rich Companies in the World;
    R and #DataScience Webinar slides - status, why, code examples;
    Another list of 200+ #BigData thought leaders to follow on Twitter;
    Popular #BigData predictive apps and APIs.
  • Top KDnuggets tweets, Oct 8-9 - Oct 10, 2014.
    IBM #Watson presentation: Clinical data determines only 10% of health;
    A @Kaggle hero 100-line Python code for online logistic regression;
    The Winner of Kaggle Criteo Data Science on his Odyssey;
    For Data Viz lovers: Keynote by Tableau CEO Christian Chabot on "Art of Analytics".
  • Top KDnuggets tweets, Oct 6-7 - Oct 8, 2014.
    Great TED talk by @KnCukier "Big Data is better data";
    Top 10 One-Person Startups;
    7 critical elements of effective dashboards and visualizations;
    Making Sense of Public Data - Wrangling Jeopardy.

CFP - Calls for Papers


Ebola is presently an example of "small data" (let's hope it stays this way), but its data has many problems which lend themselves to useful data science lessons that can be applied to Big Data as well. Gregory Piatetsky, Oct 2014.