Data Science, Predictive Analytics Main Developments in 2016 and Key Trends for 2017

Key themes included the polling failures in the 2016 US elections, Deep Learning, IoT, a greater focus on value and ROI, and increasing adoption of predictive analytics by the "masses" of industry.

We recently asked some of the leading experts in Data Science and Predictive Analytics for their opinion on the most important developments of 2016 and key trends they expect in 2017.

See also a previous post Big Data: Main Developments in 2016 and Key Trends in 2017. A summary of AI & Machine Learning Main Developments and Key Trends will be published next week.

Some of the key themes that emerged are the polling failures in the 2016 US presidential election, Deep Learning, IoT, a greater focus on value and ROI, and increasing adoption of predictive analytics by the "masses" of industry.

Here is what the experts thought on Data Science, Predictive Analytics Main Developments in 2016 and Key Trends in 2017.


Kirk D. Borne, Principal Data Scientist at Booz Allen, PhD Astrophysicist, Top Data Science/Big Data Influencer.

In 2016, I saw several significant data science-related developments, including
  • greater emergence of the citizen data scientist accompanied by a growth in self-service tools for analytics and data science;
  • deep learning being applied across a variety of use cases (including text analytics);
  • emergence of AI-driven chatbots in customer call centers and customer service touch-points;
  • more demands from organizations to see real ROI and benefits from big data and data science, with a focus on "proofs of value" instead of "proofs of concept"; and
  • machine intelligence becoming a significant component of processes, products, and technologies across a broad spectrum of use cases: connected cars, internet of things, smart cities, manufacturing, supply chain, prescriptive machine maintenance, and more.
In 2017, we expect to see greater expansion of edge analytics use cases: machine learning embedded with sensors or close to the point of data collection -- the machine learning may be invoked via APIs or in processors close to the data collector or integrated into the sensor chip architecture itself. The patterns, trends, anomalies, and emergent phenomena (BOI: Behaviors of Interest) that are discovered close to the edge will enable better and faster predictive and prescriptive analytics applications in many domains: cybersecurity, digital marketing, customer experience, healthcare, emergency response, engine performance, autonomous vehicles, manufacturing, supply chain, and more.

Tom Davenport, Distinguished Professor at Babson College, co-founder of the International Institute for Analytics, and a Senior Advisor to Deloitte Analytics.

2016 Developments
  • Decentralization of analytics groups: After a period of consolidation, organizations began to decentralize analytics to business units and functions, in many cases attempting to retain some degree of enterprise coordination.
  • Combinations of proprietary and open source technologies: Many large corporations are making use of proprietary and open source analytics and big data technologies, often combined within a single application.
  • Fragmentation of cognitive technologies: Large, monolithic cognitive technologies have been broken into a series of single-function APIs that can be combined to form complete systems.
  • Fuzzy quantitative roles: Quantitative analysts, data scientists, and developers of cognitive applications have become less distinguishable; clear roles and titles are a thing of the past.
2017 Trends
  • Operational cognitive applications: Cognitive will move from "science projects" to operational applications.
  • Questioning of model assumptions: Polling failures in the 2016 presidential election will lead more managers to question the assumptions behind analytical models.
  • Classification of cognitive tools: More organizations will understand and classify the different cognitive tools available to them and apply them to appropriate business problems.
  • Push for transparency: Owners of strategic and regulated machine learning applications will push for greater transparency in those applications, and will eschew less transparent algorithms.

Tamara Dull, Director of Emerging Technologies at SAS.

This year's "big data" event was the U.S. election cycle because it brought the big data/data science/predictive analytics discussion to the Public Square. Granted, these weren't the terms most folks were using, but in the U.S., we experienced data up close and personal: its role, its uses and misuses, its interpretations and misinterpretations, and its insights, right and wrong.

As data continues to pervade every aspect of our professional and personal lives, courtesy of the Internet of Things, both the private and public sectors will be pressured to ensure that the collection, use, and analysis of data are safe, secure, and ethical. If a company doesn't get it right, it will cease to exist.

John Elder, founder and chairman of Elder Research, the largest analytics consultancy in the US.

A year ago, Science magazine gave a "runner-up scientific breakthrough of 2015 award" to a study that attempted to replicate 100 top experiments published in psychology journals a few years earlier. But the researchers were able to replicate only 39. Bad as this is, it is feared to be much better than the track record for epidemiology, where published medical "discoveries" appear to be right only 5-35% of the time. Most of the problems of finding spurious correlations, I believe, are due to bad data science. Replacing outdated significance formulas with resampling procedures such as Target Shuffling would better calibrate how likely it is that random results as strong as the apparent discovery could arise, given the vast search performed by the researcher and the mining software. New criteria would be needed for publication worthiness, but results would be much more reliable, saving massive resources, and even lives.
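
For readers who have not met the idea, the resampling approach mentioned above can be sketched in a few lines of Python. This is only a rough illustration under our own assumptions, not Elder Research's implementation: the "search" here is simply picking the feature with the strongest absolute correlation to the target, and the names `correlation`, `best_abs_correlation`, and `target_shuffle_p_value` are hypothetical. The target is shuffled many times, the same search is re-run on each shuffled copy, and the share of shuffles that match or beat the real result estimates how often chance alone would produce a "discovery" that strong.

```python
import random
from statistics import mean


def correlation(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / (vx * vy) ** 0.5


def best_abs_correlation(features, target):
    """Stand-in for the 'search': strongest |correlation| of any feature with the target."""
    return max(abs(correlation(f, target)) for f in features)


def target_shuffle_p_value(features, target, n_shuffles=1000, seed=0):
    """Return (observed best score, estimated chance of a score that strong at random).

    Shuffling the target breaks any real link to the features, so scores found
    on shuffled targets show what the search turns up from noise alone.
    """
    rng = random.Random(seed)
    observed = best_abs_correlation(features, target)
    shuffled = list(target)
    hits = 0
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)
        if best_abs_correlation(features, shuffled) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return observed, (hits + 1) / (n_shuffles + 1)


if __name__ == "__main__":
    # Made-up data: 20 pure-noise features and a pure-noise target, 30 rows each.
    rng = random.Random(42)
    features = [[rng.gauss(0, 1) for _ in range(30)] for _ in range(20)]
    target = [rng.gauss(0, 1) for _ in range(30)]
    score, p = target_shuffle_p_value(features, target, n_shuffles=500)
    print(f"best |correlation| found: {score:.2f}, chance of that from noise: {p:.2f}")
```

Because the whole search is repeated on every shuffle, the estimate accounts for how many candidate features were tried, which a single-test significance formula does not.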

Anthony Goldbloom, Co-founder and CEO of Kaggle, the leading Data Science competition platform.

Companies like Airbnb, Climate Corporation (now Monsanto) and Opendoor are great examples of how data science can have a big impact. They have built strong data science teams that impact decisions across their companies. In 2017, we'll see those companies lead the way in adopting tools and processes that solve some of the big pain points in doing data science: particularly sharing and collaborating on data science workflows and pushing models into production. In 2016, the hot topics in academic research moved from deep neural networks to reinforcement learning and generative models.

In 2017, we should start to see some of these techniques used for pragmatic business use cases. Some of the promising areas for reinforcement learning include algorithmic trading and ad targeting.