Industry Predictions: Key Trends in 2017

With 2017 almost upon us, KDnuggets brings you opinions from industry leaders on what the most relevant and important key trends of 2017 will be.



At KDnuggets, we try to keep our finger on the pulse of main events and developments in industry, academia, and technology. We also do our best to look forward to key trends on the horizon.

In this post we present predictions from those in industry; they do not answer a prescribed question, but rather address what to look out for in different sectors in the upcoming year. Quotes are organized alphabetically by the name of the company on whose behalf they were submitted, and we have reserved the right to edit them (extract excerpts) for both content and length.

As the predictions come from across industry and reach into its many different niche sectors, there is no general consensus or over-arching theme herein, which makes intuitive sense. You will, however, read about the impact of GPUs, the future of the data lake, both NoSQL and SQL, IoT, machine learning, deep learning, and much more.


Adam Wray, CEO, Basho

In 2017, NoSQL’s coming of age will be marked by a shift to workload-focused data strategies, meaning executives will answer questions about their business processes by examining the data workloads, use cases and end results they’re looking for. This mindset is in contrast to prior years when many decisions were driven from the bottom up by a technology-first approach, where executives would initiate projects by asking what types of tools best serve their purposes. This shift has been instigated by data technology, such as NoSQL databases, becoming increasingly accessible.

In 2017, organizations will stop letting data lakes be their proverbial ball and chain. Centralized data stores still have a place in initiatives of the future: How else can you compare current data with historical data to identify trends and patterns? Yet relying solely on a centralized data strategy will ensure data weighs you down. Rather than a data lake-focused approach, organizations will begin to shift the bulk of their investments to solutions that enable data to be utilized where it is generated and where business processes occur: at the edge. In years to come, this shift will be understood as especially prescient, as edge analytics and distributed strategies become increasingly important parts of deriving value from data.

Andrew Brust, Senior Director, Market Strategy and Intelligence, Datameer

In 2017, the reports of Big Data’s death will be greatly exaggerated, as will the hype around IoT and AI. In reality, all of these disciplines focus on data capture, curation, analysis, and modeling. The importance of that suite of activities won’t go away unless all businesses cease operation.

Joanna Schloss, Director of Product Marketing, Datameer

1. Hadoop distribution vendors will have crossed the chasm — unstructured data in Hadoop is a reality. But since the open source problem has not been addressed, they aren’t making much money. As such, many of these vendors will be acquired by bigger players, and the bigger ISV Hadoop vendors will band together to create larger entities in hopes of capitalizing on economies of scale.

2. Data preparation will become more of a feature rather than a market as big data analytics continue to evolve both in product offerings and market share. As such, there may be a consolidation in the marketplace as companies start to acquire product offerings in this area as well as customer lists from small, niche vendors.

3. Artificial intelligence, machine learning, and advanced analytics will become more complex as people start to realize the true potential of these disciplines. All three areas require an excellent understanding of big data and big data analytics, and they may eventually evolve into a master discipline of Analytics — or maybe we will coin a new term for it in the near future.

4. By the end of 2017, the idea of deep learning will have matured and true use cases will emerge. For example, Google uses it to look at faces and then determine whether the face is happy, sad, etc. There are also existing use cases in which police are using it to compare a “baseline” facial structure to “real-time” facial expressions to determine intoxication, duress, or other potentially adverse activities.


Eric Mizell, Vice President, Global Solutions Engineering, Kinetica

Trend #1: Real Change is Coming to Real-time Intelligence in 2017 with GPUs

Graphics Processing Units (GPUs) are capable of delivering up to 100 times better performance than even the most advanced in-memory databases that use CPUs alone. The reason is their massively parallel architecture, with some GPUs containing over 4,000 cores, compared to the 16-32 cores typical in today’s most powerful CPUs. The small, efficient cores are also better suited to performing similar, repeated instructions in parallel, making GPUs ideal for accelerating the compute-intensive workloads required to analyze large streaming data sets in real time.
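As a rough illustration of the kind of data-parallel work being described (this sketch is ours, not Kinetica’s, and assumes a CUDA-capable GPU with the CuPy library installed), the same reduction that occupies a handful of CPU cores can be spread across thousands of GPU cores:

```python
# Minimal sketch (not from the original text): the same element-wise transform and
# reduction run on the CPU (NumPy) and on the GPU (CuPy). This is the repeated,
# data-parallel pattern that GPU-accelerated databases exploit.
import numpy as np
import cupy as cp

n = 50_000_000
cpu_data = np.random.random(n).astype(np.float32)
gpu_data = cp.asarray(cpu_data)                     # copy the array to GPU memory

cpu_result = (cpu_data * 2.0 + 1.0).sum()           # runs on 16-32 CPU cores at best
gpu_result = float((gpu_data * 2.0 + 1.0).sum())    # runs across thousands of GPU cores

print(cpu_result, gpu_result)
```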

Trend #2: The Cloud will get “turbo-charged” performance with GPUs

Amazon has already begun deploying GPUs, and Microsoft and Google have announced plans. These cloud service providers are all deploying GPUs for the same reason: to gain a competitive advantage. Given the dramatic improvements in performance offered by GPUs, other cloud service providers can also be expected to begin deploying GPUs in 2017.

Trend #3: GPU-accelerated databases will achieve enterprise-class capabilities

Certain enhancements in security and availability expected in 2017 will build on the foundation of the GPU’s proven performance and scalability to make GPU-accelerated databases enterprise-class. For security, support for user authentication and role- and group-based authorization will make GPU acceleration suitable for applications that must comply with security regulations, including those requiring personal privacy protections. For availability, data replication with automatic failover will make GPU-accelerated databases sufficiently reliable for even the most mission-critical of applications.

Jeff Catlin, CEO of NLP and sentiment analytics provider Lexalytics

1. Big Brother will take over more control of what we see and do, but Big Brother isn’t the government... it’s the big tech corporations: Google, Microsoft, Facebook, and Amazon.
2. AI will continue to get a “black eye” due to sexist and racist issues in large-scale training corpora.
3. AI will continue to be the hot funding item in VC/PE rounds, possibly approaching 25 percent of all funding events.
4. We expect three of the well-funded ML/AI companies to go out of business, while a number of the lesser-funded companies will not get off the ground.

Lloyd Tabb, Founder, Chairman & Chief Technology Officer, Looker

1) Moore’s Law holds true for databases

Per Moore’s law, CPUs are always getting faster and cheaper. Of late, databases have been following the same pattern.

In 2013, Amazon changed the game when it introduced Redshift, a massively parallel processing database that allowed companies to store and analyze all their data for a reasonable price. Since then, however, companies that saw products like Redshift as data stores with effectively limitless capacity have hit a wall. They have hundreds of terabytes or even petabytes of data and are stuck between paying more for the speed they had become accustomed to and waiting five minutes for a query to return.

Enter (or re-enter) Moore’s law. Redshift has become the industry standard for cloud MPP databases, and we don’t see that changing anytime soon. With that said, our prediction for 2017 is that on-demand MPP databases like Google BigQuery and Snowflake will see a huge uptick in popularity. On-demand databases charge pennies for storage, allowing companies to store data without worrying about cost. When users want to run queries or pull data, the database spins up the hardware it needs and gets the job done in seconds. These databases are fast and scalable, and we expect to see a lot of companies using them in 2017.
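To make the “spin up hardware on demand” point concrete, here is a minimal sketch (ours, not Looker’s) of running an ad-hoc query against an on-demand warehouse such as BigQuery with Google’s google-cloud-bigquery Python client; it assumes credentials are already configured, and the public table referenced is only illustrative:

```python
# Hedged sketch (not from the original text): an ad-hoc analytical query against
# an on-demand MPP warehouse. Storage is billed separately and cheaply; compute
# is provisioned per query, so there are no idle servers to pay for.
from google.cloud import bigquery

client = bigquery.Client()  # project and credentials are picked up from the environment

sql = """
    SELECT name, SUM(number) AS total
    FROM `bigquery-public-data.usa_names.usa_1910_2013`
    GROUP BY name
    ORDER BY total DESC
    LIMIT 10
"""

# The query runs on hardware BigQuery spins up behind the scenes.
for row in client.query(sql).result():
    print(row["name"], row["total"])
```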

2) SQL will have another extraordinary year

The innovations we're seeing are blowing our minds. With BigQuery, Google has created a product that is essentially infinitely scalable (the original goal of Hadoop) AND practical for analytics (the original goal of relational databases).

SQL engines for Hadoop have continued to gain traction. Products like SparkSQL and Presto are popping up in enterprises and as cloud services because they allow companies to leverage their existing Hadoop clusters and cloud storage for speedy analytics. What’s not to love?
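As a quick illustration of SQL-on-Hadoop in practice, here is a minimal Spark SQL sketch (ours, not Looker’s) that queries Parquet files already sitting in a cloud object store; the bucket path and column names are hypothetical placeholders:

```python
# Hedged sketch (not from the original text): plain SQL over data that already
# lives in a Hadoop cluster or cloud object store, with no separate warehouse load.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("sql-on-hadoop").getOrCreate()

# Point Spark at existing Parquet files in object storage or HDFS (path is hypothetical).
events = spark.read.parquet("s3a://example-bucket/events/")
events.createOrReplaceTempView("events")

# Query the same files with ordinary SQL.
spark.sql("""
    SELECT event_type, COUNT(*) AS n
    FROM events
    GROUP BY event_type
    ORDER BY n DESC
""").show()
```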

2016 was the best year SQL has ever had — 2017 will be even better.

Paul Pilotte, Technical Marketing Manager, MathWorks

In 2017, I see data science for the masses fueling growth in traditional businesses. The workflows for deep learning, prescriptive analytics, and big data will become much easier and more accessible, making it more cost-effective for businesses to train existing domain experts than to hire elusive data scientists. For example, transfer learning will mitigate the need for large training sets, and NVIDIA GPU instances on Amazon EC2 will make it easy for anyone to get started with deep learning in minutes.
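As a rough sketch of the transfer-learning point (ours, not MathWorks’; it assumes PyTorch and torchvision as the setup), a network pre-trained on ImageNet can be adapted to a new task by retraining only its final layer, which is why large domain-specific training sets are often unnecessary:

```python
# Hedged sketch (not from the original text): transfer learning by reusing
# pre-trained ImageNet features and retraining only the classifier head.
# num_classes is a placeholder for the new task's label count.
import torch.nn as nn
from torchvision import models

num_classes = 5                               # e.g. five defect categories
model = models.resnet18(pretrained=True)      # weights learned on ImageNet

for param in model.parameters():              # freeze the pre-trained feature layers
    param.requires_grad = False

# Replace the final layer; only these weights will be trained on the small dataset.
model.fc = nn.Linear(model.fc.in_features, num_classes)

# Train as usual, optimizing only model.fc.parameters().
```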

A focus for 2017 will also be on intelligently integrating analytics across heterogeneous systems, driven by the need for real-time decision making and sensor data from IoT systems. This means data processing and predictive algorithms will need to work smartly across IT systems, IoT aggregators, hybrid clouds, on-board sensors, and complex embedded systems.