Forrester vs Gartner on Data Science Platforms and Machine Learning Solutions
Who leads in Data Science, Machine Learning, and Predictive Analytics? We compare the latest Forrester and Gartner reports for this industry for 2017 Q1, identify gainers and losers, and strong leaders vs contenders.
Predictive Analytics and Machine Learning (PAML) are among the most important technologies now, as KDnuggets readers no doubt know, and Forrester forecasts a 15% compound annual growth rate (CAGR) for the PAML market through 2021.
The Forrester report evaluates 14 firms in terms of strategy, current offering, and market presence. The results are summarized in Fig. 1.
Fig. 1: Forrester Wave™: Predictive Analytics And Machine Learning Solutions, Q1 2017
(Source: Forrester Research, Inc. Unauthorized reproduction, citation, or distribution prohibited)
- SAS reimagines its data science portfolio, unifying its data science solutions under SAS Visual Suite. It brings together world-class data prep, visualization, data analysis, model building, and model deployment.
- IBM loves open source. SPSS is still the core of IBM's data science platform, but IBM also launched SystemML from its investments in its Spark Technology Center and introduced the Data Science Experience for data science coders.
- SAP draws a straight line from predictive models to business applications. SAP offers comprehensive data science tools to build models, but it is also the biggest enterprise application company on the planet.
- Angoss is ready to be your primary solution. Angoss KnowledgeSEEKER is a must-have for data science teams that wish to use beautiful and comprehensive visual tools to build decision and strategy trees.
- RapidMiner wraps breadth and depth in a beautiful package. RapidMiner invested heavily to revamp its visual interface, making it the most concise and fluid that we have seen in this evaluation.
- KNIME's vibrant open source community pays dividends in productivity. KNIME is not a big company, but it has a big community of contributors who continually push the platform forward with capabilities such as bioinformatics and image processing.
- FICO makes enterprise decisions smarter with models. FICO's extensive real-world experience has led to a solution that focuses on the needs of the chief data scientists as well as the rank-and-file data scientists in a large organization.
- H2O.ai puts algorithms first. H2O.ai is best known for developing open source, cluster-distributed machine learning algorithms as early as 2011, when big data demanded them but no one else had them. H2O.ai also offers Sparkling Water to create, manage, and run workflows on Apache Spark, and Steam to deploy models.
- Microsoft is much more than R for enterprises. Microsoft offers Microsoft R for data scientists who wish to code in the R programming language supported by callable cluster-distributed algorithms. It also offers Azure Machine Learning to data scientists who want a more traditional visual development tool.
- Alpine Data focuses on collaboration. Alpine Data's visual tool provides data engineers, data scientists, and business stakeholders with the capabilities they need to divide and conquer the work of building models.
- Dataiku gets code or click right. Dataiku (name inspired by Japanese haiku) offers a data science platform that lets coders use a notebook when they must, but use visual tools to build workflows when productivity is at a premium.
- Statistica finds a new home, again. Statistica was founded in 1984 as Statsoft and acquired by Dell in 2014. It is now part of the newly relaunched Quest Software.
- Domino Data Labs wants coders to collaborate across open source. Its solution aims to package the most popular open source coding tools and libraries and provide a unifying interface for teams of data science coders.
- Salford Systems touts accuracy and automation. It is loved by its customers, large and small, for its implementation of specific methods including CART, MARS, Random Forests, and TreeNet.
I tried an overlap-style comparison like the one used for the Gartner report, but the resulting image was too crowded to read. So here are the highlights:
- Remained Leaders: SAS, IBM, SAP
- From Strong Performers in 2015 to Leaders in 2017: RapidMiner, KNIME, Angoss, FICO
- Remained Strong Performers: Microsoft (gained on offering), Alpine Data, Statistica/Quest (lost on strategy)
- New additions in 2017: Domino Data Labs, Dataiku, H2O.ai, Salford Systems
- Dropped in 2017: Alteryx, Predixion Software, Oracle
Gartner and Forrester use different methodologies, but in both cases the farther a firm's circle is from the lower-left corner of the diagram, the better. We measured that distance for each firm, normalized it so that the largest distance is 95 and the smallest is 5, and plotted the results in Figure 2 below.
Firms absent from either the Gartner or the Forrester chart were assigned a distance of -1 for that chart.
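The normalization described above can be sketched as a simple min-max rescaling. This is only an illustration under stated assumptions: the article does not say how the distance was measured, so Euclidean distance from the origin is assumed here, and the function name `scaled_distances`, the firm names, and the coordinates are all made up for the example.

```python
import math

def scaled_distances(points, low=5.0, high=95.0, missing=-1.0):
    """Rescale each firm's distance from the lower-left corner onto [low, high].

    points maps a firm name to its (x, y) chart position, or to None when
    the firm does not appear on that chart. Euclidean distance is an
    assumption; the article only says "distance".
    """
    raw = {name: math.hypot(xy[0], xy[1])
           for name, xy in points.items() if xy is not None}
    if not raw:
        # No firm is on the chart, so every entry is marked as missing.
        return {name: missing for name in points}
    d_min, d_max = min(raw.values()), max(raw.values())
    span = d_max - d_min
    result = {}
    for name, xy in points.items():
        if xy is None:
            result[name] = missing            # firm absent from this chart
        elif span == 0:
            result[name] = (low + high) / 2   # all distances are equal
        else:
            result[name] = low + (high - low) * (raw[name] - d_min) / span
    return result

# Hypothetical coordinates, not read off either report.
example = {
    "Firm A": (3.0, 4.0),
    "Firm B": (6.0, 8.0),
    "Firm C": (0.9, 1.2),
    "Firm D": None,
}
scores = scaled_distances(example)
```

Running the same function once per chart would give each firm one Gartner score and one Forrester score, which is the kind of pair a two-axis plot needs.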
Fig. 2: Gartner vs Forrester evaluation of Data Science, Predictive Analytics, and Machine Learning Platforms, 2017 Q1
Circle size corresponds to estimated vendor size, color to the Forrester label, and shape (how filled the circle is) to the Gartner label.
Altogether there are 17 firms: 13 that appear in both, 3 only in Gartner (abbreviated as G below), and one only in Forrester (abbreviated as F below).
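As a quick check of these counts, the two vendor lists can be compared as sets. The names are spelled as they appear in this article, and the Gartner membership is inferred from the article's own groupings rather than copied from the Gartner report, so treat it as an assumption.

```python
# Vendors in the Forrester Wave, as listed in this article (14 firms).
forrester = {
    "SAS", "IBM", "SAP", "Angoss", "RapidMiner", "KNIME", "FICO",
    "H2O.ai", "Microsoft", "Alpine Data", "Dataiku", "Statistica/Quest",
    "Domino Data Labs", "Salford Systems",
}

# Vendors in the Gartner report, inferred from this article's groupings.
gartner = {
    "SAS", "IBM", "SAP", "Angoss", "RapidMiner", "KNIME", "FICO",
    "H2O.ai", "Microsoft", "Alpine Data", "Dataiku", "Statistica/Quest",
    "Domino Data Labs", "Teradata", "Alteryx", "MathWorks",
}

both = forrester & gartner          # firms ranked by both analysts
only_gartner = gartner - forrester  # firms ranked only by Gartner
only_forrester = forrester - gartner  # firms ranked only by Forrester
all_firms = forrester | gartner     # every firm in the comparison

print(len(all_firms), len(both), len(only_gartner), len(only_forrester))
```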
We see several clusters:
- Strong Leaders: SAS, IBM, RapidMiner, and KNIME are ranked as leaders by both G & F.
- Leaders: Angoss, SAP, and FICO are Leaders for F but only Niche Players/Visionaries for G.
- Strong Competitors: H2O.ai, Microsoft, and Statistica/Quest are Strong Performers for F and Visionaries/Challengers for G.
- Contenders: Alpine Data, Domino, and Dataiku are Strong Performers/Contenders for F and Visionaries for G.
- Players: Salford Systems, Teradata, Alteryx, and MathWorks each appear in only one of the two reports.