CRN 2015 Emerging Big Data Vendors

The 2015 Computer Reseller News (CRN) Emerging Big Data Vendors list features 54 companies launched since 2009, chosen for their innovative tools and technology and for their commitment to helping businesses manage the big data challenge.

The 54 emerging big data technology vendors that are making an impact on the tech space and challenging established big data players such as Microsoft, SAP and Hewlett-Packard are listed below.

Here is a word cloud representing the most common and significant words in the companies' descriptions.
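
A word cloud of this kind is driven by word frequencies. The following is a minimal sketch of that counting step, not the method used to produce the original image: the three sample descriptions are copied from the list below, and the stop-word list and the `word_frequencies` helper are illustrative assumptions.

```python
import re
from collections import Counter

# Sample input: three descriptions taken from the vendor list in this post.
descriptions = [
    "Aerospike develops an open-source NoSQL database for running high-performance applications.",
    "Altiscale is one of several competing startups that provide Hadoop-as-a-Service.",
    "Qubole is one of several startups that offer a big data Hadoop-as-a-Service platform.",
]

# Small illustrative stop-word list (an assumption, not the one used for the original cloud).
STOP_WORDS = {"a", "an", "and", "for", "is", "of", "one", "that", "the"}


def word_frequencies(texts, stop_words=STOP_WORDS):
    """Count lowercase words across all texts, skipping stop words.

    Hyphenated terms such as "hadoop-as-a-service" are kept as one word.
    """
    counts = Counter()
    for text in texts:
        words = re.findall(r"[a-z][a-z0-9-]*", text.lower())
        counts.update(w for w in words if w not in stop_words)
    return counts


frequencies = word_frequencies(descriptions)
# The most frequent words are the ones a word cloud would draw largest.
print(frequencies.most_common(5))
```

Running the same function over all 54 descriptions would give the weights that a word cloud renderer scales into font sizes.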


A majority of these companies use open-source Apache software such as Apache Spark, Hadoop, Hive, Cassandra and Kafka. An emerging trend is that several of these companies are competing to provide Hadoop-as-a-Service big data analytics platforms.

It is also interesting to note that about 11 companies in the list are moving towards tools that support both historical and real-time data analytics, i.e., real-time stream processing for analyzing and searching continuous streams of massive data volumes.

While most of these companies are focused on open-source, NoSQL database technology, they also carry a specialised focus in areas like data accessibility technology (Alation, AtScale, ClearStory), cost-effective technology for building big data systems (BlueData, Treasure Data), business analytics for HR management (Visier), customer retention (Gainsight, Interana), text analytics of customer fanbase (Luminoso), business analytics for sales and marketing for SMBs (InsightSquared), storing and managing huge data volumes with search and discovery capabilities (DataStax, DataGravity), big data analytics for security technology (Sqrrl, Platfora) and single-dashboard access to meaningful results (Zoomdata, Domo, Metric Insights). Let us explore more!

In the list below, companies that were not in the CRN Big Data Management Companies 2014 list are indicated with "new".

  1. Actifio markets a copy data management platform that eliminates the problem of "data sprawl" across a company by creating a single copy of an organization's production data and making it virtually available for backup, disaster recovery, software development and testing, business analytics and archiving purposes. Waltham, MA. Founded 2009.
  2. Aerospike develops an open-source NoSQL database for running high-performance applications. Mountain View, CA. Founded 2009.
  3. new Alation came out of stealth mode in March, debuting data-accessibility technology that's designed to help people more easily find, understand, use and govern their data to make faster and better decisions. Redwood City, CA. Founded 2012.
  4. Alpine Data Labs offers an advanced, Hadoop-based data analytics platform. The company launched Alpine Chorus 5.0, a new release of its flagship product, with functionality for overseeing and managing analytics ecosystems and a framework that analysts and data scientists use for integrating technologies such as R and Spark. San Francisco, CA.
  5. Alteryx combines structured and unstructured data from multiple sources into one database, uses the data to conduct predictive, spatial and statistical analysis tasks, and then shares the results. Irvine, CA. Founded 2010.
  6. new Altiscale is one of several competing startups that provide Hadoop-as-a-Service. The company's Altiscale Data Cloud is an on-demand, pay-as-you-go service based on the Hadoop big data platform. Palo Alto, CA. Founded 2012.
  7. new AtScale develops the AtScale Intelligence Platform software that allows commonly used business intelligence tools to access data stored in Hadoop clusters. San Mateo, CA. Founded 2013.
  8. new BlueData Software emerged from stealth mode, debuting its BlueData EPIC software platform that uses virtualization technology to make it easier, faster and more cost-effective for businesses to leverage big data by deploying Hadoop-as-a-Service in an on-premises model. Mountain View, CA. Founded 2012.
  9. new Cask is an open-source software company that provides development tools for Hadoop applications and data. The Cask Data Application Platform is used to build, deploy and manage big data applications. Palo Alto, CA. Founded 2011.
  10. Citus Data developed CitusDB, a massively parallel columnar database built on PostgreSQL that the company said can process petabytes of data in seconds. The company targets both transactional and analytical processing tasks with the software. San Francisco, CA. Founded 2010.
  11. ClearStory Data's software is designed to make it easier to access internal and external data sources, including corporate databases, Hadoop and the Internet, and use that data to uncover trends and patterns. The company has more tightly integrated its software with the Apache Spark in-memory analytics engine. Menlo Park, CA. Founded 2011.
  12. new Confluent is developing a commercial streaming data platform based on Apache Kafka, the Apache Software Foundation's open-source message broker software. Mountain View, CA. Founded 2014.
  13. Continuum Analytics develops data analytics software based on the Python programming language. The company recently announced that its Anaconda Server would move to the Hadoop and Spark infrastructures. Austin, TX. Founded 2011.
  14. Couchbase competes in the crowded "alternative database" arena against the likes of MongoDB and Cassandra with its Couchbase Server and Couchbase Mobile products, based on open-source, distributed, document-oriented NoSQL database technology that supports massive data volumes in real time. Mountain View, CA. Founded 2011.
  15. new Databricks develops commercial software services around Apache Spark, the open-source, super-fast big data processing engine that turbo-charges Hadoop -- and that some industry watchers said could even replace that big data platform. Its offerings include the Databricks Cloud end-to-end hosted data platform. San Francisco, CA. Founded 2013.
  16. DataGravity debuted, after two years of development, its DataGravity Discover Series of "data-aware" storage appliances that not only help businesses manage their data but also provide search-and-discovery capabilities to help them understand how the data is being used. Nashua, NH. Founded 2012.
  17. Datameer develops software that helps business users of Hadoop integrate, analyze and visualize large volumes of data. It launched Datameer Professional, a Hadoop-as-a-Service big data analytics platform specifically designed for departmental deployments. San Mateo, CA. Founded 2009.
  18. DataRPM's Smart Machine Insights platform uses machine learning to automatically perform advanced statistical analysis on Hadoop -- the only practical way to approach analytics across large data sets, according to the company. Fairfax, VA. Founded 2012.
  19. DataStax developed a massively scalable data platform based on Apache Cassandra, the open-source distributed database for storing and managing huge amounts of data across multiple data centers and the cloud. Santa Clara, CA. Founded 2010.
  20. new DataTorrent develops its DataTorrent RTS realtime stream processing system, based on Hadoop 2.0, that businesses use to process, monitor, analyze and act on big data instantly. Santa Clara, CA. Founded 2012.
  21. Domo debuted its cloud-based executive management system that provides business managers with access to information scattered across many disparate sources through a single dashboard. American Fork, UT. Founded 2010.
  22. Gainsight develops cloud-based predictive analytics software that's integrated with CRM applications to help users scrutinize customer data for customer-retention purposes and identify cross-sell and upsell opportunities. Redwood City, CA. Founded 2009.
  23. Glassbeam develops Software-as-a-Service applications for machine log data analytics, putting it in a key position in business intelligence in the nascent-but-growing "Internet of Things" market. Santa Clara, CA. Founded 2009.
  24. new H2O develops an open-source machine learning platform for predictive analytics on big data. Mountain View, CA. Founded 2011.
  25. Hortonworks offers the Hortonworks Data Platform, a distribution of Apache Hadoop combined with tools for data management, integration, security, provisioning and other software for enterprise data processing. Palo Alto, CA. Founded 2011.
  26. new InsightSquared develops business analytics software for sales and marketing professionals with an emphasis on simplicity and ease-of-use. While the company's software is used by large businesses, the vendor also is winning over small and midsize businesses that often lack the resources to adopt business analytics technology. Cambridge, MA. Founded 2010.
  27. new Interana develops events-based analytical software that works with clickstream data and other information to help users answer questions about how customers behave and how products are used. The goal is to provide actionable business intelligence for nontechnical users. Menlo Park, CA. Founded 2013.
  28. JethroData developed an index-based SQL engine for Hadoop, technology that the company said makes interactive business intelligence with Hadoop possible. Natanya, Israel. Founded 2012.
  29. new Looker Data Sciences develops a web-based business intelligence platform for data-driven companies. The software uses the company's own LookML data description language that businesses use to build customer data applications that work with Amazon Redshift, Teradata Aster, HP Vertica, Greenplum, Google BigQuery and other big data systems. Santa Cruz, CA. Founded 2011.
  30. new Luminoso Technologies is a pioneer in the area of text analytics. The company's software is designed to help businesses understand what customers feel about their company or product by analyzing, say, tweets or Facebook postings. Cambridge, MA. Founded 2010.
  31. MapR Technologies competes with Cloudera, Hortonworks and other vendors in the Hadoop arena, building on its distribution of Hadoop and other open-source Apache software to create a complete big data platform for both operational and analytical purposes. San Jose, CA. Founded 2009.
  32. MemSQL develops an in-memory database that enables businesses to process transactions and perform business analytics simultaneously, using both realtime and historical data, in a single database. San Francisco, CA. Founded 2011.
  33. Metric Insights develops its "push intelligence" technology as an antidote to business intelligence reports and dashboards that make users hunt for information, according to the company. The Metric Insights software uses a patented "KPI warehouse" to deliver personalized business intelligence, key performance indicators and alerts. San Francisco, CA. Founded 2010.
  34. Numerify is built on ServiceNow's IT management software and collects and analyzes operational and financial data about an organization's IT systems that managers use to monitor system performance and make decisions about IT assets and capacity. Cupertino, CA. Founded 2012.
  35. ParStream develops a distributed, massively parallel processing columnar database that's designed to analyze and filter billions of records in sub-second time. Cupertino, CA. Founded 2008.
  36. Paxata develops "self-service adaptive data preparation" software that simplifies the often tedious work of transforming raw data so that it can be analyzed with business analytics tools. Redwood City, CA. Founded 2012.
  37. new Pepperdata has developed a realtime cluster optimizer for Hadoop that monitors and controls all hardware usage (CPU, disk I/O, memory and networks). That helps IT departments better manage jobs running on Hadoop and get the most out of their Hadoop deployments. Sunnyvale, CA. Founded 2012.
  38. Pivotal is the big data joint venture between storage giant EMC and VMware. Pivotal's mission is to create software applications that leverage "big and fast data" on a single, cloud-independent platform. Palo Alto, CA. Founded 2013.
  39. Platfora offers a big data analytics toolset that's native to the Apache Hadoop platform, allowing users to directly analyze data in Hadoop without the need to build a separate data warehouse system. The software is offered for on-premise deployments or as a cloud service. San Mateo, CA. Founded 2011.
  40. new Predixion Software offers a cloud-based, self-service predictive analytics platform called Predixion Insight that's designed for business analysts and other nontechnical users. Aliso Viejo, CA. Founded 2009.
  41. Qubole is one of several startups that offer a big data Hadoop-as-a-Service platform. The Qubole Data Service runs on Amazon AWS, the Google Compute Engine and Microsoft Azure. Mountain View, CA. Founded 2012.
  42. Rubikloud Technologies targets the retail industry with its cloud-based, realtime data analytics platform for processing, analyzing and searching continuous streams of data from multiple sources. The goal is marketing optimization, customer optimization, product optimization and pricing optimization. Toronto, Canada. Founded 2013.
  43. SiSense offers business intelligence and dashboard applications that business users employ for analyzing and visualizing data collected from multiple sources. The company boasts that everyday business workers can use its products without the need for coding or help from the IT department. Tel Aviv, Israel. Founded 2010.
  44. new Snowflake Computing is positioning itself as a more flexible, easier-to-manage alternative to traditional on-premise data warehouse systems. It's also competing with other cloud data warehouse offerings such as Amazon Web Services' Redshift and Google's BigQuery. San Mateo, CA. Founded 2012.
  45. Splice Machine developed a full-featured, transactional SQL database on Hadoop that can run operational applications and realtime analytics using Hadoop data. After months of development and beta testing, the company began shipping Release 1.0 of its software in November. San Francisco, CA. Founded 2012.
  46. Sqrrl's founders came from the super-secret National Security Agency and helped develop that organization's massive database. The Sqrrl Enterprise database offers column, graph and document store capabilities to power big data applications. The product's real forte is its ability to scale up and provide data security at the cell level. Cambridge, MA. Founded 2012.
  47. Sumo Logic is directly challenging Splunk, calling itself "the next-generation machine data analytics company." Sumo Logic's software analyzes IT performance data in realtime, providing actionable insights for IT operations, application management, and security and compliance managers. Mountain View, CA. Founded 2010.
  48. new Tamr has the stated goal of preventing "schema proliferation." Tamr develops enterprise data unification software that businesses use to integrate diverse, siloed data for business analytics tasks. Cambridge, MA. Founded 2013.
  49. new ThoughtSpot, under the mantra "search-based analytics for everyone," wants to eliminate the need for complex BI tools. The company's ThoughtSpot Relational Search Appliance combines data from on-premise, cloud and desktop sources and provides users with the ability to access that data with a simple search interface. Palo Alto, CA. Founded 2012.
  50. Treasure Data offers a cloud-based data warehouse (data analytics Platform-as-a-Service) that operates on a subscription model. The idea is to provide sophisticated data warehouse capabilities to businesses without the huge costs and development times associated with on-premise systems. Mountain View, CA. Founded 2011.
  51. new Trifacta develops technology that's used to transform raw, complex data into clean and structured formats for analysis. Trifacta calls it "data wrangling." San Francisco, CA. Founded 2012.
  52. new Visier brings business analytics to the realm of human resources management. The company's cloud-based applications pull together data about a company's workforce -- everything from salaries to skillsets -- and help managers with their workforce planning chores. It can help with succession planning and even identify critical employees who may be in danger of leaving. Vancouver, BC. Founded 2010.
  53. Xplenty's cloud-based, Hadoop-as-a-Service platform integrates and transforms structured, semistructured and unstructured data into analyzable data. Tel Aviv, Israel. Founded 2011.
  54. Zoomdata develops software that allows users to connect, visualize and interact with data -- both realtime and historical data -- through browsers and mobile devices. Companies use Zoomdata's software to create dashboards and connect them to disparate data sources. Reston, VA. Founded 2014.

Original post: CRN 2015 Emerging Big Data Vendors