CRN 50 Emerging Big Data Vendors

We examine CRN's top 50 Emerging Big Data Vendors, 65% of which are located in California, chiefly Silicon Valley. The prototypical company is located in San Francisco and develops software for a Hadoop analytics platform. Competition will be tough!

By Grant Marshall, June 2014.

The CRN 50 Emerging Big Data Vendors list brings together up-and-coming companies from CRN's infrastructure, business analytics, and data management lists.

These companies were chosen because they show great promise for their relatively young age: all but one of the companies listed below were founded in 2008 or later, and their average age is under 4 years.
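As a rough check on the average-age figure, the sketch below tallies the founding years given in the list further down and measures each company's age against 2014, the year of this post. The name `founded_counts` and the choice of reference year are assumptions made for illustration, not part of CRN's methodology.

```python
# Number of companies per founding year, tallied from the list in this post.
founded_counts = {2000: 1, 2008: 4, 2009: 8, 2010: 13, 2011: 13, 2012: 9, 2013: 4}

REFERENCE_YEAR = 2014  # assumption: ages are measured as of the post's date

# Total number of companies and their count-weighted average age in years.
total = sum(founded_counts.values())
average_age = sum(
    (REFERENCE_YEAR - year) * n for year, n in founded_counts.items()
) / total

print(total, round(average_age, 2))
```

Under these assumptions the average comes out a little below 4 years, in line with the figure quoted above.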

These companies come from a variety of backgrounds; some handle the infrastructure side of big data (e.g. Pivotal and Xplenty), while others focus more on business analytics (e.g. Alpine Data Labs and Numerify).

California dominates this list with 34 companies, nearly all of them in Silicon Valley and the wider San Francisco Bay Area; the top locations are San Francisco (12 companies), Mountain View (7), and Palo Alto (5). Outside of California, the top locations are Massachusetts (4 companies) and Israel (3 companies).
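The location tallies above can be reproduced from the city and region at the end of each list entry. The sketch below is one way to do it; the per-city counts were tallied by hand from the list in this post, and the variable names are illustrative.

```python
from collections import Counter

# Companies per (city, region), tallied from the list in this post.
city_counts = {
    ("San Francisco", "CA"): 12, ("Mountain View", "CA"): 7,
    ("Palo Alto", "CA"): 5, ("San Mateo", "CA"): 2, ("Santa Clara", "CA"): 2,
    ("Irvine", "CA"): 1, ("Aliso Viejo", "CA"): 1, ("Sunnyvale", "CA"): 1,
    ("San Jose", "CA"): 1, ("Cupertino", "CA"): 1, ("Redwood City", "CA"): 1,
    ("Waltham", "MA"): 2, ("Cambridge", "MA"): 2,
    ("Tel Aviv", "Israel"): 2, ("Natanya", "Israel"): 1,
    ("Austin", "TX"): 2, ("Redmond", "WA"): 1, ("Seattle", "WA"): 1,
    ("Fairfax", "VA"): 1, ("Reston", "VA"): 1, ("Nashua", "NH"): 1,
    ("American Fork", "UT"): 1, ("Chicago", "IL"): 1,
    ("Cologne", "Germany"): 1, ("Toronto", "Canada"): 1,
}

# Roll the city counts up to the region (state or country) level.
region_counts = Counter()
for (city, region), n in city_counts.items():
    region_counts[region] += n

total = sum(region_counts.values())
ca_share = region_counts["CA"] / total

print(region_counts.most_common(3))
print(f"California share: {ca_share:.0%}")
```

The three largest regions come out as CA (34), MA (4), and Israel (3), and California's share of the 52 companies rounds to 65%.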

Below is the word cloud built from the companies' descriptions; the most common words are data, CA, Hadoop, software, develops, analytics, and platform. Thus the prototypical emerging Big Data company is located in San Francisco, CA and develops software for a Hadoop analytics platform. Competition will be tough.
CRN 50+ Emerging Big Data Vendors, Word Cloud
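The word counts behind a cloud like this one can be approximated with a simple frequency tally. The following is a minimal sketch, not the method actually used for the image: it runs on three descriptions copied (and trimmed) from the list below, and the stop-word list and tokenization rule are assumptions.

```python
import re
from collections import Counter

# Three descriptions copied from the list in this post; a full run would use all 52.
descriptions = [
    "Alpine Data Labs offers an advanced, Hadoop-based data analytics platform. "
    "San Francisco, CA.",
    "Qubole develops a Hadoop-based big data platform, the Qubole Data Service, "
    "which runs in the cloud. Mountain View, CA.",
    "WibiData develops software that helps businesses develop predictive "
    "customer-facing applications on the Hadoop platform. San Francisco, CA.",
]

# Assumption: a tiny hand-picked stop-word list; the original settings are unknown.
STOP_WORDS = {"a", "an", "the", "in", "on", "that", "which", "and", "for", "of"}


def word_counts(texts):
    """Count words across texts, lowercasing everything except acronyms."""
    counts = Counter()
    for text in texts:
        for word in re.findall(r"[A-Za-z]+", text):
            key = word if word.isupper() else word.lower()  # keep e.g. "CA" as-is
            if key not in STOP_WORDS:
                counts[key] += 1
    return counts


counts = word_counts(descriptions)
print(counts.most_common(5))
```

Even on this small sample, "data" is the most frequent word, with "hadoop", "platform", and "CA" close behind, which matches the pattern in the full word cloud.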

Here are the CRN 50 (actually 52) Emerging Big Data Vendors:

  • Actifio has developed a copy data management platform that eliminates the problem of "data sprawl" across a company by creating a single copy of an organization's production data and making it virtually available for backup, disaster recovery, software development and testing, business analytics and archiving purposes. Waltham, MA. Founded 2009.
  • Aerospike develops a real-time, flash-optimized NoSQL database for running high-performance applications. Mountain View, CA. Founded 2009.
  • Alpine Data Labs offers an advanced, Hadoop-based data analytics platform. San Francisco, CA. Founded 2010.
  • Alteryx's software is used to blend structured and unstructured data from a range of sources into one database, conduct predictive, spatial and statistical analysis tasks, and then share the results. Irvine, CA. Founded 2010.
  • Appuri operates a cloud-based customer data system that captures customers' touchpoint data from internal and external sources and creates a petabyte-scale data warehouse within a dedicated virtual private cloud. Redmond, WA. Founded 2012.
  • Ayasdi's Insight Discovery Platform, which utilizes "topological data analysis" technology combined with machine learning techniques, provides insights derived from data that help organizations solve complex problems without writing code or queries. Palo Alto, CA. Founded 2008.
  • Chartio develops cloud-based data visualization software that businesses use to combine data sets and create charts and dashboards for analysis -- all without the need to develop an on-premise data warehouse. San Francisco, CA. Founded 2010.
  • Cirro develops a next-generation data federation platform that makes it possible for nontechnical users to query and explore structured and unstructured data from multiple sources and perform complex analytical tasks. Aliso Viejo, CA. Founded 2010.
  • Citus Data developed CitusDB, a distributed analytics database that can run SQL queries and, according to the company, process petabytes of data in seconds. San Francisco, CA. Founded 2010.
  • ClearStory Data's Data Intelligence software is designed to make it easier to access internal and external data sources, including corporate databases, Hadoop and the Internet, and use that data to uncover trends and patterns. Palo Alto, CA. Founded 2011.
  • Cloudera Enterprise is the vendor's distribution of the Hadoop platform, coupled with system management (Cloudera Manager) and data management (Cloudera Navigator) tools. Palo Alto, CA. Founded 2008.
  • Concurrent offers application middleware technology that businesses use to develop, deploy, run and manage big data applications. San Francisco, CA. Founded 2008.
  • Continuuity. One problem with Hadoop is the shortage of skilled developers capable of building applications that leverage its capabilities. Palo Alto, CA. Founded 2011.
  • Continuum Analytics released Anaconda 1.9, the latest version of its collection of libraries for big data management, analysis and cross-platform visualization for business intelligence, scientific, engineering and machine learning tasks. Austin, TX. Founded 2011.
  • Couchbase develops and supports Couchbase Server, a commercial version of Apache CouchDB, the open-source, document-oriented NoSQL database. Mountain View, CA. Founded 2011.
  • DataGravity's website says the company's mission is "turning data into information" and "make storage an active asset for SMBs." Nashua, NH. Founded 2012.
  • DataHero. Under the motto of "Analytics Simplified," this company develops software that analyzes data and automatically creates visualizations -- charts and graphs -- from the information without the need for the user to tackle complex coding. San Francisco, CA. Founded 2011.
  • Datameer. Founded by some of the original contributors to Apache Hadoop, Datameer develops software that helps business users of Hadoop integrate, analyze and visualize large volumes of data. San Mateo, CA. Founded 2009.
  • DataRPM developed "cognitive data discovery" technology that lets users analyze and visualize data residing in corporate databases, Hadoop or other sources using a natural language query and search interface. Fairfax, VA. Founded 2012.
  • DataSift develops a social data platform that businesses use to monitor social media such as Twitter, aggregate and filter data from public social conversations, and extract insights from that data. San Francisco, CA. Founded 2010.
  • DataStax developed a massively scalable data platform based on Apache Cassandra, the open-source distributed database for storing and managing huge amounts of data across multiple data centers and the cloud. Santa Clara, CA. Founded 2010.
  • Domo offers a cloud-based executive management platform the company said gives users access to information scattered across myriad sources through a single dashboard. American Fork, UT. Founded 2011.
  • Gainsight develops predictive analytics software that's integrated with CRM applications and helps users scrutinize customer data for customer retention and identify cross-sell and upsell opportunities. Mountain View, CA. Founded 2009.
  • Gazzang's technology provides data security in big data and cloud computing environments, securing personally identifiable information, preventing unauthorized access to sensitive data and systems, and helping organizations comply with data security regulations. Austin, TX. Founded 2010.
  • Glassbeam develops Software-as-a-Service applications for product analytics based on machine log data, putting it in a key position in business intelligence in the nascent-but-growing Internet of Things market. Sunnyvale, CA. Founded 2009.
  • Hortonworks, launched in 2011, offers the Hortonworks Data Platform, a system based on Apache Hadoop combined with tools for data management, integration, security, provisioning and other software for enterprise data processing. Palo Alto, CA. Founded 2011.
  • JethroData develops an index-based SQL engine for Hadoop that it says combines the scalability of HDFS (the Hadoop file system) with the power of a fully indexed columnar analytical database. Natanya, Israel. Founded 2012.
  • Jut is a company that remains in stealth mode as it develops software for capturing and analyzing big data. San Francisco, CA. Founded 2013.
  • MapR Technologies competes with Cloudera, Hortonworks and other vendors in the Hadoop arena, building on its distribution of Hadoop and other open-source Apache software to create a complete big data platform for both operational and analytical purposes. San Jose, CA. Founded 2009.
  • MemSQL calls itself "the leader in real-time and historical big data analytics based on a distributed in-memory database." San Francisco, CA. Founded 2011.
  • Metric Insights pitches its "push intelligence" technology as an antidote to business intelligence reports and dashboards that, the company said, make users hunt for information. San Francisco, CA. Founded 2010.
  • Numerify is built on ServiceNow's IT management software and collects and analyzes operational and financial data about an organization's IT systems that managers use to monitor system performance and make decisions about IT assets and capacity. Cupertino, CA. Founded 2012.
  • Paradigm4 is another of the current crop of startups that's finding ways to apply leading-edge technology to the problem of analyzing massive volumes of data for complex problems in financial services, life sciences and other data-intensive industries. Waltham, MA. Founded 2010.
  • ParStream develops a distributed, massively parallel processing columnar database that's designed to analyze and filter billions of records in sub-second time. Cologne, Germany. Founded 2008.
  • Paxata is in the business of "adaptive data preparation," offering technology that simplifies the often-tedious work of transforming raw data into data that can be analyzed with business analytics tools such as QlikTech and Tableau (both Paxata partners). Redwood City, CA. Founded 2012.
  • Pivotal. The goal, according to the company, is creation of software applications that leverage "big and fast data" on a single, cloud-independent platform. San Francisco, CA. Founded 2013.
  • Platfora offers a big data analytics toolset that's native to the Hadoop platform, allowing users to directly analyze data in Hadoop without the need to build a separate data warehouse system. San Mateo, CA. Founded 2011.
  • Qubole develops a Hadoop-based big data platform, the Qubole Data Service, which runs in the cloud. Mountain View, CA. Founded 2011.
  • Rubikloud Technologies has developed a cloud-based, real-time data analytics platform for processing, analyzing and searching continuous streams of data. Toronto, Canada. Founded 2013.
  • ScaleARC develops database infrastructure software the company says simplifies the way database systems are deployed and managed. Santa Clara, CA. Founded 2009.
  • Seeq is developing software and services that help businesses derive insights from industrial process data, such as information collected from sensors and instrument systems, to aid with operational continuous improvement. Seattle, WA. Founded 2013.
  • SiSense offers business intelligence and dashboard applications for analyzing and visualizing data collected from multiple sources. Tel Aviv, Israel. Founded 2010.
  • Splice Machine has been developing a full-featured, transactional SQL database on Hadoop that can run operational applications and real-time analytics using Hadoop data. San Francisco, CA. Founded 2012.
  • Sqrrl was started quietly, but got a lot of attention in the past year given that its founders came from the National Security Agency and helped develop that organization's massive database. Cambridge, MA. Founded 2012.
  • Sumo Logic brings big data analytics to IT management, calling itself "the next-generation machine data analytics company." Mountain View, CA. Founded 2010.
  • TempoDB offers a database service specifically designed for time-series data, a type of data that many databases have trouble handling. Chicago, IL. Founded 2011.
  • Treasure Data offers a cloud-based data warehouse (data analytics Platform-as-a-Service) that operates on a subscription model. Mountain View, CA. Founded 2011.
  • Via Science "applies big math" to solve complex analytics problems. Cambridge, MA. Founded 2000.
  • WibiData develops software that helps businesses develop predictive customer-facing applications on the Hadoop platform. San Francisco, CA. Founded 2010.
  • Xplenty offers a cloud-based data integration service running on Hadoop, providing an alternative to using on-premise data ETL (extract, transform, load) tools to integrate structured and unstructured data. Tel Aviv, Israel. Founded 2011.
  • Zettaset is the creator of Orchestrator, software that businesses use to manage and secure their Hadoop big data clusters. Mountain View, CA. Founded 2009.
  • Zoomdata develops software that allows users to connect, visualize and interact with data through browsers and mobile devices. Reston, VA. Founded 2012.

Original Post: Big Data 100: The Emerging Big Data Vendors