CRN 2015 Big Data Management Companies
Big data and its ease of use play a key role in this year’s ‘CRN Big Data 100: Top 30 Data Management companies’. New additions include AtScale, Databricks, and Tamr. A majority of these companies develop open-source NoSQL database technology.
The top 30 CRN Big Data Management companies are those that have demonstrated an ability to innovate in bringing to market products and services that help businesses work with big data. These companies have been striving to help organizations manage their data through next-generation database systems and advanced data storage and integration solutions, and to leverage that data for the development of applications.
Here is the word cloud with the most common words from the companies' descriptions:
Some of the "data buzzwords" you will hear surrounding these companies are data quality management, data integration, data accessibility and connectivity, data capture, data replication, data sprawl, data wrangling, data unification and master data management.
Silicon Valley (CA) dominates this list with 21 companies, the top locations being San Francisco (6 companies) and Redwood City (5). Outside of California, the top locations are Massachusetts (4), New York (2), Israel (2), and Washington (1).
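The location counts above can be reproduced from the company list in this post with a simple tally. The following is a minimal sketch in Python, assuming the (city, region) pairs are transcribed by hand from the entries below; the names `LOCATIONS` and `tally` are illustrative and not part of the CRN data.

```python
from collections import Counter

# (city, region) pairs transcribed by hand from the company list in this post,
# in the same alphabetical order as the list (1010data first, Xplenty last).
LOCATIONS = [
    ("New York", "NY"), ("Redwood City", "CA"), ("Waltham", "MA"),
    ("Mountain View", "CA"), ("Redwood City", "CA"), ("San Mateo", "CA"),
    ("Burlington", "MA"), ("Bellevue", "WA"), ("San Francisco", "CA"),
    ("Menlo Park", "CA"), ("Mountain View", "CA"), ("San Francisco", "CA"),
    ("Santa Clara", "CA"), ("Santa Clara", "CA"), ("Bedford", "MA"),
    ("Palo Alto", "CA"), ("Redwood City", "CA"), ("Netanya", "Israel"),
    ("San Carlos", "CA"), ("San Francisco", "CA"), ("New York", "NY"),
    ("San Mateo", "CA"), ("Redwood City", "CA"), ("San Francisco", "CA"),
    ("San Mateo", "CA"), ("San Francisco", "CA"), ("Redwood City", "CA"),
    ("Cambridge", "MA"), ("San Francisco", "CA"), ("Tel Aviv", "Israel"),
]


def tally(locations):
    """Return (companies per region, companies per city) as Counters."""
    by_region = Counter(region for _, region in locations)
    by_city = Counter(locations)
    return by_region, by_city


by_region, by_city = tally(LOCATIONS)

# Regions sorted from most to fewest companies.
for region, count in by_region.most_common():
    print(f"{region}: {count}")

# The two cities with the most companies.
for (city, region), count in by_city.most_common(2):
    print(f"{city}, {region}: {count}")
```

Run against the 30 entries in this post, the tally gives 21 companies in California, with San Francisco (6) and Redwood City (5) as the top cities.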
Based on this list, developing high-quality data integration tools appears to be the focus for many of these companies, with the aim of enabling businesses to process transactions, analyze and understand data easily, and perform business analytics simultaneously in real time.
What's interesting to note is that as many as 24 of the companies on the list, i.e., 80%, are relatively young and were founded in this decade! This suggests that data management technology is a fast-growing field. The oldest company on this list, Actian, was founded in 1980. The other five companies, in order of their age, are Attunity (17), Recommind (15), 1010data (15), Informatica (12) and EnterpriseDB (11).
In the list below, companies which were not in CRN Big Data Management Companies 2014 list are indicated with
- 1010data offers its Big Data Discovery platform for data discovery and sharing applications, especially when working with very large datasets. The company has been particularly successful in winning customers for its software and services within the retail, financial services and gaming industries. New York, NY. Founded 2000.
- Actian positions itself as a leading supplier of SQL analytics for Hadoop with its Actian Analytics Platform. Actian also offers a number of operational database and data integration software products. Redwood City, CA. Founded 1980.
- Actifio markets a copy data management platform that eliminates the problem of "data sprawl" across a company by creating a single copy of an organization's production data and making it virtually available for backup, disaster recovery, software development and testing, business analytics and archiving purposes. Waltham, MA. Founded 2009.
- Aerospike develops an open-source NoSQL database for running high-performance applications. Mountain View, CA. Founded 2009.
- Alation, a startup that came out of stealth mode in March, debuted its data-accessibility technology, which is designed to help people more easily find, understand, use and govern their data for making faster and better decisions. Redwood City, CA. Founded 2012.
- AtScale develops the AtScale Intelligence Platform software that allows commonly used business intelligence tools to access data stored in Hadoop clusters. San Mateo, CA. Founded 2013.
- Attunity is in the information availability business, providing tools for data replication, change data capture, data connectivity, enterprise file replication, managed file transfer and cloud data delivery. Burlington, MA. Founded 1998.
- Basho Technologies develops Riak, a distributed, NoSQL database that's designed for tasks that require extremely high availability, and Riak CS, a cloud-based, object-storage database that runs on Riak. Bellevue, WA. Founded 2008.
- Citus Data developed CitusDB, a massively parallel columnar database built on PostgreSQL that the company said can process petabytes of data in seconds. The company targets both transactional and analytical processing tasks with the software. San Francisco, CA. Founded 2010.
- ClearStory Data's software is designed to make it easier to access internal and external data sources, including corporate databases, Hadoop and the Internet, and use that data to uncover trends and patterns. The company has more tightly integrated its software with the Apache Spark in-memory analytics engine. Menlo Park, CA. Founded 2011.
- Couchbase competes in the crowded "alternative database" arena against the likes of MongoDB and Cassandra with its Couchbase Server and Couchbase Mobile products, based on open-source, distributed, document-oriented NoSQL database technology that supports massive data volumes in real time. Mountain View, CA. Founded 2011.
- Databricks develops commercial software services around Apache Spark, the open-source, super-fast big data processing engine that turbo-charges Hadoop and that some industry watchers said could even replace the big data platform. Its offerings also include the Databricks Cloud end-to-end hosted data platform. San Francisco, CA. Founded 2013.
- DataStax developed a massively scalable data platform based on Apache Cassandra, the open-source distributed database for storing and managing huge amounts of data across multiple data centers and the cloud. Santa Clara, CA. Founded 2010.
- DataTorrent develops its DataTorrent RTS realtime stream processing system, based on Hadoop 2.0, that businesses use to process, monitor, analyze and act on big data instantly. Santa Clara, CA. Founded 2012.
- EnterpriseDB provides software and services around the popular PostgreSQL open-source relational database. The company markets the Postgres Plus Advanced Server that's compatible with the Oracle Database, as well as that vendor's database management and replication tools and other products. Bedford, MA. Founded 2004.
- Hazelcast develops in-memory data grid software that evenly distributes data across multiple nodes in a cluster, allowing for better horizontal scaling in both data storage and data processing. The software is offered under an Apache open-source license with Hazelcast developing commercial software and services around the core technology. Palo Alto, CA. Founded 2008.
- Informatica is, perhaps, the preeminent data integration technology company with its data ETL (extract, transform and load) tools, data quality management software and master data management products. It has continued to expand its technology lineup, including providing its data integration tools through the cloud as an integration Platform-as-a-Service offering. Redwood City, CA. Founded 1993.
- JethroData developed an index-based SQL engine for Hadoop, technology that the company said makes interactive business intelligence with Hadoop possible. Netanya, Israel. Founded 2012.
- MarkLogic has been addressing the big data problem with its NoSQL database since before the term "big data" was invented. San Carlos, CA. Founded 2001.
- MemSQL develops an in-memory database that enables businesses to process transactions and perform business analytics simultaneously, using both realtime and historical data, in a single database. San Francisco, CA. Founded 2011.
- MongoDB develops the open-source NoSQL database of the same name (which comes from "humongous"). It is among the few that have risen above the noise. New York, NY. Founded 2007.
- Neo Technology is the company behind the Neo4j graph database. Graph databases, a type of NoSQL database, use graph structures rather than indexes to represent and store data, a design that boosters tout as massively scalable and more efficient for managing and querying highly connected data. San Mateo, CA. Founded 2007.
- Paxata develops "self-service adaptive data preparation" software that simplifies the often tedious work of transforming raw data so that it can be analyzed with business analytics tools. Redwood City, CA. Founded 2012.
- Recommind develops an enterprise search and data categorization platform that organizes, manages and distributes huge volumes of data from multiple sources. San Francisco, CA. Founded 2000.
- SnapLogic provides data integration Platform-as-a-Service (iPaaS) tools for connecting cloud data sources. San Mateo, CA. Founded 2006.
- Splice Machine developed a full-featured, transactional SQL database on Hadoop that can run operational applications and realtime analytics using Hadoop data. San Francisco, CA. Founded 2012.
- Talend has developed an extensive lineup of open-source data management software, including tools for data integration, data quality management, master data management and business process management, as well as an enterprise service bus. Redwood City, CA. Founded 2005.
- Tamr develops enterprise data unification software that businesses use to integrate diverse, siloed data for business analytics tasks. Cambridge, MA. Founded 2013.
- Trifacta develops technology that's used to transform raw, complex data into clean and structured formats for analysis. Trifacta calls it "data wrangling." San Francisco, CA. Founded 2012.
- Xplenty's cloud-based Hadoop-as-a-Service platform integrates and transforms structured, semistructured and unstructured data into analyzable data. Tel Aviv, Israel. Founded 2011.
Original post: 2015 Big Data 100: Data Management