InformationWeek: 9 NoSQL Pioneers Who Modernized Data Management

The age of Big Data wouldn’t have been possible without these NoSQL pioneers. Learn more about the people who changed the data landscape and revolutionized the big data movement.

Here is a summary of the InformationWeek slideshow by Charles Babcock.

The folks profiled here are tackling data management for the Internet Age, helping us all understand what can be done with a mass of unstructured information. See how their work has transformed the way we handle databases.


Doug Cutting is the original author of Hadoop, along with Michael Cafarella. At the time he co-created Hadoop, Cutting was working at Yahoo on Nutch, a crawler-based search engine for indexing the Web, when he hit upon combining his batch-sorting system with MapReduce and allowing it to scale out to much larger capacities. The name Hadoop is not an acronym and has no hidden significance. Cutting named it for a stuffed toy elephant in his family.

Avinash Lakshman and Prashant Malik are two giants in the field. Lakshman was co-inventor of Dynamo at Amazon. Dynamo would later be cloned to form the basis of the Voldemort NoSQL project. Lakshman left Amazon in 2007 to join Facebook, where he met Malik. The pair created Cassandra there that year, drawing their inspiration from both Dynamo and Google’s Bigtable. Lakshman left Facebook in 2011 and started Hedvig in 2012.

Dwight Merriman, co-founder and CEO of 10gen, the firm that sponsored the project developing the document-oriented MongoDB system, didn’t really need to add another feather to his cap. Merriman was already co-founder and CTO of DoubleClick, where he architected the online advertising management system DART. DoubleClick was sold to Hellman & Friedman in 2005 and resold to Google in 2007 for $3.1 billion. DoubleClick serves tens of billions of ads a day.

Damien Katz, a former Lotus Notes developer at IBM, first revealed in an April 2005 blog post that he was working on a “large scale object database” that would be a lightweight, document-oriented system. Katz self-funded the project, CouchDB, through 2005-2006, and in February 2007 it became an Apache open source project. Katz, meanwhile, was serving as CTO of CouchOne, the firm he founded to support CouchDB.

Jonathan Ellis was an early implementer of Cassandra at Rackspace, a company that supplies managed services and Infrastructure-as-a-Service. He became the first outside committer to the Cassandra project in 2009, when it was being formed in the Apache incubator, and he remained a frequent contributor. The firm he went on to found was originally called Riptano and had Rackspace as a financial backer. It was renamed DataStax in January 2011. Ellis serves as chairman of the Cassandra project at the Apache Software Foundation and CTO of DataStax.

John Quinn, VP of engineering at Digg, caused NoSQL doubters (defenders of traditional relational systems) to question themselves when he announced that the Digg social networking site was in the midst of a move off MySQL and onto Cassandra. He did it in a blog post titled “Saying Yes to NoSQL; Going Steady with Cassandra,” which circulated widely in March 2010.

Salvatore Sanfilippo developed Redis, an in-memory NoSQL system, for use in connection with his two small technology businesses. The code took on a life of its own, a community formed around it, and Sanfilippo, accustomed to working on code from his home in Sicily, was pressured to start a company around Redis. He declined, wanting to reserve some time to spend with his family.

Andy Gross is the chief architect and co-author of Riak, a decentralized document database system that uses MapReduce. Gross has stated that Riak was inspired by Amazon’s Dynamo. Riak scales easily and predictably by enlarging its server cluster. Riak is the system of Basho Technologies; Gross was Basho’s chief architect from 2007 through March 2014. He left to become a software engineer with Twitter, then founded Opsee in February 2015. In July, Gross became principal software engineer of Gofactory.

Brad Fitzpatrick was the creator of LiveJournal at his company, Danga Interactive, and an author of the Memcached software that powered it. LiveJournal was an online personal journaling system for use by people around the globe. In 2007 Fitzpatrick went to work for Six Apart as chief architect. Several of the other contributors to Memcached, a distributed object caching system, formed NorthScale to build out Membase, the key-value store system that was used beneath Memcached as persistent storage.