
Interview: Dave McCrory, Basho on Distributed Database Needs of a Future Enterprise

We discuss the future of distributed storage for the enterprise, scale-up vs. scale-out, software design patterns in the Cloud era, the microservices model, and the place of legacy databases in modern enterprise IT.

Dave McCrory is Chief Technology Officer at Basho Technologies. Dave most recently served as SVP of engineering at Warner Music Group, where he led over 100 engineers building the company’s new Digital Services Platform, based on an open source enterprise platform as a service. His extensive experience in the cloud and virtualization industry included positions as a senior architect in Cloud Foundry while at VMware and as a cloud architect at Dell.

Earlier in his career, he experienced successful exits for two companies he founded: Hyper9 (acquired by SolarWinds) and Surgient (acquired by Quest Software). Dave is well known for inventing the concept and coining the term “Data Gravity,” which states that as data accumulates, there is a greater likelihood that additional services and applications will be attracted to this data and add to it.

Anmol Rajpurohit: Q1. In the near future, where do you think the next productivity boost will come from for Distributed Storage in the Enterprise?

Dave McCrory:

The next real boost will come from the aggregation and integration of what have traditionally been disparate database-related functions, including object storage, search, queues and caching, into an easy-to-use, unified platform.

We’ve seen a sort of system sprawl over the last few years that requires too many people to understand too wide a variety of solutions and technologies in order to manage and take advantage of the massive amounts of data being generated. Providing a single view into all of that, along with the ability to manage all of the core functions from a central point, will drive a productivity explosion, in terms of both development and operations.

AR: Q2. What has been the most significant impact of the increasing abundance of choice for Distributed Storage over the last few years? What are the important factors in managing the trade-off between "Scale Up" and "Scale Out"?

DM: What people are finding is that there is no single distributed storage solution that can solve all of their problems. However, they are now combining solutions to solve all sorts of complex data problems, many of which were intractable just a few years ago. Companies are now able to process and understand massive amounts of data much faster than in the past. Because of this, we are seeing much larger data environments that are spread across many geographies and are scaled out inside the data center.

Scaling out adds more nodes to a system, whereas scaling up adds resources to a single node in the system. While scaling up adds incremental costs, it doesn’t have a major impact on solution design, complexity, etc. On the other hand, scaling out adds not only incremental costs but also complexity, and sometimes even requires a redesign of the solution. Managing the trade-offs between the two can be difficult. However, scaling out is often the better solution when it comes to large volumes of data and processing. Inexpensive hardware is abundant, and the achievable level of performance and concurrency is higher when you scale out.

AR: Q3. What are your thoughts on the evolution of software design patterns in this "Cloud" era? What sort of basic understanding do Enterprise IT managers need in order to leverage the new technology effectively?

DM: Software design patterns in the Cloud are very different from “traditional” software design. This is because of basic assumptions made in the past regarding the underlying infrastructure. Previously, software design patterns focused on synchronous connections, reusability and modularity (which are still valid). In the Cloud, however, you must move to supporting asynchronous connections and stateless approaches. Stateless means that the software persists everything externally, to non-local storage.

Enterprise IT Managers should learn some simple concepts to ensure they know what they are getting into by moving to Cloud-y architectures and solutions. These include:
  • Idempotency
  • Asynchronicity
  • Statelessness
  • Software design patterns such as the circuit breaker pattern
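
To make two of the concepts in this list concrete, here is a minimal Python sketch. It is an illustration only and not something described in the interview: the class names, the `max_failures` and `reset_after` parameters, and the request-ID scheme are all assumptions chosen for the example. `IdempotentStore` shows idempotency (replaying the same request does not apply it twice), and `CircuitBreaker` shows the circuit breaker pattern (after repeated failures, calls are rejected quickly until a cool-down period has passed).

```python
import time


class IdempotentStore:
    """Applies each request at most once, keyed by a caller-supplied request ID.

    Hypothetical in-memory example; a stateless service would keep this
    record in external storage rather than in process memory.
    """

    def __init__(self):
        self.balances = {}
        self.seen = {}  # request_id -> result of the first application

    def credit(self, request_id, account, amount):
        # A retried request returns the stored result instead of crediting again.
        if request_id in self.seen:
            return self.seen[request_id]
        new_balance = self.balances.get(account, 0) + amount
        self.balances[account] = new_balance
        self.seen[request_id] = new_balance
        return new_balance


class CircuitOpenError(Exception):
    """Raised when a call is rejected because the breaker is open."""


class CircuitBreaker:
    """Stops calling a failing dependency until a cool-down has elapsed."""

    def __init__(self, max_failures=3, reset_after=30.0, clock=time.monotonic):
        self.max_failures = max_failures  # consecutive failures before opening
        self.reset_after = reset_after    # cool-down in seconds
        self.clock = clock                # injectable so the breaker can be tested
        self.failures = 0
        self.opened_at = None             # None means the circuit is closed

    def call(self, func, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                raise CircuitOpenError("circuit open; call rejected")
            # Cool-down elapsed: half-open, so one trial call is let through.
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            # A failed trial call, or too many failures, (re)opens the circuit.
            if self.opened_at is not None or self.failures >= self.max_failures:
                self.opened_at = self.clock()
            raise
        # Success closes the circuit and clears the failure count.
        self.failures = 0
        self.opened_at = None
        return result
```

In practice the wrapped function would be a call to a remote service, and the caller would catch `CircuitOpenError` to fall back to a cached or default response.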

AR: Q4. Do you believe the Microservices model has a promising future? Are there any concerns that need to be addressed soon?

DM: Yes, I believe Microservices have a promising future because of how simple and powerful this way of design is. Each Microservice is tightly focused on doing a single thing well, and this makes them more easily maintained. The downside is that you end up with very large numbers of Microservices, and you must decide whether you will evolve each one or create derivative services based on need. Then there is integration between Microservices, which can become complex. That leads to my concern: operations and management of Microservices at volume will be a problem that needs to be addressed.

AR: Q5. Given the inertia of Enterprise IT (due to limited budget and other factors), it does not seem that the SQL/Legacy systems are going away any time soon. What would be an ideal use for such systems as companies move towards a hybrid of legacy and new-generation solutions?

DM: In Enterprise IT, I believe that there will be a percentage (my guess is 10-15%) of solutions that still require a transaction-oriented RDBMS/SQL solution. These will remain into the future until this type of construct is built into other solutions (if that even makes sense). All of the other SQL/Legacy systems will be in a transitional state where they are used by legacy applications. Making the move will take a decade or more, but the constraints on businesses stuck on Legacy SQL systems will force them to move. In hybrid environments and during these transitional states, synchronizing between SQL tables and a bucket looks to be a viable approach.
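
As a rough illustration of what synchronizing a SQL table into a bucket could look like, the Python sketch below copies each row of a table into a key-value bucket as a JSON document. Everything here is an assumption made for the example: the function name `sync_table_to_bucket` is hypothetical, SQLite stands in for the legacy RDBMS, and a plain dictionary stands in for the bucket. It is not a description of any particular product's synchronization feature.

```python
import json
import sqlite3


def sync_table_to_bucket(conn, table, key_column, bucket):
    """Copy every row of `table` into `bucket` as JSON, keyed by `key_column`.

    `bucket` is any mutable mapping; a dict stands in for a real key-value
    bucket here. `table` is interpolated into the SQL text, so it must come
    from trusted code, never from user input.
    """
    cur = conn.execute(f'SELECT * FROM "{table}"')
    columns = [d[0] for d in cur.description]
    count = 0
    for values in cur:
        record = dict(zip(columns, values))
        # Writing under the same key each time makes a re-run safe to repeat.
        bucket[str(record[key_column])] = json.dumps(record, sort_keys=True)
        count += 1
    return count


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany("INSERT INTO users VALUES (?, ?)", [(1, "ada"), (2, "lin")])
    bucket = {}
    copied = sync_table_to_bucket(conn, "users", "id", bucket)
    print(copied, sorted(bucket))
```

This one-way copy does not propagate deletes or handle writes that arrive on the bucket side; a real hybrid deployment would need to decide which system is authoritative for each piece of data.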
