Interview: Michael Stonebraker, greatest living contributor to database technology

Michael Stonebraker, described as the greatest living contributor to database technology, discusses how he is adjusting to the Turing Award and what trends he foresees in database management systems and big data.

Here is the interview from the People of ACM series with Michael Stonebraker, the recipient of the 2014 ACM Turing Award for fundamental contributions to the concepts and practices underlying modern database systems. Throughout his career, he has proven the practical application of his research, founding several successful companies to commercialize his work, including Ingres (acquired by ASK and then CA), Illustra (acquired by Informix), Cohera (acquired by PeopleSoft), StreamBase (acquired by TIBCO), Vertica (acquired by HP), VoltDB, Paradigm4, and Tamr.

He is now an adjunct professor at MIT CSAIL, where he is co-founder and co-director of the Intel Science and Technology Center for Big Data. Previously, he was a professor of computer science at UC Berkeley for 29 years.

Michael is a person I greatly admire. For those of you who are not members of ACM, here is the ACM Interview:

ACM: How has your recognition as an ACM Turing Award recipient affected your research and your future direction?

MS: So far, I am still absorbing the fact that I won. I am just putting one foot in front of the other, and continuing with day-to-day activities. I am involved in two large research projects and three external companies that keep me very busy. Previous award winners have said they spent quite a bit of time giving talks, so I expect to "be on the road" during the next year. I am sure that experience will give me new perspective and ideas.

ACM: How do you feel about being described by Curt Monash as "the greatest living contributor to database technology," and what trends do you foresee in relational database management systems (RDBMS)?

MS: Aw, shucks... I think the RDBMS market is in a "watershed" transition at the present time. The DBMS market, in general, has expanded from just business data processing to a much broader marketplace with more diverse requirements. In addition, in most major markets, newer systems have a substantial competitive advantage over the legacy implementations from the major vendors. For example, in the data warehouse market, column stores have largely replaced the legacy row stores as the dominant architecture. I expect similar transitions in other markets. Put differently, I think we live in very interesting times...

ACM: As a "serial entrepreneur" of technology companies, how important do you think it is for computing innovators to engage in both research and practice in the big data field?

MS: In the DBMS research space, the ultimate judge of good ideas is the commercial marketplace. What differentiates "the rubber meets the road" from "the rubber meets the sky" is real-world usage. As such, it is crucial to talk to real-world DBMS users and find out what their issues are and then try to solve them. I would strongly advise DBMS researchers to do just that.

ACM: What insights can you share on the balance between security and privacy in light of today's extensive practice of data mining by both government and private entities?

MS: I am not an expert in this field, but that never stops me from making comments. Our cell phones geo-position us whenever they are on. Most of our transactions (credit cards, utility bills, etc.) are recorded somewhere. Our Fitbits record our activity. The list goes on and on. As such, the amount of data recorded on each of us is mind-numbing, and is largely owned by private companies or the government. It is clearly a legislative agenda item to restrain the use of such data.

ACM: As an innovator of database technology, what advice would you give to students who are pursuing careers in this burgeoning field?

MS: I think "data science" will be incredibly important as analytics move from conventional business intelligence (SQL aggregates) to more complex technology (regression, predictive models, data clustering, etc.). In addition, alternate processors (FPGAs, GPUs, etc.) are something for DBMSs to try to take advantage of. Main memory will continue to get larger and larger while non-volatile RAM will appear on the scene in a few years. Peta-scale data warehouses will become commonplace. Hence, I expect "big data" to be important for a long time to come. My advice to students: get well trained in the up-and-coming techniques and technologies.