Data Science and its relationship to Big Data and Data-Driven Decision Making
Two leading data science and machine learning researchers examine the relationship of data science to other important related concepts and begin to identify the fundamental principles underlying data science. What will be Big Data 2.0?
Big Data Journal, Foster Provost and Tom Fawcett, Feb 2013.
Companies have realized they need to hire data scientists, academic institutions are scrambling to put together data-science programs, and publications are touting data science as a hot--even 'sexy'--career choice. However, there is confusion about what exactly data science is, and this confusion could lead to disillusionment as the concept diffuses into meaningless buzz.
In this article, we argue that there are good reasons why it has been hard to pin down exactly what data science is. One reason is that data science is intricately intertwined with other concepts of growing importance, such as big data and data-driven decision making. Another reason is the natural tendency to associate what a practitioner does with the definition of the practitioner's field, which can result in overlooking the fundamentals of the field.
It is important (i) to understand the relationship of data science to other important related concepts, and (ii) to begin to identify the fundamental principles underlying data science. Once we embrace (ii), we can much better understand and explain exactly what data science has to offer.
Furthermore, only once we embrace (ii) should we be comfortable calling it data science. In this article, we present a perspective that addresses all these concepts. We close by offering, as examples, a partial list of fundamental principles underlying data science.
The article, part of the inaugural issue of Big Data Journal, is available at:
Foster Provost is a Professor and NEC Faculty Fellow at the New York University Stern School of Business.
Tom Fawcett is a leading researcher in Machine Learning.
F.P. and T.F. are authors of the forthcoming book, Data Science for Business.