Poll Results: With Big Data, Statistics Will Become More Important
In one of the most lopsided polls on KDnuggets, a big majority of the KDnuggets audience said that in the era of Big Data, Statistics will become more important, as the foundation of Data Science.
Based on a poll of 376 votes, 68% of voters thought that Statistics will become more important, while only 12% thought that it will become less important.
However, I think that these will be new, "Big" statistics, oriented towards stream processing, approximate counting, Bayesian reasoning, machine learning, and other topics relevant for Big Data.
|With the trend towards Big Data and Data-driven Machine Learning methods|Votes|Share|
|---|---|---|
|Statistics will become more important, as the foundation of Data Science|256|68%|
|Statistics importance will not change|56|15%|
|Statistics will become less important|46|12%|
|Not sure|18|5%|
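
The percentages in the poll table follow directly from the vote counts. The short Python sketch below recomputes them from the four reported counts; the dictionary keys and the helper name `share` are labels chosen here for illustration, not part of the poll itself.

```python
# Vote counts as reported in the poll table.
votes = {
    "more important": 256,
    "will not change": 56,
    "less important": 46,
    "not sure": 18,
}

# Total number of votes cast across all four answers.
total = sum(votes.values())


def share(count, total):
    """Return the share of `count` in `total` as a whole-number percentage."""
    return round(100 * count / total)


# Percentage share for each answer, rounded to the nearest whole percent.
shares = {answer: share(count, total) for answer, count in votes.items()}

for answer, pct in shares.items():
    print(f"{answer}: {votes[answer]} votes, {pct}%")
print(f"total: {total} votes")
```

The counts add up to the 376 votes mentioned in the text, and the rounded shares match the table.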
Tom Rampley, More stats, but different emphasis
When I think about the stats I learned in school, it was all very research oriented. As such, my stats teachers tended to work on drawing conclusions from small data sets. As the volume of data rises, I think this application will become much less important. What will remain important are statistics regarding model validity, model comparison, and data significance. In my work as a data scientist in industry I have literally never had to worry about having too small a sample size for my conclusions to be statistically significant.
The other big change I see is a shift away from frequentist statistics and towards Bayesian approaches, as computational power gets ever cheaper and the explicability and probabilistic nature of Bayesian stats grows in value. For example, very few marketing execs really understand what a p-value means. But they all can understand an odds ratio, and giving them conclusions they can use to make decisions is what makes data science valuable.
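
As a rough illustration of the two points in this comment, here is a minimal Python sketch using made-up numbers. It assumes a hypothetical campaign comparison in which variant A converts at 10.2% and variant B at 10.0%, once with 1,000 users per variant and once with 1,000,000. The pooled two-proportion z-test is one common choice of significance test, picked here for simplicity; it is not a method named by the commenter, and the function names are illustrative.

```python
import math


def two_proportion_p_value(success_a, n_a, success_b, n_b):
    """Two-sided p-value from a pooled two-proportion z-test (normal approximation)."""
    p_a = success_a / n_a
    p_b = success_b / n_b
    pooled = (success_a + success_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / std_err
    # For a standard normal Z, P(|Z| > |z|) equals erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2))


def odds_ratio(success_a, n_a, success_b, n_b):
    """Odds of success in group A divided by odds of success in group B."""
    odds_a = success_a / (n_a - success_a)
    odds_b = success_b / (n_b - success_b)
    return odds_a / odds_b


# Hypothetical data: the same 10.2% vs 10.0% conversion rates at two sample sizes.
small = (102, 1_000, 100, 1_000)
large = (102_000, 1_000_000, 100_000, 1_000_000)

p_small = two_proportion_p_value(*small)
p_large = two_proportion_p_value(*large)
or_small = odds_ratio(*small)
or_large = odds_ratio(*large)

print(f"small sample: p-value = {p_small:.3g}, odds ratio = {or_small:.4f}")
print(f"large sample: p-value = {p_large:.3g}, odds ratio = {or_large:.4f}")
```

Under these assumed numbers the difference is far from significant in the small sample and highly significant in the large one, while the odds ratio is the same in both cases. That is the sense in which sample size stops being the constraint at scale, and an effect-size measure such as the odds ratio tells a decision maker more than the p-value does.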
Ross Bettinger, Big Data vs Statistics
I believe that statistical thinking must and will continue to inform Big Data thinking. Even small-sample theory helps to train the Big Data brain regarding avenues of investigation.
As Sir Isaac Newton once famously said, "If I have seen further it is by standing on ye sholders of Giants." en.wikiquote.org/wiki/Isaac_Newton
Only G!d can create ex nihilo. We moderns must build up our structures on the foundations of our forebears, and Big Data thinking will borrow from statistical theory what is necessary to progress.
Salil Kalghatgi, Statistics + DS
me thinks a biased group, eh?
but seriously, i agree: "data science as an independent discipline, extending the field of statistics to incorporate 'advances in computing with data'" - William S. Cleveland, from wiki.
how could the role of stats not increase?
Terry Kaufman, important is as important does
BD diminish importance of statistics? Funny thing is many people consider "statistics" to be lots and lots of numbers, rather than the use of established rules for evaluating data. Seems like BD is lots of data subjected to crunching by many methods but subject to no rules. So we now have more data, but more is not better, and what is important is its value to people other than those who analyze it.