Gregory Piatetsky-Shapiro on Big Data Education in Big Data Innovation Magazine

I talk to Innovation Enterprise about the Big Data skills gap and what companies and universities are doing about it.

By Gregory Piatetsky, Sep 19, 2013.

September 2013 issue of IE. Big Data Innovation Magazine includes

  • David Barton looks at how Pamela Peele has built a team around her big data needs
  • Chris Towers talks to Gregory Piatetsky-Shapiro about his take on current big data education and how it could be improved
  • Heather James talks data with Kirk Borne, discussing current issues at college level and before with one of the world's leading big data professors
  • Sean Patrick Murphy, Senior Scientist at Johns Hopkins University, talks to George Hill about his unique perspectives on data and approaches to it in education
  • Daniel Miller talks to Andrew Claster, Deputy Chief Analytics Officer at Obama for America about his use of analytics in the Obama re-election campaign

Here is an excerpt from the Big Data Innovation Magazine interview.

I thought I would talk to one of the most knowledgeable and influential big data leaders in the world, Gregory Piatetsky-Shapiro.

After running the first ever Knowledge Discovery in Databases (KDD) workshop in 1989, he has stayed at the sharp end of analytics and big data for the past 25 years. His website and consultancy, KDnuggets, is one of the most widely read data information sources and he has worked with some of the largest companies in the world.

The first thing that I wanted to discuss with Gregory was his perception of the big data skills gap. Many have claimed that this could just be a flash in the pan and something that has been manipulated, rather than something that actually exists.

Gregory references the McKinsey report of May 2011, which states:

"There will be a shortage of talent necessary for organizations to take advantage of big data. By 2018, the United States alone could face a shortage of 140,000 to 190,000 people with deep analytical skills as well as 1.5 million managers and analysts with the know-how to use the analysis of big data to make effective decisions."

The report predicts that this kind of skills gap will exist by 2018, but Gregory believes that we are already seeing it. Looking at job trends to see what expertise companies are seeking, Gregory found that both MongoDB and Hadoop appear in the top 10.

"Big Data is actually rising faster than any of them. This indicates that demand for Big Data skills exceeds the supply. My experience with KDnuggets jobs board confirms it - many companies are finding it hard to get enough candidates."

People are responding to this, however, with many universities and colleges recognising not only the shortages, but also the desire from people to learn. Companies looking to expand their data teams are also looking at both internal and external training.

For instance, companies such as EMC and IBM are training their data scientists internally. Not only does this mean that they know they are getting a high quality of training, but also that the data scientists they employ are being educated in 'their ways'.

With qualified candidates hard to find, training programmes like this let companies hire promising candidates and make sure they are sufficiently qualified afterwards.

The IBMs and EMCs of this world are few and far between. The money that needs to be invested in in-depth internal training is considerable and so many companies would struggle with this proposition.

So what about those other companies? How can they avoid falling through the big data skills gap?

Gregory thinks that most companies have three options.

Do you need BIG data?

Most companies confuse big data with basic data analysis. At the moment, with the buzz around big data, many companies are over-investing in technology that realistically isn't required.

A company with 10,000 customers, for instance, does not necessarily need a big data solution with multiple Hadoop clusters. Gregory makes the point that on his standard laptop he would be able to process data for a large software company with 1 million customers.

Companies need to ask if they really need the depth of data skills that they think they do.

What if you do need it?

For large companies that may need to work with large data sets, the reality is that it is not necessary to employ a big data expert straight from university. Gregory makes the point that somebody who is trained in MongoDB can become trained as a data scientist relatively easily.

If an internal training programme is not a realistic target in this instance, then external training may become the best option. Several companies, such as Cloudera and many others, offer this and can train data scientists to a relatively high standard.

Gregory also mentions that one way in which several companies are learning about big data and analytics is through attending conferences. There are now hundreds of conferences a year on Big Data and related topics, from leaders in the field such as Innovation Enterprise and other smaller conferences all around the world.

Read more.