Addressing the Growing Need for Skills in Data Science
To address the current difficulties in hiring data scientists due to their short supply, many companies can benefit from retraining existing analytically minded employees.
In an often-cited white paper, IDC predicted that the “Global Datasphere will grow from 33 Zettabytes in 2018 to 175 Zettabytes by 2025.” (To put those numbers into perspective, Cisco forecasts that in the year 2019 there will be a total of 2.4 Zettabytes in IP traffic worldwide.) Being able to extract knowledge from this rapidly growing wealth of data, and thereby positively influence business decisions, is key to success in an increasing number of industries.
Data scientists are the leaders of this crusade to convert data into insight, but unfortunately companies that want to expand their data analytics capacity often have difficulty hiring data scientists with the needed competences and skills because such professionals are in high demand. Allen Blue, LinkedIn co-founder, recently declared that the demand for data scientists was so high because they “are almost all employed”. He also mentioned that this demand is not limited to “high-tech and software realms ... the past three years have seen ‘massive growth – 15 times, 20 times growth’ in data science-based jobs in sectors like education, marketing and manufacturing.” In fact, according to ESADE, “companies expect to increase the size of their analytical staff considerably, tripling the number of data scientists and multiplying the number of data managers by a factor of 2.5 over the next three years.”
On June 27, 2019, a round table discussion at the BDV PPP Summit in Riga, Latvia gathered an international panel of experts on skills in data science to address this growing demand for data scientists. Due to the widely recognized difficulty in hiring data scientists, the discussion focused mainly on issues related to the retraining of existing employees to meet needs in data science.
Retraining, a Win-Win Situation
Ernestina Menasalvas, Universidad Politécnica de Madrid, began the discussion by stating that, because of the large value that data scientists provide, companies will benefit from investing in the retraining of existing employees simply because those investments are easily recouped. Jean-Christophe Pazzaglia, SAP, mentioned that retraining a domain expert with an existing analytical mindset in data science can result in an employee with unique skills who is very valuable to the company. He also pointed out that retraining has the advantage, when compared to hiring, of preserving existing relationships of trust between employer and employee, which can be of particular interest when the data being analyzed or the decisions being made are considered strategic or confidential.
Liesbeth Ruoff-van Welzen, KNVI and IP3, felt that employees who retrain in data science have new career paths opened to them (with both their existing and other potential employers), which can often include financial incentives and opportunities for promotion.
Retraining is not only of interest to employees who want to move their careers in the direction of data analytics; it is needed at all levels. For managers to make informed decisions based upon insight gleaned from data analytics, they need to understand the techniques used to produce that information and, in particular, the risks associated with the use of those techniques.
The discussion also included the opinion that employee training should not be limited to technical skills; leadership training among managers is also of interest. A good manager can work with many different kinds of employees and individually mold them to meet the needs of the company.
Who can provide affordable training?
All panelists agreed that new and innovative training methods are needed. When talking about large companies, Yuri Demchenko, University of Amsterdam, felt that retraining is always more economically advantageous than hiring (paying for one in-house course can benefit many employees). But especially in the case of small companies, it was felt that high-quality and, above all, low-cost training is greatly needed.
Yuri also mentioned that to build a sustainable competences and skills management practice, it is important to have well-defined competences and a professional profiles framework, such as those defined by the EDISON Data Science Framework. It is also important to ensure compliance with the European e-Competence Framework and the European Skills, Competences, Qualifications and Occupations (ESCO) classification. He stated that due to the complexity and rapid development of new products and services, vendors proactively offer tutorials and training for those new products. They also often provide resources and support for education and training, but this support is limited to their particular products when in many cases more generic and flexible training is needed.
Liesbeth proposed that universities should look more into offering retraining for people already in the workforce. Thomas Hauser, University of Colorado Boulder, mentioned that some universities are pairing with companies to provide training for their employees in exchange for funding. He gave the specific example of the collaboration between the University of Florida and Disney.
Thomas also proposed that online training can be very useful, especially when trying to train users who, due to their physical locations, are hard to reach. Lastly, the idea of mobile learning in the form of short “pills” was considered to be an interesting option for working professionals.
Jean-Christophe commented that many large companies, like SAP with https://open.sap.com/, provide in-house and online training in the form of MOOCs. However, the dropout rate of these courses is often very high. Many people abandon the classes after just 2-3 weeks, as it is difficult for participants to keep up with the pace and dedicate the half day or more required per week while meeting the demands of their day-to-day work. Rocco Defina, Oxys Consulting, emphasized that a crucial requirement for the success of a training program is the employees' motivation to invest the required time and effort to learn. This is especially the case with online training, which requires much more motivation than traditional courses. He said that in this sense small and medium-sized companies have an advantage over large ones, as they can do a better job of motivating their employees: it is easier for them to know and understand their employees, and they can thus provide more effective and individualized motivation for retraining. He added that retraining staff also offers the opportunity to tailor the training to the context and strategic needs of the company or organization. This tailoring should also reduce the dropout rate among trainees and increase the return on investment, because adult learners are more motivated when they can associate the training with their current position and use cases.
Yuri said that the majority of free MOOCs provided in the form of micro-modules do not provide the academic rigor and level required by most employers. Some high-quality online educational resources exist, but their high fees might restrict their use. Companies should encourage their employees to take advantage of those resources by reimbursing expenses when the training matches the company’s expected learning outcomes and time constraints.
Until the supply of unemployed data scientists increases, many companies can benefit from retraining existing analytically minded employees. Even when that supply increases, retraining could still be worthwhile because of the possible cost savings and the advantage of relying on existing trusted employees. When retraining is offered, it is important to ensure that employees feel motivated to learn, especially when that learning is done online. It could even be useful to contractually establish continuing education programs in which employers commit to paying expenses, employees commit to learning, and both commit to allocating the time needed for the training.
CEN/TC 428. (2016) European e-Competence Framework 3.0. [Online]. Available: http://www.ecompetences.eu/
Cisco. (2019, February) Cisco Visual Networking Index: Forecast and trends, 2017-2022. [Online]. Available: https://www.cisco.com/c/en/us/solutions/collateral/service-provider/visual-networking-index-vni/white-paper-c11-741490.html
EC DG Employment, Social Affairs and Inclusion and Cedefop. (2017) European skills/competences, qualifications and occupations (ESCO). [Online]. Available: https://ec.europa.eu/esco/portal/
ESADE. (2018, May) Companies expect to triple their big data staff in the next three years. [Online]. Available: https://www.esade.edu/en/news/esade-study-companies-expect-triple-their-big-data-staff-next-three-years/18053
D. Reinsel, J. Gantz, and J. Rydning. (2018, November) The digitization of the world from edge to core. IDC. [Online]. Available:
The EDISON Project. (2018, December) EDISON Data Science Framework (EDSF), release 3. [Online]. Available: https://github.com/EDISONcommunity/EDSF
University of Florida. (2018, September) Expanded UF Online partnership will provide education benefits to more than 80,000 hourly Disney employees nationwide. [Online]. Available: https://news.ufl.edu/articles/2018/09/expanded-uf-online-partnership-will-provide-education-benefits-to-more-than-80000-hourly-disney-employees-nationwide.html
Wharton School. (2019, March) What’s driving the demand for data scientists? [Online]. Available: https://knowledge.wharton.upenn.edu/article/whats-driving-demand-data-scientist/
Ernestina Menasalvas is a Full Professor at the Universidad Politécnica de Madrid and lead of the Skills and Education Task Force of the Big Data Value Association.
Ana M. Moreno is a Full Professor at the Universidad Politécnica de Madrid. Her research is in the area of Software Engineering.
Nik Swoboda is a Contract Professor at the Universidad Politécnica de Madrid where he researches diagrammatic reasoning, data representation and visualization.
All the authors are members of the BDVe Project consortium.