What makes a data scientist today? Consider this review of data collected from three years’ worth of data scientist LinkedIn profiles to gain insight into how this important new career path is shaping up.
For the last few years, we at 365 Data Science have been trying to answer one big question: “What makes a data scientist?”
Since we are talking data science, the only logical way to approach the question is to ask the data. And that’s what we’ve done for 3 consecutive years. Since 2018 we have explored 1,001 data scientist LinkedIn profiles to uncover the most interesting trends in the data science field.
In this article, we will go through the most important findings of our research. We have also created an interactive dashboard that you can use to analyze the data yourself.
Click around the charts to filter the report by Education, Area of Studies, Years on the Job, and Programming Languages.
Hold CTRL to select more than one bar in each chart.
Use Year and Country as slicers for the whole report.
The study investigated 1,001 data scientist LinkedIn profiles. The research relied on self-reported data posted by professionals on their LinkedIn accounts. The assumption is that a LinkedIn profile is a good proxy for a professional’s resume.
Furthermore, company and country quotas were assigned to limit bias. The cohort was divided into two groups depending on whether a person was employed by a Fortune 500 Company or not. In addition, the sample involved data scientists working in the US (40%), UK (30%), India (15%), and other countries (15%). Convenience sampling was employed due to data accessibility limitations.
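To make the country quotas concrete, here is a minimal sketch of how the stated shares translate into whole profile counts for a sample of 1,001. The percentages come from the text above; the function name `quota_counts` and the largest-remainder rounding are our own illustrative assumptions, not a description of how the study actually allocated profiles.

```python
# Illustrative only: the shares are taken from the article, while the
# rounding scheme (largest remainder) is an assumption made for this sketch.
SAMPLE_SIZE = 1001
COUNTRY_SHARES_PCT = {"US": 40, "UK": 30, "India": 15, "Other": 15}


def quota_counts(total, shares_pct):
    """Split `total` into whole-number quotas that sum exactly to `total`."""
    # Integer arithmetic avoids floating-point surprises.
    counts = {k: total * pct // 100 for k, pct in shares_pct.items()}
    remainders = {k: total * pct % 100 for k, pct in shares_pct.items()}
    leftover = total - sum(counts.values())
    # Give the profiles lost to rounding down to the largest remainders.
    ranked = sorted(remainders, key=remainders.get, reverse=True)
    for country in ranked[:leftover]:
        counts[country] += 1
    return counts


if __name__ == "__main__":
    for country, n in quota_counts(SAMPLE_SIZE, COUNTRY_SHARES_PCT).items():
        print(f"{country}: {n} profiles")
```

Because 1,001 does not divide evenly by these percentages, one profile is left over after rounding down, and under this assumed scheme it goes to the US group.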
According to the data, the average data scientist from 2018 to 2020 is male, holds a second-tier degree, and comes from a quantitative background that is not necessarily data science or computer science. Their preferred programming language is Python, but they often know R and SQL as well. Many of the new data scientist positions are being filled by people who are already data scientists, so the field seems much more saturated. Getting into data science still looks like a great opportunity, but the ‘data scientist’ job role is becoming more and more exclusive.
Now, let’s dive deeper into the education, years of experience, and programming skills of a data scientist from 2018 to 2020.
From the sample, we can see that at least 80% of the people held a Master’s degree or higher.
Education in 2020.
This isn’t surprising, considering that data science is a field that expects advanced know-how, usually gained through graduate or postgraduate education or, in other cases, through advanced independent research.
And while specialization is important, a Ph.D. is not a requirement for breaking into data science. Indeed, over the years, the share of Ph.D. holders has remained consistent at about 27% of our sample.
By contrast, the share of professionals with a Master’s degree has grown since 2018 and is about 20% higher than in the 2019 cohort.
Area of Studies
Area of Studies in 2020.
In 2018 and 2019, “Economics and Social Sciences,” “Computer Science,” and “Statistics and Mathematics” were the three most popular fields of study among data scientists. 2020 is the first year in which “Data Science and Analysis” (22%) is the top degree. This suggests that universities have started to catch up with the demand for data science education.
Graduates from the Engineering, Natural Sciences, and Other fields constitute approximately 11% of our data each. In fact, this indicator has remained stable throughout the years.
Interestingly, in 2020, the most common degree among women in our sample is ‘Statistics and Mathematics’ (24% of the female cohort). In comparison, the most common degree among men is ‘Data Science and Analysis’ (22%), with ‘Computer Science’ (19%) a close second.
Years on the Job
Years on the Job in 2020.
If you are changing jobs or working through your data analyst years, you must be wondering whether you’ve got the right experience for the position. In terms of tenure, in 2018 almost all data scientists were ‘newcomers’ to the role. Some of this was driven by changes to the name of their occupation, but mostly supply was so limited that it was much easier to enter the field.
Currently, we are observing a much tougher playing field. The majority of data scientists have more than 2 years on the job, and it seems that a very small proportion of the total data scientist pool is new. In fact, in 2020, 52% of the cohort held the title ‘data scientist’ at their previous position.
Programming Languages in 2020.
The programming skills a data scientist needs are arguably the most interesting area of research (at least for us). For many years, R was the preferred language a data scientist was expected to “speak.” In 2018 and 2019, Python started ‘eating away at R,’ and it did so at a very fast pace. In 2020, we have reached the point where Python is by far the preferred programming language in the data science community, with 74% adoption. R has not been completely overthrown, but it is becoming less and less favored among professionals.
An interesting development is the rapid year-over-year growth in SQL users. In 2020, more than 50% of data scientists actively use the language. One common explanation is that companies expect a data scientist to solve all their data-related problems, including those in the data engineering or data architecture domain. In addition, the adoption of BI software such as Power BI and Tableau has demanded a better understanding of databases. Inevitably, SQL also had to be added to the data scientist toolbelt for the sake of ‘getting the job done.’
Looking at the data, the answer to “What makes a data scientist?” becomes clearer as professionals pave the way and universities start to provide a more tailored education. From a career point of view, it seems that it is getting harder to become a data scientist, as the satisfaction rate is high and data scientists tend to stay in their jobs for longer. However, different opportunities to get into the field remain, because demand still varies across countries and industries. One thing is for sure: learn Python if your goal is to become a data scientist!