Why You Should Consider Being a Data Engineer Instead of a Data Scientist
A new king of the jungle has emerged.
Photo by Ryan Harvey on Unsplash
I just want to say that whether you choose data science or data engineering should ultimately depend on your interests and where your passion lies. However, if you’re sitting on the fence, unsure of which to choose because they are of equal interest, then keep reading!
Data science has been a hot topic for a while, but a new king of the jungle has arrived — data engineers. In this article, I’m going to share with you several reasons why you might want to consider pursuing data engineering over data science.
Note that this IS an opinionated article, so take what you want from it. That being said, I hope you enjoy!
1. Data engineering is fundamentally more important than data science.
We’ve all heard the saying “garbage in, garbage out”, but only now are companies starting to truly understand the meaning of this. Machine learning and deep learning can be powerful but only in very special circumstances. Aside from the fact that there needs to be a substantial amount of data and a practical use for ML and DL, companies need to satisfy the data hierarchy of needs from the bottom up.
Image created by Author
The same way that we have physical needs (e.g. food and water) before social needs (e.g. the need for relationships), companies need to satisfy several requirements which generally fall under the data engineering umbrella. Notice how data science, specifically machine learning and deep learning, is the very last thing that matters.
Simply put, there can be no data science without data engineering. Data engineering is the foundation for a successful data-driven company.
2. The demand for data engineers is growing… by a lot.
As I said earlier, companies are realizing the need for data engineers. As a result, demand for data engineers is growing, and there’s proof.
According to Interview Query’s Data Science Interview report, the number of data science interviews only grew by 10% from 2019 to 2020, while the number of data engineering interviews grew by 40% in the same period of time!
In addition, Mihail Eric conducted an analysis of Y Combinator job postings and found that there were roughly 70% more data engineering roles for hire than data scientist roles.
You might be wondering, “sure the growth is much higher, but what about in terms of absolute numbers?”
I took the liberty of scraping all Data Scientist and all Data Engineer job postings from Indeed, Monster, and SimplyHired, and I found that the number of job listings is about the same for both!
Overall, there were 16,577 data scientist job listings and 16,262 data engineer job listings.
Image created by Author
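To put those two totals in perspective, here is a minimal Python sketch of the arithmetic. Only the two counts come from the scrape described above; the function and variable names are made up for illustration.

```python
# Listing counts quoted in the article (Indeed, Monster, and SimplyHired combined).
data_scientist_listings = 16577
data_engineer_listings = 16262


def relative_gap_pct(larger: int, smaller: int) -> float:
    """Return how much `larger` exceeds `smaller`, as a percentage of `smaller`."""
    return (larger - smaller) / smaller * 100


def share_pct(part: int, total: int) -> float:
    """Return `part` as a percentage of `total`."""
    return part / total * 100


gap = relative_gap_pct(data_scientist_listings, data_engineer_listings)
total_listings = data_scientist_listings + data_engineer_listings
ds_share = share_pct(data_scientist_listings, total_listings)

print(f"Data scientist listings exceed data engineer listings by {gap:.1f}%")
print(f"Data scientist share of all listings: {ds_share:.1f}%")
```

The gap works out to under 2%, which is what "about the same" means here.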
3. Data engineering skills are extremely useful as a data scientist.
In more established companies, the work is typically segregated so that data scientists can focus on data science work and data engineers can focus on data engineering work.
But this is generally not the case for most companies. I would say that the majority of companies actually require their data scientists to have at least some data engineering skills.
A lot of data scientists end up requiring data engineering skills.
It’s also incredibly beneficial to have data engineering skills as a data scientist, and I’ll give an example: if you’re a business analyst who doesn’t know SQL, you’ll have to ask a data analyst to query information every time you want to gather insights, which creates a bottleneck in your workflow. Similarly, if you’re a data scientist without the fundamental knowledge of a data engineer, there will certainly be times when you’ll have to rely on someone else to fix an ETL pipeline or clean data as opposed to doing it on your own.
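To make that concrete, here is a minimal sketch of the kind of small cleaning-and-query task a data scientist with basic data engineering skills could handle alone. It uses Python’s built-in sqlite3 module; the table, columns, and sample rows are hypothetical and chosen only for illustration.

```python
import sqlite3

# Hypothetical raw records: (id, customer name, amount as text). Not real data.
RAW_ORDERS = [
    (1, " alice ", "120.50"),
    (2, "BOB", None),        # missing amount, dropped during cleaning
    (3, "alice", "79.50"),
    (4, "Carol ", "200"),
]


def load_clean_orders(conn, rows):
    """Tiny ETL step: normalise names, drop rows with no amount, load into a table."""
    conn.execute(
        "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
    )
    cleaned = [
        (order_id, customer.strip().lower(), float(amount))
        for order_id, customer, amount in rows
        if amount is not None
    ]
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", cleaned)
    conn.commit()
    return len(cleaned)


def total_per_customer(conn):
    """The sort of query you would otherwise have to ask someone else to run."""
    cursor = conn.execute(
        "SELECT customer, SUM(amount) FROM orders GROUP BY customer ORDER BY customer"
    )
    return cursor.fetchall()


conn = sqlite3.connect(":memory:")  # in-memory database, nothing written to disk
loaded = load_clean_orders(conn, RAW_ORDERS)
totals = total_per_customer(conn)
print(loaded, totals)
```

A real pipeline would involve far more than this, but the point stands: being able to write the load step and the query yourself removes the bottleneck.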
4. Data science is easier to learn than data engineering.
In my opinion, it’s much easier to learn data science as a data engineer than to learn data engineering skills as a data scientist. Why? Well, there are simply more resources available for data science, and there are a number of tools and libraries that have been built to make data science easier.
And so, if you’re starting out in your career, I personally think it’s more worthwhile to invest your time in learning data engineering than data science, because you have more time to invest. When you’re working a full-time job and a couple of years into your career, you might find that you don’t have the capacity or energy to invest as much time in learning. So from that perspective, I think it’s better to learn the harder realm first.
5. Data engineering encompasses an untapped market of opportunities.
I’m not just talking about job opportunities, but opportunities to innovate and make data engineering easier with new tools and methodologies.
When data science was initially hyped up, people found several barriers to learning data science, like data modeling and model deployment. Later, tools like PyCaret and Gradio emerged to solve these problems.
Currently, we are in that initial stage with data engineering, and I foresee a number of opportunities to make data engineering easier.
Thanks for Reading!
While this is an opinionated article, I hope that this sheds a bit of light as to why you may want to be a data engineer. I want to reiterate that whether you choose data science or data engineering should ultimately depend on your interests and where your passion lies. As always, I wish you the best of luck in your endeavors!
Original. Reposted with permission.