5 Mistakes I Wish I Had Avoided in My Data Science Career
Everyone makes mistakes, which can be a good thing when they lead to learning and improvement over time. But we can also learn from others first to speed up our personal growth. To get started, consider these lessons learned the hard way, so you don’t have to.
By Tessa Xie, Senior Data Scientist at Cruise.
When I first made the transition from finance to data science, I felt like I was on top of the world: I had a job in my dream field, my career track was set, and all I had to do was keep my head down and work hard. What could go wrong? Well, there were a couple of things… Over the following year as a data scientist, I made several mistakes that I’m glad I caught early in my career. This way, I had time to reflect and course-correct before it was too late. After a while, I realized that these mistakes are quite common. In fact, I have observed a lot of DS around me still making these mistakes, unaware that they can hurt their data careers in the long run.
If my “5 Lessons McKinsey Taught Me That Will Make You a Better Data Scientist” were what I learned from the best, the lessons in this article are those that I learned the hard way, and I hope I can help you avoid making the same mistakes.
Mistake 1: Seeing yourself as a foot soldier instead of a thought partner
Growing up, we have always been evaluated based on how well we can follow the rules and orders, especially in school. You will be the top student if you follow the textbook and practice exams and just put in the hard work. A lot of people seem to carry this “foot soldier” mindset into their working environment. In my opinion, this is the exact mindset that’s hindering a lot of data scientists from maximizing their impact and standing out from their peers. I have observed a lot of DS, especially junior ones, think they have nothing to contribute to the decision-making process and would rather retreat to the background and passively implement decisions made for them. This kicks off a vicious cycle — the less you contribute to those discussions, the less likely stakeholders will involve you in future meetings, and the less opportunity you will get to contribute in the future.
Let me give you a concrete example of the difference between a foot soldier and a thought partner in the case of model development. In the data collection and feature brainstorming meetings, the old me used to passively take notes on stakeholders’ suggestions so I could implement them “perfectly” later on. When someone proposed a feature that I knew we didn’t have data for, I would not say anything, assuming that they were more senior and must know something that I had overlooked. But guess what, they didn’t. I would later find that 50% of the features we had brainstormed required additional data collection that put our project deadline at risk. As a result, I often found myself in the undesirable position of being the bearer of bad news at the end. Striving to be a thought partner nowadays, I involve myself early in the conversation and leverage my unique position as the person who is closest to the data. This way, I can manage the expectations of stakeholders early on and make suggestions to help the team move forward.
How to avoid this:
- Make sure you don’t hold back in meetings in which you can contribute something from the data perspective: are stakeholders’ definitions of metrics sufficient for what they want to measure? Is data available for measuring the set of metrics? If not, can we find proxies using the data we DO have?
- Imposter syndrome is real, especially among junior DS. Make sure you are aware of this, and whenever you are questioning whether you should say something that “others might have already thought of” or ask a “stupid clarifying question,” YOU SHOULD.
- Maintain a level of curiosity about what other people are working on. There are a lot of occasions where I found I could add value by noticing gaps other people may have overlooked due to their lack of understanding of the company’s data.
Mistake 2: Pigeonholing yourself into a specific area of data science
Do I want to be a data engineer or a data scientist? Do I want to work with marketing & sales data or do geospatial analysis? You may have noticed that I have been using the term DS so far in this article as a general term for a lot of data-related career paths (e.g., data engineer, data scientist, data analyst, etc.). That’s because the lines between these titles are so blurred in the data world these days, especially in smaller companies. I have observed a lot of data scientists who see themselves ONLY as model builders and pay no attention to the business side, and data engineers who focus only on data pipelines and don’t want to know anything about the modeling going on in the company.
The best data talents are the ones who can wear multiple hats or are at least able to understand the processes of other data roles. This comes in especially handy if you want to work in an early-stage or growth-stage startup, where functions might not be as specialized yet, and you are expected to be flexible and cover a variety of data-related responsibilities. Even if you are in a clearly defined job profile, as you get more experience over time, you might discover that you are interested in transitioning into a different type of data role. This pivot will be much easier if you have not pigeonholed yourself and your skillset into the narrow focus of one specific role.
How to avoid this:
- Again, be curious about the projects other data roles are working on. Schedule periodic meetings with colleagues to talk to each other about interesting projects or have different data teams share their work/projects with each other periodically.
- If you can’t get exposure to other data roles at work, use your free time to keep up and practice the data skills you don’t use on the job. For example, if you are a data analyst and haven’t touched modeling in a while, consider practicing those skills through outside projects like a Kaggle competition.
Mistake 3: Not keeping up with developments in the field
Complacency is dangerous. Every soldier knows this, and every DS should, too. Being complacent about your data skills and not putting in the time to learn new ones is a common mistake. It is more dangerous in data than in some other areas because data science is a relatively new field that is still experiencing drastic changes and developments. New algorithms, new tools, and even new programming languages are constantly being introduced.
If you don’t want to be that one data scientist who still only knows how to use STATA in 2021 (he exists, I worked with him), then you need to keep up with the developments in the field.
How to avoid this:
- Sign up for online classes to learn about new concepts and algorithms or to brush up on the ones you already know but haven’t used in a while on the job. The ability to learn is a muscle everyone should keep practicing, and being a life-long learner is probably the best gift you can give to yourself.
- Sign up for a DS newsletter or follow a DS blogger/publication on Medium and develop a habit of following the DS “news.”
Mistake 4: Overflexing your analytical muscle
If all you have is a hammer, everything looks like a nail. Don’t be that DS who tries to use ML on everything. When I first entered the world of data science, I was so excited about all the fancy models I learned in school and couldn’t wait to try all of them on real-world problems. But the real world is different from academic research, and the 80/20 rule is always at play.
In my previous article about “5 Lessons McKinsey Taught Me,” I wrote about how business impact and interpretability are sometimes more important than a few extra percentage points of model accuracy. Sometimes an assumptions-driven Excel model makes more sense than a multi-layered neural net. In those cases, don’t over-flex your analytical muscle and turn your approach into overkill. Instead, flex your business muscle and be the DS who also has business acumen.
How to avoid this:
- Have a full range of analytical skills/tools in your armory, from simple Excel to advanced ML modeling skills, so you can always assess which tool is the best to use in the situation and not bring a gun to a knife fight.
- Understand the business needs before delving into the analysis. Sometimes stakeholders will request an ML model because it’s a popular concept, and they have unrealistic expectations about what ML models can do. It’s your job as a DS to manage those expectations and help them find better and simpler ways to achieve their goals. Remember? Be a thought partner, not a foot soldier.
Mistake 5: Thinking that building a data culture is someone else’s job
In my article “6 Essential Steps to Building a Great Data Culture,” I wrote about how the lives of data scientists can be horrible and unproductive if the company doesn’t have a great data culture. In fact, I have heard a lot of DS complaining about unproductive ad hoc data requests that stakeholders could easily handle themselves (for example, changing an aggregation from monthly to daily in Looker, which literally consists of two clicks). Don’t think changing that culture is someone else’s job. If you want to see changes, make them. After all, who is better positioned to build the data culture and educate stakeholders about data than data scientists themselves? Helping to build up the data culture in the company will make your life, as well as your stakeholders’ lives, a lot easier down the road.
How to avoid this:
- Make it your responsibility to conduct training for the non-analytical stakeholders and develop self-serve resources.
- Make sure you practice what you preach: start linking queries to slides, link data sources of truth to documents, and start documenting your code and databases. You can’t build up a data culture overnight, so be patient.
I do want to point out that it’s OKAY to make mistakes in your career. The most important thing is to learn from those mistakes and to avoid them in the future. Or even better, write them down to help others avoid making the same mistakes.
Original. Reposted with permission.
Bio: Tessa Xie is an experienced Advanced Analytics Consultant skilled in data science, SQL, R, Python, Consumer Research and Economic Research with a strong engineering background following a Master's degree focused in Financial Engineering from MIT.