Nine lessons learned during my first year as a Data Scientist
What is it like to be a Data Scientist? The role involves many hats and many problems, all fed by data, worked through with data science, and guided by business results. Here are one Data Scientist’s lessons on how best to work and perform in the role.
By Jacob D Peters, Founder, Commsor.
Full disclosure, I don’t know if I consider myself a true Data Scientist. In fact, I would argue that there is no true, universally accepted definition of a Data Scientist — the job title is a victim of overuse with its meaning muddied by a deluge of marketing hype and buzzword mania. I like to view myself as a Problem Solver, where data is my language, data science is my toolkit, and business results are my guiding force.
Some days I do ‘data science,’ performing exploratory analyses or building machine learning models. However, some days, I am more of a management consultant and strategist, working with business leaders to make data-driven decisions. Some days I am even a data engineer, helping onboard new data sources or architecting our big-data technology stack. In fact, most days, I am a combination of all of the above. Such is the life of a problem solver responsible for all things data.
I am part of a growing, internal innovation team that employs strategy and analytics with the mission of increasing revenue and reducing costs for our company. As the team matures and scales, roles will become more clearly defined and specialized. However, for now, I am content to wear many hats and maintain my involvement in virtually every facet of the data lifecycle. This post documents nine lessons that I have collected during the past year, and I am excited to share them with you.
1. Supplement your work with a diverse portfolio of extra-curricular learning
A curious nature, a growth mindset, and a self-starter attitude are just as important for a Data Scientist as technical skills. However, the breadth and depth of the data science field can sometimes feel overwhelming.
This is why, outside of my daily workload, I follow a structured learning plan to further my knowledge in areas such as AI, big data technology, mathematical and statistical techniques, and programming skills.
Your mind makes the deepest connections when it meets the same concepts across different media and balances consuming information with hands-on practice. Take advantage of this by diversifying your data science curriculum: watch YouTube videos, take MOOCs on Coursera, browse articles, read books, take the occasional highlighter to an academic white paper, listen to podcasts on your commute, practice coding, work on applied side projects, and participate in Kaggle competitions. Most importantly, pick the brains of mentors, colleagues, or friends. I have personally grown the most from hearing other people’s unique perspectives.
This might sound like a lot, but you don’t have to sacrifice your social life in the name of data science. Try replacing a few hours of Netflix binging or Instagram-feed scrolling each day with learning and self-improvement. The only barrier to entry is your own time and commitment.
2. Always have a “North Star” and tie every project to a specific business outcome
Know the goals of your business and let them guide your every move. Whether it’s acquiring customers, increasing market share, or reducing costs, align every analytics project or task you do to a direct business goal.
A machine learning model by itself is not a solution. You must build models or conduct analyses with the end in mind. Before you write the first line of code, quantify the potential impact of a project, justify the value, and explicitly tie it to a business outcome.
3. Punch above your weight
What does boxing have to do with data science? As businesses become more data-centric, junior analysts and data scientists will get the chance to work with managers and senior stakeholders who have relied on anecdotal, intuition-based decision-making for decades, sans data.
In order to ensure your messages hit home, carry yourself with a confidence level and swagger beyond your years, and elevate your mental presence (albeit respectfully). Always punch above your weight.
4. Relationships matter
At the end of the day, data and numbers are an abstraction. Machine learning algorithms may be better at detecting and processing patterns than humans, but they still lack the ability to reason. In most business situations, the best decisions come from the marriage of human intuition or field knowledge and data intelligence.
Build trust by always communicating the limitations of data or a model upfront. Never discount the power or importance of people and relationships.
5. Walk a mile in their shoes
Whether you are working with research analysts, salespeople, marketers, or executives, empathizing with your stakeholders and understanding their daily workflow and pain-points is the first step to being a successful Data Scientist.
There is a reason that “business understanding” is the first step of the CRISP-DM framework, the Cross-Industry Standard Process for Data Mining.
The CRISP-DM Framework for data mining.
Also, identifying your stakeholders before you begin any data science project and keeping them involved early and often will yield more productive results than building a model or creating a process in the dark.
6. Never overhype yourself
Ever since Harvard Business Review declared ‘Data Scientist’ the sexiest job of the 21st century, data science has been evangelized and is often hailed as the holy grail for any business problem.
Harvard Business Review.
People working in data-related roles today have so much information available at their fingertips that it is easy to feel like the master of the universe or the smartest person in the room. Fight the temptation and maintain a humble attitude.
One of J.P. Morgan CEO Jamie Dimon’s most important principles of success is that “Great leaders have humility and authenticity, not smarts and unbridled ambition.”
7. Don’t forget to block and tackle
In American football, a team is only as good as its players’ ability to block and tackle, the fundamentals of the game. The same is true for analytics practitioners and data scientists: you are only as good as your data. While not as sexy as building machine learning models, the fundamentals of data governance, data stewardship, and data cleanliness can’t be skipped.
Always double-check for duplicate values after SQL joins, consider how unforeseen missing or null values might affect your analysis, and be cautious of spurious correlations. Whenever possible, try to benchmark your numbers or derived calculations against gold-standard industry reports or widely accepted internal metrics. Gut-check everything.
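To make the first two checks concrete, here is a minimal sketch using Python’s built-in sqlite3 module. The customers and orders tables, their columns, and every value in them are hypothetical and exist only for illustration. The sketch compares row counts before and after a join, looks for duplicate join keys, and shows how null values can quietly change an average.

```python
import sqlite3

# Hypothetical in-memory tables, invented purely for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (customer_id INTEGER, region TEXT);
    CREATE TABLE orders (order_id INTEGER, customer_id INTEGER, amount REAL);
    -- customer 2 is listed twice: a duplicate key on the lookup side
    INSERT INTO customers VALUES (1, 'East'), (2, 'West'), (2, 'West');
    -- order 12 has a missing amount
    INSERT INTO orders VALUES (10, 1, 100.0), (11, 2, 50.0), (12, 2, NULL);
""")


def scalar(sql):
    """Run a query that returns a single value and return that value."""
    return conn.execute(sql).fetchone()[0]


# Check 1: did the join change the number of rows?
rows_before = scalar("SELECT COUNT(*) FROM orders")
rows_after = scalar(
    "SELECT COUNT(*) FROM orders o "
    "JOIN customers c ON o.customer_id = c.customer_id"
)

# Check 2: which join keys appear more than once in the lookup table?
duplicate_keys = conn.execute(
    "SELECT customer_id, COUNT(*) FROM customers "
    "GROUP BY customer_id HAVING COUNT(*) > 1"
).fetchall()

# Check 3: how many nulls are there, and do they change the answer?
null_amounts = scalar("SELECT COUNT(*) FROM orders WHERE amount IS NULL")
avg_ignoring_nulls = scalar("SELECT AVG(amount) FROM orders")
avg_nulls_as_zero = scalar("SELECT AVG(COALESCE(amount, 0)) FROM orders")

print(rows_before, rows_after)                # 3 5: the join added rows
print(duplicate_keys)                         # [(2, 2)]
print(null_amounts)                           # 1
print(avg_ignoring_nulls, avg_nulls_as_zero)  # 75.0 50.0
```

In this toy data, three orders become five rows after the join because customer 2 is duplicated, and the average order amount is 75.0 or 50.0 depending on whether the missing amount is ignored or treated as zero. Neither average is automatically right. The point is to notice the difference before anyone else does.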
8. Be aware of your organization’s maturity model
There is no point in building a model if your organization is not ready to adopt it. You have to be realistic about adoption and implementation. There is nothing worse than spending 80 hours perfecting a deep learning model if your organization lacks the process and resources to implement it. The reality is that not every organization is equipped to quickly productionize and efficiently scale analytics solutions.
And keep in mind that not every problem needs a machine learning algorithm. Sometimes simple rules-based heuristics are okay too.
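As an illustration of how small such a heuristic can be, here is a sketch of a rules-based flag. Everything in it is hypothetical: the churn-risk use case, the Customer fields, and both thresholds are made up for the example and are not taken from any real project.

```python
from dataclasses import dataclass


@dataclass
class Customer:
    # Hypothetical fields, chosen only for illustration.
    days_since_last_order: int
    open_support_tickets: int


def is_churn_risk(customer, max_idle_days=90, max_open_tickets=3):
    """Flag a customer when either made-up threshold is exceeded."""
    return (
        customer.days_since_last_order > max_idle_days
        or customer.open_support_tickets > max_open_tickets
    )


customers = [Customer(10, 0), Customer(120, 1), Customer(30, 5)]
flags = [is_churn_risk(c) for c in customers]
print(flags)  # [False, True, True]
```

A rule like this is easy to explain to stakeholders, easy to implement in almost any system, and it gives you a baseline that any later model has to beat.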
9. Question everything
Some of the most brilliant minds of all time, including the philosopher Aristotle, the inventor Thomas Edison, and the technology evangelist Elon Musk, embody a style of thinking known as “first principles,” in which you question everything and hypothesize about the world without prior context.
According to Elon Musk, “There’s a good framework for thinking… You know, the sort of first principles reasoning. Boil things down to their fundamental truths and reason up from there, as opposed to reasoning by analogy.
Through most of our life, we get through life by reasoning by analogy, which essentially means copying what other people do with slight variations.”
As a Data Scientist, you will find your best solutions through first-principles thinking. Question why processes in your organization aren’t automated, why certain organizational decisions are not made using data, why metrics are calculated the way they are, and whether your strategy will still be viable in two years. Question everything. Intellectual curiosity will be your biggest differentiator.
Original. Reposted with permission.