A Day in the Life of a Machine Learning Engineer
What does a day in the life of a machine learning engineer look like for you?
Photo by Sigmund on Unsplash
It is good to get a better insight into what other people’s day-to-day looks like. Many students are more focused on the skills, courses, and knowledge level they need to ensure they are as good as they can get.
But sometimes, all you need is to hear it from the horse's mouth. For those of you who have never heard that idiom: if you hear something straight from the horse's mouth, you hear it from the person who has direct personal knowledge of it.
Let’s learn from Ibrahim Mukherjee
Ibrahim Mukherjee is an LSE Graduate in BSc Management (Hons) and a data scientist. After graduating from the LSE in 2008, Ibrahim joined the oil and gas industry as a financial analyst working in Trinidad, Singapore, UK (Aberdeen, Reading, London), Norway, Malaysia, Tunisia, and Romania.
All the while, a latent interest in behavioral economics and neuroscience, nurtured by reading the works of Daniel Kahneman (Thinking, Fast and Slow) and Dan Ariely (Predictably Irrational), led to a second career in Machine Learning and Artificial Intelligence.
Ibrahim is interested in how the brain abstracts meaning from events and how human cognition differs from general machine learning and pattern recognition. Apart from work, Ibrahim likes reading about the philosophy of science, religious psychology, cognitive neuroscience, human responses to stress, Bayesian methods, and writing software.
What Does a Day in the Life of an ML Engineer Look Like?
As an ML engineer, I spend a large amount of time working on 3 main tasks:
- Understanding business needs
- Gathering data
- Providing a viable solution to a business problem based on the first two
Although these may sound relatively easy, as the organization you work for grows, so does the complexity of the data that is gathered and the results that are generated.
Let’s look at each of these in turn:
Understanding business needs
César Hidalgo, in his book Why Information Grows, makes a very interesting observation: if you look at a city from above when an airliner is about to land, it looks remarkably similar to a zoomed-in view of a circuit board inside a computer (a CPU). A city is a computational unit, and so is any business. It can be abstracted as an algorithm: there is an input, some computation that processes that input, and an output.
When it comes to a business, the computation is the product or service the business produces. For a barber, the raw material or input may be scissors, the rental space for the barbershop, mirrors, chairs, barricades, etc. The product is a haircut. Money in this case is the store of the value of that output. The higher the quality and/or quantity of the output, the higher its value usually is. There are exceptions to this, such as the negative externalities of the combustion engine (which may be increasingly taxed by the government) and charities that produce effective goodwill. However, this holds as a general rule.
The job of a data scientist, or rather of an expert data scientist, is to understand the business proposition: what is the input, and what is the output?
Then the data scientist would work categorically and systematically to understand the problems in the business. What can improve the company’s offering, the price it receives, the procurement of raw materials, or any aspect of the logistics that starts from the input and ends in the output of the company?
The second thing to look at is gathering data
Before I delve into this, a word of caution for the unwary data scientist, best explained by a famous cautionary tale from WW2. When the Allied forces were looking to reinforce their planes for bombing raids, they looked at the frequency of bullet holes in the planes that returned. Most data scientists or operations research executives working at the time were of the view that they needed to reinforce the areas of the plane that had more bullet holes.
One Hungarian-born mathematician, Abraham Wald, thought otherwise. He proposed reinforcing the parts of the plane that had no bullet holes. The reason: the aircraft that got hit in those areas never came back. They were downed.
The plane from the survivorship bias article on Wikipedia
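Wald's reasoning can be reproduced with a small simulation. The sketch below is only an illustration: the plane sections, the loss probabilities, and the `simulate` function are all made-up assumptions, not historical figures. Hits are spread evenly across the plane, yet the planes that return show very few hits on the sections where a hit is usually fatal.

```python
import random

# Hypothetical plane sections (illustrative only).
SECTIONS = ["engine", "cockpit", "fuselage", "wings"]

# Hypothetical chance that a single hit to a section downs the plane.
LOSS_PROBABILITY = {"engine": 0.8, "cockpit": 0.6, "fuselage": 0.1, "wings": 0.1}


def simulate(num_planes=10_000, hits_per_plane=3, seed=0):
    """Return (hits on all planes, hits seen on returning planes) per section."""
    rng = random.Random(seed)
    all_hits = {section: 0 for section in SECTIONS}
    observed_hits = {section: 0 for section in SECTIONS}
    for _ in range(num_planes):
        # Every section is equally likely to be hit.
        hits = [rng.choice(SECTIONS) for _ in range(hits_per_plane)]
        # The plane returns only if it survives every one of its hits.
        survived = all(rng.random() > LOSS_PROBABILITY[s] for s in hits)
        for section in hits:
            all_hits[section] += 1
            if survived:
                observed_hits[section] += 1
    return all_hits, observed_hits


all_hits, observed_hits = simulate()
for section in SECTIONS:
    print(f"{section:9s} all planes: {all_hits[section]:6d}   "
          f"returning planes: {observed_hits[section]:6d}")
```

An analyst who only sees the "returning planes" column would conclude that engines are rarely hit. The full data shows they are hit as often as anything else; those planes simply never make it back to be counted.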
Data, therefore, is only part of the story. Without a cogent understanding of the mechanics of the business, data doesn’t do much. It can lead to erroneous decisions, particularly in large businesses where the improvement or efficiency gain a solution provides can be small in scale. In those cases, having a solid understanding of the business is critically important.
Data gathering takes the form of speaking to lots of business stakeholders and getting to understand the data in the business. Data can hide very well in silos within the business, and it’s the data scientist's job to get to a single source of truth, scour the different data points provided to understand the data, and choose the most relevant and appropriate parts for analysis. Not all data is required, and part of the skill is to discern what is important and what is not: to separate the signal from the noise. Adding data incrementally to an existing piece of analysis is always possible, and so is removing data sets. However, the key is to find a small number of variables that matter for solving the business need.
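One crude way to start narrowing down variables is to rank them by how strongly they move with the outcome the business cares about. The sketch below is a minimal illustration under stated assumptions: the barbershop figures, the variable names, and the `rank_variables` helper are all hypothetical, and correlation is only a first filter. As the WW2 story shows, business understanding has to decide what the numbers actually mean.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical weekly figures for a barbershop (made up for illustration).
weekly_revenue = [900, 1100, 1000, 1400, 1300, 1600]
candidates = {
    "customers_served": [45, 56, 50, 70, 66, 79],
    "opening_hours": [40, 40, 44, 48, 48, 52],
    "magazines_in_waiting_area": [5, 3, 6, 4, 5, 4],
}


def rank_variables(target, variables, keep=2):
    """Keep the `keep` variables with the strongest absolute correlation."""
    ranked = sorted(
        variables,
        key=lambda name: abs(pearson(variables[name], target)),
        reverse=True,
    )
    return ranked[:keep]


selected = rank_variables(weekly_revenue, candidates)
print(selected)
```

With this toy data, the number of magazines in the waiting area drops out, leaving a smaller set of variables to discuss with stakeholders.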
This brings us to the main golden rule. Everything must add value
Businesses, in the end, are money-making propositions in a capitalist framework. If the analysis does not provide a way to either save money or make money, it’s worthless, and there is no room for it. This is key to the whole proposition of data science: it should provide a clear action point or direction to management and/or stakeholders that creates monetary value, either directly by saving costs or making more profit, or in “soft” terms such as marketing or CSR.
The data scientist must also be a storyteller. As Steve Jobs said, “the most powerful person in the world is the storyteller”. Being able to communicate the value generated to the business is of huge importance. Unless stakeholders “see” the value, it's almost a moot point that the analysis creates value, because they won’t be able or willing to put it into action.
Storytelling the value proposition is therefore as important as creating the value. A data scientist must be very good at communicating these insights.
Wrapping it up
I would like to thank Ibrahim Mukherjee for taking the time to explain to us a day in his life as a Machine Learning Engineer. Understanding how other people approach their careers, and how their approaches differ from yours, is an important part of improving your own career.
I hope this helps! Thank you again, Ibrahim Mukherjee!
Nisha Arya is a Data Scientist and Freelance Technical Writer. She is particularly interested in providing Data Science career advice, tutorials, and theory-based knowledge around Data Science. She also wishes to explore the different ways Artificial Intelligence can benefit the longevity of human life. She is a keen learner, seeking to broaden her tech knowledge and writing skills while helping guide others.