Data Science Interview Guide – Part 1: The Structure

According to one source, the types of questions that will generally be asked in data scientist interviews can be broken down into five categories. Let's take a closer look.



 

[Image: Data Science Interview Guide - Part 1: The Structure. Source: WikiBooks]

 

Data science is a large field that spans several areas. Its main pillars are mathematics, computer science, and domain expertise.
According to IGotAnOffer, the questions generally asked in data scientist interviews can be broken down into five categories. This breakdown comes from an analysis of 300+ Data Science interview questions from leading tech companies.

The five categories are:

  1. Coding (38%) - This will test your problem-solving skills and how you manipulate data using algorithms, SQL, etc.
  2. Statistics (21%) - This will test your understanding of statistics as a whole and how you have applied, and can apply, it within problem-solving tasks.
  3. Machine learning (17%) - This will test your understanding of the theory behind Machine Learning, how to build models, how a model applies to the specific problem/task at hand, and how you can improve it.
  4. Business-related (12%) - This will test how you use your technical knowledge to drive the business and decide on the right course of action.
  5. Behavioral (13%) - This will assess your personal qualities and determine if you are a good fit for the company.

 

 Coding

 
Coding accounts for the largest share of the questions analysed, at 38%. Expect more than a third of the questions you face to involve coding, which is unsurprising given that you're interviewing for a Data Science position.

Coding questions evaluate a candidate's proficiency with computer science fundamentals. They can cover the following topics:

  • Data Structures: Arrays, Dictionaries, Stacks/Queues, Strings, Trees/Binary Trees, and more.
  • Algorithms: Binary Search, Recursion, Sorting, and more.
  • SQL: Constraints, Primary/Foreign Keys, Joins, and more.
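
To make the SQL topics above concrete, here is a minimal sketch using Python's built-in sqlite3 module. The `customers` and `orders` tables, their columns, and the sample rows are hypothetical and exist only for illustration. The sketch shows a primary key, a foreign key, a CHECK constraint, and the difference between an INNER JOIN and a LEFT JOIN.

```python
import sqlite3

# In-memory database; all table names and rows below are made up for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces foreign keys when asked
conn.executescript("""
    CREATE TABLE customers (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL
    );
    CREATE TABLE orders (
        id          INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(id),
        amount      REAL NOT NULL CHECK (amount > 0)
    );
    INSERT INTO customers (id, name) VALUES (1, 'Ada'), (2, 'Grace'), (3, 'Linus');
    INSERT INTO orders (id, customer_id, amount)
        VALUES (10, 1, 25.0), (11, 1, 40.0), (12, 2, 15.5);
""")

# INNER JOIN: only customers that have at least one matching order.
inner_rows = conn.execute("""
    SELECT c.name, o.amount
    FROM customers AS c
    INNER JOIN orders AS o ON o.customer_id = c.id
    ORDER BY o.id
""").fetchall()

# LEFT JOIN: every customer, including those with no orders (count of 0).
left_rows = conn.execute("""
    SELECT c.name, COUNT(o.id)
    FROM customers AS c
    LEFT JOIN orders AS o ON o.customer_id = c.id
    GROUP BY c.id, c.name
    ORDER BY c.id
""").fetchall()

# The foreign key rejects an order that points at a customer that does not exist.
try:
    conn.execute("INSERT INTO orders (id, customer_id, amount) VALUES (13, 99, 5.0)")
    fk_rejected = False
except sqlite3.IntegrityError:
    fk_rejected = True

print(inner_rows)
print(left_rows)
print(fk_rejected)
```

Being able to explain why the customer without orders shows up in the LEFT JOIN result but not in the INNER JOIN result is typically the point of a joins question.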

Example questions would be: 

  • How do you reverse an integer?
  • Write a program that prints the numbers from one to 50, using your chosen programming language.
  • What are the different Joins in SQL?
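
As an illustration, the first two example questions could be answered along these lines in Python. This is only one possible approach. The function names are hypothetical, and an interviewer may expect extra discussion of edge cases (for example, overflow in fixed-width integer languages).

```python
def reverse_integer(n):
    """Reverse the decimal digits of n, keeping its sign."""
    sign = -1 if n < 0 else 1
    n = abs(n)
    reversed_n = 0
    while n > 0:
        reversed_n = reversed_n * 10 + n % 10  # append the last digit of n
        n //= 10                               # drop the last digit of n
    return sign * reversed_n


def numbers_one_to(limit=50):
    """Return the integers from 1 to limit, inclusive."""
    return list(range(1, limit + 1))


if __name__ == "__main__":
    print(reverse_integer(1234))   # 4321
    print(reverse_integer(-120))   # -21 (leading zeros disappear)
    print(*numbers_one_to(50))
```

The arithmetic version avoids converting to a string. Mentioning the simpler `int(str(n)[::-1])` alternative for non-negative numbers, and the trade-off between the two, is a reasonable talking point.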

 

 Statistics

 
Statistics is an important element of Data Science. It helps Data Scientists analyse large and complex datasets, and it is heavily used in Machine Learning to improve models.

Understanding popular statistical terminology and how it can be applied within Data Science tasks will help you thrive as a Data Scientist. Interview questions can cover the following topics:

  • Probability Distributions
  • Hypothesis Testing
  • Modeling

Example questions would be:

  • Describe A/B testing
  • What is a p-value, in layman's terms?
  • What is a t-test?
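
The three example questions fit together, and a small sketch can help when explaining them. The code below uses only the Python standard library and made-up measurements for two hypothetical A/B test variants. It computes Welch's t-statistic (the difference in means divided by its estimated standard error) and an exact permutation p-value. The permutation test is swapped in here in place of the t-distribution lookup a t-test would normally use, because it makes the layman's reading of a p-value explicit: if the variant labels did not matter, how often would a random relabelling produce a gap at least as large as the one observed?

```python
from itertools import combinations
from math import sqrt
from statistics import mean, variance


def welch_t_statistic(a, b):
    """Difference in sample means divided by its estimated standard error."""
    return (mean(a) - mean(b)) / sqrt(variance(a) / len(a) + variance(b) / len(b))


def permutation_p_value(a, b):
    """Exact two-sided permutation p-value for the difference in means."""
    pooled = list(a) + list(b)
    observed = abs(mean(a) - mean(b))
    total = sum(pooled)
    n_a, n = len(a), len(pooled)
    extreme = 0
    splits = 0
    # Try every way of relabelling n_a of the pooled values as "group A".
    for idx in combinations(range(n), n_a):
        sum_a = sum(pooled[i] for i in idx)
        diff = abs(sum_a / n_a - (total - sum_a) / (n - n_a))
        if diff >= observed - 1e-12:  # small tolerance for float rounding
            extreme += 1
        splits += 1
    return extreme / splits


# Hypothetical metric (e.g. minutes per session) for two page variants.
variant_a = [12.1, 11.8, 12.5, 12.0, 11.9]
variant_b = [12.9, 13.1, 12.7, 13.4, 12.8]

t_stat = welch_t_statistic(variant_a, variant_b)
p_value = permutation_p_value(variant_a, variant_b)
print(f"t = {t_stat:.2f}, permutation p-value = {p_value:.4f}")
```

Enumerating every split is only practical for small samples. With real A/B test volumes you would normally use a library routine or a random sample of permutations.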

 

 Machine Learning

 
With Machine Learning models now part of our day-to-day lives, Machine Learning has become an important aspect of Data Science, as has the ability to keep improving models so they can be put to use in businesses and beyond.

Data Scientists are known for solving problems and creating models. During a Data Science interview, the interviewer will therefore test your ability to build models, your understanding of the overall workflow, and your ability to improve a model.

Machine Learning questions can cover the following topics:

  • Artificial Intelligence
  • Model building, validations, and interpretations
  • Types of Algorithms
  • Use cases of Machine Learning

Example questions would be:

  • What is the difference between supervised and unsupervised learning?
  • List 5 types of supervised learning algorithms
  • What is the difference between bagging and boosting?
  • How do you reduce overfitting of your model?
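
For the overfitting question, it can help to show how you would detect the problem before fixing it. The sketch below is one possible illustration and uses only the standard library. It runs k-fold cross-validation on a toy nearest-neighbour regressor over synthetic data. Everything in it is an assumption made for the example, including the data, the model, and the function names. A model that memorises its training data (1 neighbour) has zero training error but non-zero validation error, and that gap is the usual sign of overfitting.

```python
import random
from statistics import mean


def knn_predict(train, x, k):
    """Average the targets of the k training points closest to x."""
    nearest = sorted(train, key=lambda point: abs(point[0] - x))[:k]
    return mean(y for _, y in nearest)


def mse(train, points, k):
    """Mean squared error of the k-NN model fitted on train, evaluated on points."""
    return mean((knn_predict(train, x, k) - y) ** 2 for x, y in points)


def cross_validate(data, k_neighbors, folds=5):
    """Return (mean training MSE, mean validation MSE) across the folds."""
    train_errors, val_errors = [], []
    for fold in range(folds):
        val = data[fold::folds]  # every folds-th point is held out
        train = [p for i, p in enumerate(data) if i % folds != fold]
        train_errors.append(mse(train, train, k_neighbors))
        val_errors.append(mse(train, val, k_neighbors))
    return mean(train_errors), mean(val_errors)


# Synthetic data: a straight line plus Gaussian noise (purely illustrative).
rng = random.Random(0)
data = [(x / 10, 2 * (x / 10) + rng.gauss(0, 1)) for x in range(60)]
rng.shuffle(data)

for k in (1, 5):
    train_mse, val_mse = cross_validate(data, k_neighbors=k)
    print(f"k={k}: train MSE={train_mse:.3f}, validation MSE={val_mse:.3f}")
```

Comparing the two rows of output opens the door to the usual remedies: a simpler or more regularised model, more training data, or ensembling.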

 

 Business Related

 
Data is valuable because it gives people a greater understanding of what is happening in a business and helps them make important decisions.

Applying technical knowledge to business case scenarios shows the interviewer how you would use your skills to improve and grow the company.

It can cover the following topics:

  • Performance and limitations of a product
  • Business short-term and long-term goals

Example questions would be:

  • What variable is impacting the decrease in sales of product X?
  • Will variable J improve the performance of product X in the next 6 months?
  • What would you suggest we do to improve the accuracy?
  • Based on the outputs of the metrics, what decision should we make?

 

 Behavioral

 
Although most technical jobs lean heavily on hard skills, soft skills are just as important. Your soft skills will help determine whether you are the right fit for the role.

During the behavioral stage, you are more likely to succeed if you express yourself clearly and use elements of your resume to back up your points.

For example, some companies may prefer a candidate who is highly independent and requires little to no interaction. The interviewer will scan through your resume and ask how you worked at your previous companies and what your preferred working method is. From these questions you can infer that they want an independent employee, and you can draw on past experiences where you worked independently.

Example questions would be:

  • Why did you choose to become a Data Scientist?
  • What do you think you can bring to the company?
  • What was your most successful project and why?
  • What was your most unsuccessful project, why, and how would you have improved it?
  • How do you deal with multi-tasking?

We have gone through the 5 categories of a Data Science Interview. The next step is preparing for the interview.

 


 

Ace the Data Science Interview is my number 1 Data Science interview book recommendation. It was written by ex-Facebook employees Kevin Huo and Nick Singh.

Included in this book are 201 real Data Science interview questions which have been previously asked by Facebook, Google, Amazon, Netflix, and more. It includes detailed step-by-step solutions to give you a greater understanding. 

Topics include Probability, Statistics, Machine Learning, SQL & Database Design, Coding (Python), Product Analytics, and A/B Testing.

 
 
Nisha Arya is a Data Scientist and Freelance Technical Writer. She is particularly interested in providing Data Science career advice, tutorials, and theory-based knowledge around Data Science. She also wishes to explore the different ways Artificial Intelligence is benefiting, or can benefit, the longevity of human life. She is a keen learner, seeking to broaden her tech knowledge and writing skills whilst helping guide others.