The Data Science Interview Study Guide
Preparing for a job interview can be a full-time job, and data science interviews are no different. Here are 121 resources that can help you study and quiz your way to landing your dream data science job.
By Ben Rogojan, SeattleDataGuy.
Data science interviews, like other technical interviews, require plenty of preparation. There are a number of subjects that need to be covered in order to ensure you are ready for back-to-back questions on statistics, programming, and machine learning.
Before we get started, there’s one tip I’d like to share.
I’ve noticed that there are several types of data science interviews that companies conduct.
Some data science interviews are very product- and metric-driven. These interviews focus on product questions, such as which metrics you would use to decide what to improve in a product. They are often paired with SQL and some Python questions.
The other type of data science interview tends to be a mix of programming and machine learning.
We recommend asking the recruiter if you aren’t sure which type of interview you will be facing. Some companies are very good at keeping interviews consistent, but even then, teams can deviate depending on what they are looking for. Here are some examples of what we have noticed about some companies' data science interviews.
Airbnb — Product heavy, metrics diagnostics, metrics creation, A/B testing, tons of behavioral questions, and take-home material.
Netflix — Product-sense questions, A/B testing, experimental design, metric design
Microsoft — Programming heavy, binary tree traversal, SQL, machine learning
Expedia — Product, programming, SQL, product sense, machine learning questions about SVM, regression, and decision trees
Due to this variance, we’ve created a checklist to keep track of which subject areas you have studied and which you still need to cover.
Let’s first start by making sure you can explain the basic data science algorithms.
Machine Learning Algorithms
 Logistic Regression — Video
 A/B Testing — Video
 Decision Tree — Post
 SVM — Post
 How SVM — Video
 Principal Component Analysis: PCA — Post
 Principal Component Analysis — Video
 Adaboost — Post
 AdaBoost — Video
 A Gentle Introduction to the Gradient Boosting Algorithm for Machine Learning — Post
 Gradient Boost Part 1: Regression Main Ideas — Video
 K-Means Clustering — The Math of Intelligence — Video
 Bayesian Network — Post
 Neural Network — Post
 Dimensionality reduction algorithms — Post
 How kNN algorithm works — Video
Probability And Statistics
At large tech companies, it is common to receive an occasional probability or statistics question. The questions won’t necessarily require complex math, but if you haven’t thought about independent and dependent probabilities in a while, it is worth reviewing how to set up the basic formulas.
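As a quick refresher on those basic formulas, here is a minimal sketch in Python. The coin and card examples are illustrative assumptions of ours, not taken from any particular interview.

```python
from fractions import Fraction

# Independent events: the first outcome does not change the second.
# Example: two fair coin flips both landing heads.
p_heads = Fraction(1, 2)
p_two_heads = p_heads * p_heads  # P(A and B) = P(A) * P(B)

# Dependent events: the first outcome changes the odds of the second.
# Example: drawing two aces from a 52-card deck without replacement.
p_first_ace = Fraction(4, 52)
p_second_ace_given_first = Fraction(3, 51)  # one ace and one card are gone
p_two_aces = p_first_ace * p_second_ace_given_first  # P(A) * P(B | A)

# Conditional probability rearranged: P(B | A) = P(A and B) / P(A)
assert p_two_aces / p_first_ace == p_second_ace_given_first

print(p_two_heads)  # 1/4
print(p_two_aces)   # 1/221
```

Using `Fraction` keeps the arithmetic exact, which makes it easier to compare against an answer you worked out by hand.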
Probability Videos
 Dependent probability introduction
 Independent & dependent probability
 Independent Problems
 Conditional Prob Article
Probability Quiz
 Probability & Statistics — Set 6
 Probability & Statistics — Set 2
 Independent Probability
 Dependent Probability
Probability Interview Questions
Most of these questions are either similar to the ones we have been asked or taken directly from glassdoor.com.
 A die is rolled twice. What is the probability of showing a 3 on the first roll and an odd number on the second roll?
 In any 15-minute interval, there is a 20% probability that you will see at least one shooting star. What is the probability that you see at least one shooting star in the period of an hour?
 Alice has 2 kids and one of them is a girl. What is the probability that the other child is also a girl? You can assume that there is an equal number of males and females in the world.
 How many ways can you split 12 people into 3 teams of 4?
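If you want to check your work on the four questions above, the sketch below sets each one up in Python. It rests on assumptions the questions leave open: the 15-minute intervals are independent, "one of them is a girl" means "at least one is a girl", and the three teams are unlabeled. A different reading would change the answer, so it is worth stating your assumptions out loud in an interview.

```python
from fractions import Fraction
from itertools import product
from math import comb, factorial

# 1. A 3 on the first roll and an odd number on the second (independent rolls).
p_three_then_odd = Fraction(1, 6) * Fraction(3, 6)

# 2. At least one shooting star in an hour
#    = 1 - P(no star in any of the four 15-minute intervals).
#    Assumes the four intervals are independent.
p_star_in_hour = 1 - (1 - Fraction(20, 100)) ** 4

# 3. Two children, at least one is a girl: enumerate equally likely families.
families = list(product("BG", repeat=2))
at_least_one_girl = [f for f in families if "G" in f]
both_girls = [f for f in at_least_one_girl if f == ("G", "G")]
p_other_is_girl = Fraction(len(both_girls), len(at_least_one_girl))

# 4. Split 12 people into 3 teams of 4.
#    Choose the teams one at a time, then divide by 3! because
#    the order in which the (unlabeled) teams were picked does not matter.
labeled_teams = comb(12, 4) * comb(8, 4) * comb(4, 4)
unlabeled_teams = labeled_teams // factorial(3)

print(p_three_then_odd)  # 1/12
print(p_star_in_hour)    # 369/625
print(p_other_is_girl)   # 1/3
print(unlabeled_teams)   # 5775
```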
Statistics Pre-quiz
Statistics Concepts
Statistics is a broad subject, so don’t get too bogged down in the details of each of these videos. Instead, just make sure you can explain each of these concepts at the surface level.
 Bias-Variance Trade-Off
 Confusion Matrix
 ROC curve
 Normal Distribution
 P-Value
 Pearson Spearman
 Normal distribution problem: z-scores (from ck12.org)
 Continuous Probability Distributions
 Standardizing Normally Distributed Random Variables (fast version)
 Statistics 101: Simple Linear Regression, The Very Basics
 Statistics 101: Linear Regression, Outliers, and Influential Observations
 Statistics 101: ANOVA, A Visual Introduction
 Statistics 101: Multiple Regression, The Very Basics
 Statistics: Variance of a population | Probability and Statistics | Khan Academy
 Expected Value: E(X)
 Law of large numbers | Probability and Statistics | Khan Academy
 Central limit theorem | Inferential statistics | Probability and Statistics | Khan Academy
 Margin of error 1 | Inferential statistics | Probability and Statistics | Khan Academy
 Margin of error 2 | Inferential statistics | Probability and Statistics | Khan Academy
 Hypothesis testing and p-values | Inferential statistics | Probability and Statistics | Khan Academy
 One-tailed and two-tailed tests | Inferential statistics | Probability and Statistics | Khan Academy
 Type 1 errors | Inferential statistics | Probability and Statistics | Khan Academy
 Large sample proportion hypothesis testing | Probability and Statistics | Khan Academy
 Boosting and Bagging
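One way to test whether you can explain a concept at the surface level is to compute it by hand on a tiny example. The sketch below does this for the confusion matrix from the list above. The labels and predictions are made-up illustrative data, and `confusion_counts` is a hypothetical helper name, not a library function.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


# Made-up labels and predictions, for illustration only.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

tp, fp, fn, tn = confusion_counts(y_true, y_pred)

precision = tp / (tp + fp)            # of the predicted positives, how many were right
recall = tp / (tp + fn)               # of the actual positives, how many were found
false_positive_rate = fp / (fp + tn)  # the x-axis of a ROC curve

print(tp, fp, fn, tn)                          # 3 1 1 3
print(precision, recall, false_positive_rate)  # 0.75 0.75 0.25
```

Recall (the true positive rate) and the false positive rate are the two quantities a ROC curve plots as the classification threshold changes, so this one table connects two items on the list.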
Statistics Post-quiz
Product And Experiment Designs
Product sense is an important skill for data scientists. Knowing what to measure on new products, and why, can help determine whether a product is doing well. The funny thing is that a metric moving in the direction you want is not always good news. People might be spending more time on your website because webpages are taking longer to load, or because of other similar user-facing problems. This is why metrics are tricky and why what you measure is important.
Product And Experiment Design Concepts
 User Engagement Metrics
 Data Scientist’s Toolbox: Experimental Design - Video
 A/B Testing Guide
 Multivariate Testing
 6 Themes Of Metrics
Product And Metrics Questions
 An important metric goes down, how would you dig into the causes?
 What metrics would you use to quantify the success of YouTube ads? (This could also be extended to other products like Snapchat filters, Twitter live-streaming, Fortnite new features, etc.)
 How do you measure the success or failure of a product/product feature?
 Google has released a new version of its search algorithm, for which they used A/B testing. During the testing process, engineers realized that the new algorithm was not implemented correctly and returned less relevant results. Two things happened during testing:
 People in the treatment group performed more queries than the control group.
 Advertising revenue was higher in the treatment group as well.
What may be the cause of people in the treatment group performing more searches than the control group? There are different possible answers here.
Question 4 is borrowed from Zarantech; we really enjoyed it and thought it was a good example of how things can go wrong.
Programming
Just because data science doesn’t always require heavy programming doesn’t mean that interviewers won’t ask you to traverse a binary tree. So make sure you ask your interviewer what to expect. Don’t be daunted by these questions. Pick a few to do just so you’re not surprised in an interview.
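To make the binary tree example concrete, here is a minimal sketch of two common traversals in Python. The `Node` class and the function names are our own illustrative choices; an interviewer may define the tree differently or ask for an iterative version.

```python
from collections import deque
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    value: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def in_order(node: Optional[Node]) -> List[int]:
    """Depth-first: left subtree, then the node itself, then right subtree."""
    if node is None:
        return []
    return in_order(node.left) + [node.value] + in_order(node.right)


def level_order(root: Optional[Node]) -> List[int]:
    """Breadth-first: visit nodes level by level using a queue."""
    if root is None:
        return []
    order, queue = [], deque([root])
    while queue:
        node = queue.popleft()
        order.append(node.value)
        if node.left is not None:
            queue.append(node.left)
        if node.right is not None:
            queue.append(node.right)
    return order


#       4
#      / \
#     2   6
#    / \   \
#   1   3   7
root = Node(4, Node(2, Node(1), Node(3)), Node(6, None, Node(7)))

print(in_order(root))     # [1, 2, 3, 4, 6, 7]
print(level_order(root))  # [4, 2, 6, 1, 3, 7]
```

Because this example tree happens to be a binary search tree, the in-order traversal returns the values in sorted order, which is a handy fact to mention in an interview.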
Pre-video Questions
Algorithms And Data Structures
Pre-study Problems
Before going through the video content about data structures and algorithms, consider trying out the problems below. This will help you know what you need to focus on.
 Sum of Even Numbers After Queries
 Robot Return to Origin
 N-Repeated Element in Size 2N Array
 Balanced Binary Tree
Data Structures Videos
 Data Structures & Algorithms #1 — What Are Data Structures?
 Multidim (video)
 Data Structures: Linked Lists
 Core Linked Lists Vs Arrays (video)
 Data Structures: Trees
 Data Structures: Heaps
 Data Structures: Hash Tables
 Data Structures: Stacks and Queues
Algorithm Videos
 Python Algorithms for Interviews
 Algorithms: Graph Search, DFS and BFS
 BFS (breadth-first search) and DFS (depth-first search) (video)
 Algorithms: Binary Search
 Binary Search Tree Review (video)
 Algorithms: Recursion
 Algorithms: Bubble Sort
 Algorithms: Merge Sort
 Algorithms: Quicksort
String Manipulation
 Coding Interview Question and Answer: Longest Consecutive Characters
 Sedgewick — Substring Search (videos)
SQL
Post-study Problems
Now that you have studied for a bit and watched a few videos, let’s try some more problems!
 Bigger Is Greater
 ZigZag Conversion
 Reverse Integer
 Combination Sum II
 Multiplying Strings
 Larry’s Array
 Short Palindrome
 Valid Number
 The Full Counting Sort
SQL — Problems
Generally, there will be at least one interview focused on SQL. In addition, the interviewers may take you through the entire process of developing a product, choosing metrics to track and then querying to measure the effectiveness of that metric.
 Trips and Users
 Human Traffic of Stadium
 Department Top Three Salaries
 Exchange Seats
 Hackerrank The Report
 Nth Highest Salary
 Symmetric Pairs
 Occupations
 Placements
 Ollivander’s Inventory
SQL — Videos
 IQ15: 6 SQL Query Interview Questions
 Learning about ROW_NUMBER and Analytic Functions
 Advanced Implementation Of Analytic Functions
 Advanced Implementation Of Analytic Functions Part 2
 Wise Owl SQL Videos
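Analytic (window) functions like the ones covered in the videos above come up often in "top N per group" questions. Here is a minimal sketch that runs such a query from Python using the built-in `sqlite3` module. The `employee` table, its columns, and its rows are made up for illustration, and the example assumes your Python build ships SQLite 3.25 or newer, which is when window functions were added.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical table and rows, for illustration only.
conn.executescript("""
    CREATE TABLE employee (name TEXT, department TEXT, salary INTEGER);
    INSERT INTO employee VALUES
        ('Ana', 'Data', 120), ('Bo', 'Data', 110), ('Cy', 'Data', 110),
        ('Di', 'Data', 90), ('Ed', 'Data', 80),
        ('Flo', 'Web', 100), ('Gus', 'Web', 95);
""")

# Rank salaries within each department, then keep the top two salary levels.
# DENSE_RANK gives tied salaries the same rank and leaves no gaps.
query = """
    SELECT name, department, salary
    FROM (
        SELECT name, department, salary,
               DENSE_RANK() OVER (
                   PARTITION BY department ORDER BY salary DESC
               ) AS salary_rank
        FROM employee
    )
    WHERE salary_rank <= 2
    ORDER BY department, salary DESC, name
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
# ('Ana', 'Data', 120)
# ('Bo', 'Data', 110)
# ('Cy', 'Data', 110)
# ('Flo', 'Web', 100)
# ('Gus', 'Web', 95)
```

Swapping `DENSE_RANK()` for `ROW_NUMBER()` would give every row a distinct number, so only one of the two tied 110 salaries would survive the filter. Being able to explain that difference is usually what the interviewer is looking for.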
Post SQL Problems
 Binary Tree Nodes
 Weather Observation Station 18
 Challenges
 Print Prime Numbers
 Big Countries
 Exchange Seats
 SQL Interview Questions: 3 Tech Screening Exercises (For Data Analysts)
Conclusion
Technical interviews can be tough, whether they are for software engineers, data engineers, or data scientists. We do hope this study guide helps you keep track of your progress!
If there is something you think we left out, or you have additional resources that you think would be helpful, please let me know. Thank you!
Original. Reposted with permission.
Bio: Ben Rogojan is a Seattle-based Data Scientist & Engineer with extensive experience designing ETL pipelines, databases, websites, and other software products for startups and established corporations. Ben currently works as a data engineer at a health analytics company.