7 Data Analytics Interview Questions & Answers
Most asked non-technical, operational, and SQL interview questions for data analytics jobs.
Data analytics interviews are usually divided into several parts, such as non-technical, technical, and SQL rounds. The hiring manager will assess your knowledge of statistical tools and concepts. You will also be asked situational questions where you explain how you prepared an analytical report, cleaned the data, or interpreted a graph.
In this blog, we will go through 7 questions that are challenging and frequently asked in data analytics interviews.
Non-Technical Questions
1. How do you explain technical concepts to a non-technical audience?
In this question, the interviewer is judging your communication, presentation, and people skills. Being able to explain technical concepts to managers or clients is an important skill.
Apart from technical terms such as mean, correlation, or data distribution, you also need to learn about the data and its features. Try to connect the dots in a way that makes sense for the business. You need to understand both the business and the audience to explain concepts in layman's terms.
2. What are the top 3 metrics to define the success of this product, and why and how would you choose them?
To answer this question, you need domain knowledge of the industry, the business, and the product. You can ask the interviewer to tell you about the company's strategy and vision, which will help you formulate the answer.
For a social media product, the 3 metrics could be daily active users, the number of users adding friends in the first 2 weeks, and the number of posts in a week. The right choice depends on the company's vision and product strategy, so it is always better to research the company before sitting for an interview.
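As a rough illustration of how the first of these metrics could be computed, here is a minimal sketch of daily active users. The event list, its layout, and the function name `daily_active_users` are assumptions made for this example, not part of any real product's schema.

```python
from collections import defaultdict
from datetime import datetime, timezone

# Hypothetical (user_id, unix_timestamp) activity events, for illustration only.
events = [
    (1, 1535308433),
    (2, 1535308444),
    (1, 1535308476),  # same user, same day: counted once
    (3, 1535394900),  # a different user on the following day
]

def daily_active_users(events):
    """Count distinct users per UTC calendar day."""
    users_by_day = defaultdict(set)
    for user_id, timestamp in events:
        day = datetime.fromtimestamp(timestamp, tz=timezone.utc).date()
        users_by_day[day].add(user_id)
    return {day: len(users) for day, users in users_by_day.items()}

for day, count in daily_active_users(events).items():
    print(day, count)
```

Using a set per day means a user who is active several times in one day is still counted only once.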
Technical Questions
3. What is descriptive, predictive, and prescriptive analytics?
Descriptive analytics provides insights into the past to answer questions such as “How did the marketing campaign perform compared to last year?”
Predictive analytics is about using insight to predict future events or forecast growth.
Prescriptive analytics is used to suggest various courses of action to prevent disaster or to improve the product.
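The difference between the three can be sketched on a tiny dataset. The sign-up numbers, the target, and the decision rule below are all hypothetical, and a straight-line trend is just one simple stand-in for a predictive model.

```python
from statistics import mean

# Hypothetical monthly sign-ups (illustrative numbers, not real data).
months = [1, 2, 3, 4]
signups = [100, 120, 140, 160]

# Descriptive: summarize what has already happened.
average_signups = mean(signups)

# Predictive: fit a straight-line trend with ordinary least squares
# and extrapolate it one month ahead.
x_bar, y_bar = mean(months), mean(signups)
slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(months, signups)) / sum(
    (x - x_bar) ** 2 for x in months
)
intercept = y_bar - slope * x_bar
forecast_month_5 = slope * 5 + intercept

# Prescriptive: turn the forecast into a suggested action.
# The target and the rule are assumptions made for this example.
target = 200
recommendation = (
    "increase marketing spend" if forecast_month_5 < target else "keep current plan"
)

print(average_signups, forecast_month_5, recommendation)
```

Here the descriptive step only reports the past, the predictive step extends the trend, and the prescriptive step recommends an action based on that forecast.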
4. What are the various steps involved in any analytics project?
There is no single correct answer to this question. In general, data analytics projects consist of understanding the problem statement, gathering the data, cleaning the data, exploring, analyzing, and visualizing the data, and finally interpreting the results for a non-technical audience. You can also mention tools, techniques, and additional steps based on the specific problem.
5. How to handle missing values in a dataset?
There are various ways to handle missing data. The most common method is to drop the rows with missing values, which works well when the dataset is large and balanced.
Apart from that, you can:
- Drop the columns with missing values.
- Fill the missing values with a constant.
- Use mean or median imputation, replacing the missing values with the average or median value of the column.
- Use multiple regression analysis to estimate a missing value.
- Use multiple columns to replace missing values with averaged simulated values plus random errors.
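The simpler options above (dropping rows, constant fill, and mean or median imputation) can be sketched in plain Python. The records and helper names below are hypothetical; in practice you would likely use a data analysis library, but the logic is the same.

```python
from statistics import mean, median

# Hypothetical records; None marks a missing value.
rows = [
    {"age": 25, "salary": 50000},
    {"age": None, "salary": 62000},
    {"age": 31, "salary": None},
    {"age": 40, "salary": 58000},
]

def drop_missing_rows(rows):
    """Keep only rows where every field is present."""
    return [r for r in rows if all(v is not None for v in r.values())]

def fill_constant(rows, column, value):
    """Replace missing entries in one column with a fixed value."""
    return [{**r, column: value if r[column] is None else r[column]} for r in rows]

def impute(rows, column, statistic=mean):
    """Replace missing entries with the mean (or median) of the observed values."""
    observed = [r[column] for r in rows if r[column] is not None]
    return fill_constant(rows, column, statistic(observed))

print(drop_missing_rows(rows))
print(impute(rows, "age"))
print(impute(rows, "salary", statistic=median))
```

Each helper returns a new list and leaves the original rows unchanged, which makes it easy to compare strategies side by side.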
SQL Questions
6. Create a SQL query to retrieve duplicate records from employee_details neglecting the primary key and EmpId.
The solution is simple. Select the required columns along with COUNT(*), then group by the columns that together identify a record, such as employee name, manager ID, joining date, and city. Finally, use HAVING to keep only the duplicates: if the COUNT(*) value is greater than one, the record is duplicated.
You can apply the same strategy to any table. Make sure you group by several identifying columns, such as name and address.
Solution:
SELECT fullname, managerID, joining_date, city, COUNT(*) FROM employee_details GROUP BY fullname, managerID, joining_date, city HAVING COUNT(*) > 1;
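To see the query in action, here is a small sketch that runs it against an in-memory SQLite database. The table layout is inferred from the column names in the query, and the sample employees are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Assumed schema: EmpId is the primary key, the other columns come from the query.
conn.execute(
    "CREATE TABLE employee_details ("
    "EmpId INTEGER PRIMARY KEY, fullname TEXT, managerID INTEGER, "
    "joining_date TEXT, city TEXT)"
)

# Hypothetical rows; the first two differ only in their auto-assigned EmpId.
conn.executemany(
    "INSERT INTO employee_details (fullname, managerID, joining_date, city) "
    "VALUES (?, ?, ?, ?)",
    [
        ("Ann Lee", 10, "2020-01-15", "Toronto"),
        ("Ann Lee", 10, "2020-01-15", "Toronto"),
        ("Raj Patel", 11, "2021-06-01", "Delhi"),
    ],
)

duplicates = conn.execute(
    "SELECT fullname, managerID, joining_date, city, COUNT(*) "
    "FROM employee_details "
    "GROUP BY fullname, managerID, joining_date, city "
    "HAVING COUNT(*) > 1"
).fetchall()

print(duplicates)
```

Only the repeated record is returned, together with the number of times it appears.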
7. Write an SQL query to find out how many users inserted more than 1000 but fewer than 2000 images in their presentations.
Table: event_log
user_id | event_date_time |
1255 | 1535308433 |
4566 | 1535308444 |
9566 | 1535308476 |
… | … |
The solution is short, but it takes two steps. First, you have to count the number of images per user, and then count the number of users with more than 1000 but fewer than 2000 images.
The inner query counts event_date_time and groups it by user_id, which gives one row per user with that user's number of images (each row in event_log represents one inserted image). The outer query then keeps only the users with more than 1000 but fewer than 2000 images and counts them.
Solution:
SELECT COUNT(*) FROM ( SELECT user_id, COUNT(event_date_time) AS image_per_user FROM event_log GROUP BY user_id ) AS user_image_counts WHERE image_per_user > 1000 AND image_per_user < 2000;
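The two-step counting logic can be checked with a small sketch against an in-memory SQLite database. The user IDs and event counts are hypothetical, the subquery alias `user_image_counts` is a name chosen for this example, and the sketch assumes each row in event_log is one inserted image.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE event_log (user_id INTEGER, event_date_time INTEGER)")

# Hypothetical users mapped to how many image events each one generated.
# Users 2 and 3 sit exactly on the boundaries and should be excluded.
events_per_user = {1: 1500, 2: 1000, 3: 2000, 4: 1001}
base_timestamp = 1535308433
rows = [
    (user_id, base_timestamp + i)
    for user_id, n_events in events_per_user.items()
    for i in range(n_events)
]
conn.executemany("INSERT INTO event_log VALUES (?, ?)", rows)

(user_count,) = conn.execute(
    """
    SELECT COUNT(*)
    FROM (
        -- inner query: one row per user with that user's image count
        SELECT user_id, COUNT(event_date_time) AS image_per_user
        FROM event_log
        GROUP BY user_id
    ) AS user_image_counts
    -- outer query: keep users strictly between the two limits
    WHERE image_per_user > 1000 AND image_per_user < 2000
    """
).fetchone()

print(user_count)
```

Because both comparisons are strict, users with exactly 1000 or exactly 2000 images are not counted.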