Crushed it! Landing a data science job
Data scientist interviews vary with the company and the team: one might look like a software developer’s interview, another like a statistician’s. Here we’ve collected some hot tips to pass along if you’re thinking about a move soon.
By Erin Shellman.
After two amazing years with the Nordstrom Data Lab, I’ve accepted a research scientist position at Amazon Web Services to work on S3. I’m excited to begin a new chapter of my career, and relieved that the interview process is over because it’s grueling and time-consuming. Interviews typically consist of one to three screener conversations and then an all-day on-site, and they’re stressful because it’s hard to know what you’ll be asked and often you’re expected to perform feats of intellect that you don’t typically do as a data scientist (at least not devoid of context, from memory and over the phone).
You need time
The best piece of advice I can offer is that if you’re thinking of moving on (or moving into the field) start preparing now. You want to give yourself a lot of time and not be in cram mode. Take the time to be sure that you can explain core concepts in your own words. Screening questions are commonly phrased like this: “how would you explain to an engineer how to interpret a p-value?” Explain it to an engineer, someone who, presumably, isn’t a statistician and might not be used to that language. You don’t want it to be the first time you’ve had to rephrase basic definitions like that. Also, don’t underestimate what nerves can do to your ability to recall information, even stuff you really thought you understood. If you’re new to the field, you might need to give yourself more time to prepare if a lot of the concepts are unfamiliar to you.
I also highly recommend spending time on your professional materials, i.e. your résumé and cover letter. There seem to be two camps on this: those who think it matters and those who don’t. Do interviewers look at that stuff in any detail? Hard to say generally, but I did a ton of interviews at Nordstrom and I can tell you that I was very critical of résumés and letters. Typos are unacceptable, a letter where the applicant brags is a red flag, weak materials indicate lack of interest (or lack of respect for the reader), and keyword stuffing was an open invitation for me to ask about where and why the applicant applied the methods. In the broader technology industry I think people tend to believe the myth that all anyone cares about is what you’ve got up on GitHub, but most companies, big companies, don’t look at your GitHub; they look at your résumé and cover letter (this might also come as a shock, but technology isn’t a meritocracy). Ultimately these documents are how you’ve chosen to represent yourself professionally, so they should matter to you even if you think they don’t matter to your interviewers.
If you didn’t try it, you probably don’t know it
I recommend doing a lot of practice problems and being very analytical about your weak areas. Many falsely believe that reading and rereading is an effective study strategy, but it isn’t effective when you’re required to solve probability zingers and logic puzzles live (highly recommend Make it Stick, maybe before you study). By mindfully doing practice problems you’ll know immediately where you’re weak and that will help you prioritize how you study. Wasting time on stuff you know pretty well is a procrastination strategy, and I thought you were busy? Also, this is a technical field and you should be prepared to answer questions at a substantially technical level. If possible, I recommend doing practice problems standing at a whiteboard just to make sure you’re comfortable writing that way and speaking while you write. You can find lots of tips and interview questions on Quora.
Office setup for my first round of interviews, while I was still a PhD student at the University of Michigan. I was very green, transitioning out of my field, and terrified of not knowing something. This level of obsession is neither healthy nor recommended.
Learn as much as you can about the role ahead of time
Did you know that an informational interview is a thing? Until a friend used this strategy, I didn’t either! Sometimes you’ll find that the interview process has gotten going, but you’re not even sure you want the job. You can tell them to slow their roll and just do an informational interview where you can learn more about whether pursuing the job is something you want. Also take the time to “stalk” the company and the people you’ll be interviewing with. For example, for my AWS on-site I looked up everyone I’d be interviewing with and spent some time on LinkedIn understanding their backgrounds. This can sometimes help you guess what types of things they’re liable to ask you. Oh, she’s an engineer so she probably won’t ask about stats, but she’ll want to hear about scaling methods up. Wait, but she’s a principal engineer, so maybe she’ll actually want to hear more about my leadership and interpersonal abilities. Ellen Chisa’s got a lot of great tips on what not to do in an interview as well.
On to the resources!
You can reasonably expect to be asked about the following topics: Statistics, Machine Learning, Forecasting, Algorithms, everything an undergrad CS major should know, and then the scalability and performance associated with all those things. Oh, also, you should be prepared to program, typically in a language of your choice. Easy peasy, right?!
It probably doesn’t matter which one, but get an intro probability book. I used my trusty old Ross, a standard undergrad text in probability. If you have Ross, I recommend doing the self-tests in chapters 1 – 5 and using those tests to help you decide where to spend more time. Combinatorics and basic probability questions are the norm for phone screens so make sure you’re comfortable doing them. I also used Casella and Berger, basically the Bible for statisticians, to review the properties of expectations and variance. Generally I’d say that text is probably more advanced than is required in most interviews.
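As a rough illustration of the kind of combinatorics and basic probability drill described above, here is a minimal sketch, assuming Python as the language of choice. The three problems and the `prob` helper are made up for this example and are not taken from Ross or from any actual phone screen. The idea is to brute-force the answer by enumerating equally likely outcomes, then check it against the closed-form answer you would be expected to derive by hand.

```python
from fractions import Fraction
from itertools import product
from math import comb


def prob(event, outcomes):
    """Exact probability of `event` over a finite set of equally likely outcomes."""
    outcomes = list(outcomes)
    hits = sum(1 for outcome in outcomes if event(outcome))
    return Fraction(hits, len(outcomes))


# Two fair six-sided dice: P(sum is 7).
two_dice = product(range(1, 7), repeat=2)
p_sum_7 = prob(lambda roll: sum(roll) == 7, two_dice)

# Four fair dice: P(at least one six), checked against the complement rule.
four_dice = product(range(1, 7), repeat=4)
p_any_six = prob(lambda roll: 6 in roll, four_dice)
p_any_six_formula = 1 - Fraction(5, 6) ** 4

# Five fair coin flips: P(exactly two heads), checked against C(5, 2) / 2^5.
flips = product("HT", repeat=5)
p_two_heads = prob(lambda seq: seq.count("H") == 2, flips)
p_two_heads_formula = Fraction(comb(5, 2), 2 ** 5)

print(p_sum_7, p_any_six, p_two_heads)
```

Enumeration only works for small sample spaces, but that is usually the scale of a phone-screen question, and comparing the brute-force result with the formula is a quick way to find out whether you really know the counting argument.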
For the CS related topics I primarily used Programming Interviews Exposed, Cracking the Coding Interview and Programming Pearls. Exposed is definitely the most comprehensive of the books and if you only have time to look over one, go with that one. Cracking is very succinct and specific to the interview processes at the big boy companies like Amazon, Google and Facebook but isn’t super generalizable. The version I was using also had some really irritating vignettes about making sure you’re “a guy they want to get beer with” that were so bro-y I quit using the book (I expected more from Gayle). Pearls is not an interview book at all. It’s a collection of problems in computing and mental narratives of approaches to solving them. This book isn’t really for studying as much as it is for reasoning about computing and it’s a great read if you’ve got time.
Coursera is literally the shit. If you got rid of your old textbooks or don’t want to buy anything, you can easily get by with the material on Coursera. I hiiiiighly recommend the Biostatistics bootcamps from Johns Hopkins. They are an excellent review of the first year of a graduate-level statistics program. Don’t spend too much of your time watching the lectures. Instead, test yourself with the quizzes and assignments and watch the videos in areas where you are weak. Also check out the data science specialization, which is offered by the same folks and covers applied skills like exploratory data analysis and programming in R. Andrew Ng’s machine learning course is a must and is quite enjoyable. He does a great job of motivating methods and spends a lot of time building intuition, which is very valuable for phone screens where you might not jump into technical details but still need to demonstrate familiarity. The cloud computing specialization was also great for me since I was gunning for the job at AWS. I’m transitioning industries again, from retail technology to cloud computing, and I wanted to get a better sense of the types of problems that I’d be expected to discuss. In this case I just watched videos so I could absorb the language people use to describe the field rather than focus too much on the technical details. I’m always on the prowl for great classes on Coursera, so if you have recommendations leave them in the comments!
Coursera used to make me crazy because they enforce this antiquated notion of start and end times. I recently discovered that many courses allow you to view archived lecture materials so you can learn the material without having to wait for the class to start. This was a game changer for me, so check it out.
That’s about all I’ve got in terms of the tangibles. But I’ll leave you with a couple platitudes. First, stay calm! You won’t be able to recall your knowledge when you’re all keyed up. This is something I have trouble with, which is why I do crazy things like write down everything I know and tape it to the wall, but that’s not recommended behavior. My new crazy strategy is to do a bunch of jumping jacks a couple minutes before a screener call so that I’m sweaty and out of breath. Also, if you’re local, ask to do screeners in person. I give great face, and I’ve found that I do a lot better when I can see the interviewer than I do over the phone.
Don’t forget you’re interviewing them too and trust your initial impressions. I had an informational interview with a start-up and left with the feeling that the interviewer was arrogant and not really listening to what I was saying, but I thought the work seemed interesting. I did a follow-up and all my reservations were confirmed a million times over. It was a terrible experience and a total waste of my time that could have been avoided if I’d trusted my gut feeling that these people were douches. Interesting work isn’t worth spending a minimum of 8 hours a day with people who won’t respect you.
Finally, try not to compare your experience to those of others, because you might have it wrong and it might just bum you out. I happened to be interviewing at about the same time as a number of colleagues whom I know well. I was pretty shocked and, at the time, angry about my experience compared with some of theirs. Without going into specifics, I interviewed the same week for the same job in the same office as a male colleague with less experience. He got to do his screener in person with someone from the team he’d be on and was asked very rudimentary questions about dice roll probabilities. I had to do my screener on the phone with someone from a different office and was asked to find the optimal strategy for a game theory problem. It’s hard to hear that and not read into it, and it’s harder not to be angry. Now I attribute that inconsistent interview experience to poor recruiting practices and company-wide immaturity. I don’t want to work somewhere that doesn’t know how to interview for my role and as a result probably hires people I don’t want to work with.
In the end you should prepare as much as you can, but don’t fret if you feel like there are holes in your knowledge. Trust yourself, trust your impressions, and learn from those bad interviews so that you can crush it in the next one.
Bio: Erin Shellman is a statistician + programmer working as a research scientist at Amazon Web Services – S3. Before joining AWS, she was a Data Scientist in the Nordstrom Data Lab, where she worked in the area of personalization, building product recommendations for Nordstrom.com.