
Interview: Michael Li, Data Incubator on Data-driven Hiring for Data Scientists

We discuss the launch of the Data Incubator, its business model, why we need data-driven hiring, selection process for the incubator program and alumni feedback.

Dr. Michael Li is Executive Director at The Data Incubator. Michael has worked as a data scientist (Foursquare), quant (D.E. Shaw, J.P. Morgan), and a rocket scientist (NASA). He did his PhD at Princeton as a Hertz fellow and read Part III Maths at Cambridge as a Marshall scholar.

At Foursquare, Michael discovered that his favorite part of the job was teaching and mentoring smart people about data science. He decided to build a startup that lets him focus on what he really loves.

Here is my interview with him:

Anmol Rajpurohit: Q1. How and when did you get the idea to launch the Data Incubator? What is the business model?

Michael Li: The Data Incubator is a seven-week fellowship that helps master's students, PhDs, and postdocs transition from academia into industry data science roles. The program is free for fellows; tuition is paid by partner hiring companies (including eBay, Palantir, Pfizer, and the New York Times). For more information, visit http://www.thedataincubator.com/ or read about our fellows' experiences on our blog at http://blog.thedataincubator.com/.

The idea was born out of my frustration at having been on both sides of the hiring table. While interviewing, I realized that companies (across a wide range of data science industries) would often ask the same technical interview questions. As a hiring manager, I was surprised by how many people with strong resumes were unable to answer these basic questions. I figured it would be more efficient for someone to ask candidates those questions just once. That way, we help aspiring data scientists identify and build their skills and help companies find top talent. We accept fellows who have the raw brainpower, give them a framework for analyzing terabytes of data, and save hiring managers time and resources in hiring.

Additionally, there is a huge skills gap in data science. Nationally, 80% of the growth in STEM jobs will come from computing and mathematics (BLS), but these fields account for less than 3.5% of undergraduate STEM degrees (NSF). McKinsey estimates a shortage of 140,000 to 190,000 data scientists. I’m really glad we’re able to help address this.

AR: Q2. What do you mean by "Data-driven hiring for Data Scientists"?

ML: Hiring, even for data scientists, is often not data-driven. As with other forms of hiring, it relies on resume keyword scanning, the corporate analogue of tea-leaf reading. This process misses a lot of smart, talented people who are not good at writing resumes, and it still lets through too many people who shouldn’t make it. With challenge questions, in-depth interviews, and code reviews, we assess actual performance in advance of any subsequent interview.

The other major flaw of most hiring is that a lot of policies are based on small sample sizes.  Because we are working with thousands of applications each cohort, we’re able to get a lot more visibility into these stats than many hiring companies.

AR: Q3. Can you describe the selection process for this program? What are the important characteristics you are looking for in the pool of applicants?

ML: We look for people who have a solid foundation in mathematics (statistics, linear algebra, etc.) and computation. PhDs often get the former from their research and coursework. The latter is mostly the ability to hack around, munge data, and get things done on a computer, and is often a self-taught skill. Finally, we also look for people who can communicate complex, technical ideas to a general audience.

AR: Q4. A few batches have already graduated from the Data Incubator program. What is the common feedback you are getting from alumni?

ML: The transition from academia is a challenge. Alumni love how our curriculum provides a streamlined framework for analyzing big data problems. Our Fellows have found the immersion in industry culture and the direct exposure to a variety of hiring partners, data scientists, and data science jobs valuable. You can also read some blog entries written by alumni here (1, 2, 3, 4, 5).
