Interview: Alessandro Gagliardi, Glassdoor on the Indispensable Skills for Data Scientists
We discuss analytics at Glassdoor, important lessons, major factors affecting job satisfaction, the challenges of working with Twitter data, and the indispensable components of a Data Science education.

Alessandro received his B.A. in Computer Science from UCSC and pursued a Ph.D. in Behavioral and Neural Science at Rutgers until moving back to California in 2010. He taught neuroscience and psychology at USF and CIIS before returning to industry as a Data Scientist in 2011. Since then, he has been working on bringing his knowledge of biological computation to the field of data science all while training the next generation of Data Scientists.
Update (April 6, 2014): Alessandro is no longer working at Glassdoor. Currently, he is Lead Professor of Data Science for GalvanizeU's Master of Engineering in Big Data.
Here is my interview with him:
Anmol Rajpurohit: Q1. What does Glassdoor do? How important is Analytics at Glassdoor?

Analytics is core to what we do. In fact, Glassdoor's primary product offering is data in the form of reviews and salaries. We use analytics to support internal business decisions, such as A/B testing, strategy, and pricing; to drive data products, such as Reviews Highlights or Salary Medians; and to power Employer Insights, such as how an employer's ratings vary by job function and office location.
AR: Q2. What are the typical problems that the Data Science team at Glassdoor works on?
AG: It varies a lot. Our projects range from detecting fraudulent reviews and identifying predictors of salaries to supporting A/B tests and reporting on the health of the company. There really is no "typical problem," which is part of what makes being a data scientist at Glassdoor so interesting!
AR: Q3. What were the key lessons that you learned from the experience of extrapolating from the known to the unknown (for example, the cases with insufficient salary data)?

One lesson might be this: don't let the perfect be the enemy of the good. Your predictions will never be perfect, and there will always be ways to improve them: drawing upon more outside data, including other factors, and so on.
It's a lot easier to come up with ways to improve a model than it is to actually execute on those improvements, which means that, unchecked, a project like that can grow out of control. It's important to check in, set expectations that this will be a continuous and gradual process, congratulate yourself for the progress you've made, and ship early.
AR: Q4. Besides salary, what other factors play a crucial role in job satisfaction?

AR: Q5. A lot of your research is based on social conversations data from Twitter. What are the most underrated challenges of working with Twitter data?

AR: Q6. Based on your experience as an instructor at General Assembly, what do you consider as the indispensable components of a good Data Science curriculum?

Other skills I see neglected a lot are SQL and statistics. SQL is not used much in academia, so a lot of academics-turned-data scientists can find themselves disoriented by tasks that their colleagues would find elementary. But relational databases are here to stay, and being able to use them effectively is definitely a required skill for anyone who would call themselves a data scientist. Statistics is also important, and that goes beyond simply knowing how to run a chi-square test. A data scientist needs to cultivate an intuition about probability and statistics so they know when not to believe their computer when it tells them something is significant below the p < .001 threshold.
The second part of the interview will be published soon.