Data Scientists’ Thoughts that Inspire

Inspirational thoughts from leading data scientists, including Yann LeCun, Erin Shellman, Daniel Tunkelang, Claudia Perlich, and Jake Porway. What inspires you?

By Andy Rey (Happy Data Scientist)
1. Yann LeCun
Director of AI Research at Facebook
  • Most of the knowledge in the world in the future is going to be extracted by machines and will reside in machines

  • There are just not enough brain cells on the planet to even look or even glance at that data, let alone analyze it and extract knowledge from it

  • Knowledge is some compilation of data that allows you to make decisions, and what we find today is that computers are making a lot of decisions automatically

  • Diversity of point of view is a very important thing

  • You don’t want to just hire clones of the same person, because then they will all want to explore the same things. You want some diversity

  • The idea that somehow you can put a bunch of research scientists together and then put some random manager who’s not a scientist directing them doesn’t work. I’ve never ever seen it work

  • Management skills are a little overrated in the sense that managing research scientists is like herding cats

  • The only way to make intelligent machines was to get into learning, because every animal is capable of learning. Anything with a brain basically learns

  • It’s useful for a company to have its scientists actually publish what they do. It keeps them honest

  • The data sets are truly gigantic. There are some areas where there’s more data than we can currently process intelligently

  • The amount of human brainpower on the planet is actually increasing exponentially as well, but with a very, very, very small exponent. It’s a very slow growth rate compared to the data growth rate


2. Erin Shellman

Statistician and data scientist in the Nordstrom Data Lab

  • Data’s just the world making noises at you

  • As a data scientist, even if you don’t have the domain expertise you can learn it, and can work on any problem that can be quantitatively described

  • The most interesting types of data are those collected for one purpose and used for another

  • Presentation is the ability to craft a story

  • Presentation skills are undervalued, but they are actually among the most important factors contributing to personal success and creating successful projects

  • If you talk to somebody who has something you want, follow up

  • What companies want is a person who can rigorously define problems and design paths to a solution

3. Daniel Tunkelang
Head of Search Quality at LinkedIn
  • The best way to become a data scientist is to do data science
  • Anything that looks interesting is probably wrong
  • Intuition is really a well-trained association network
  • As data scientists, our job is to extract signal from noise
  • Query understanding offers the opportunity to bridge the gap between what the searcher means and what the machine understands
  • Search is the problem at the heart of the information economy
  • Where things get interesting is in the details
  • Our goal is to fail fast. Most crazy ideas are just that: crazy
  • It’s easy to be lazy and look at aggregates. Drilling down into the differences and looking at specific examples is often what gives us a real understanding of what’s going on
  • One thing we’ve learned is that there’s no such thing as over-communicating
  • Technology is like exercise equipment in that buying the fanciest equipment won’t get you in shape unless you take advantage of it
  • Always put talent before technology
  • Data scientists need to have strong critical-thinking skills and a healthy dose of skepticism
  • Failure is a great teacher
  • Experience is not only the best teacher, but also perhaps the only teacher
  • Our computers, mobile devices, and web-based services are witnesses to many of our daily decisions

4. John Foreman
Chief Data Scientist at Rocket Science Group
  • You don’t have to know everything, but you should have a general idea
  • E-mail data is powerful, because as a communications channel it generates more revenue per recipient than social channels
  • Twitter is probably the best place to start conversations about data science
  • Talking to users is crucial because they point you in the right direction
  • What we focus on, and this is going to sound goofy for a data scientist, is the happiness of our users
  • Vendors are there to sell you a tool for a problem you may or may not have yet, and they’re very good at convincing you that you need it whether you actually need it or not
  • I find it tough to find and hire the right people
  • Data scientists are kind of like the new Renaissance folks, because data science is inherently multidisciplinary
  • It’s essential for a data science team to hire people who can really speak about the technical things they’ve done in a way that nontechnical people can understand
  • If you’re solving problems appropriately and you can explain yourself well, you’re not going to lose your job