The 8 Most Common Data Scientists
Admit it, all you wannabe, newbie, and old-old-school data scientists on the planet: whether you like it or not, you've probably behaved like one of these types. Or two. Or all eight.
By JABDE, Journal of Astrological Big Data Ecology.
For those new to dealing with engineers and data scientists, it's very tough to understand what flavor of statistician you may be dealing with, how to manage them, and how much to trust them. Statistics can be closer to an art than a science. As for the different types of statisticians, they are almost as varied as sexual identities. Many statisticians and analysts can have multiple identities, but they all generally follow the Dunning-Kruger effect: experience and confidence do not have a one-to-one relationship.
Here is a guide to the 8 most common types of data scientists.
1. The Unashamed Frequentist
This type of statistician worships the p-value and is often the most confident in their answers. You can spot these analysts by their lack of predictive work and their overuse of null hypothesis testing. This flavor of analyst will report what happened and not much else. While the unashamed frequentist is easy to understand and works the fastest, they are very boring and prone to accidental, and sometimes intentional, p-hacking.
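For anyone wondering how p-hacking happens by accident, here is a minimal sketch, assuming two groups drawn from the same unit-variance normal distribution and a plain two-sample z-test. The function names and default parameters are made up for illustration. It runs many comparisons on pure noise and counts how many come out "significant".

```python
import math
import random


def two_sided_p_value(sample_a, sample_b):
    """Two-sample z-test p-value, assuming both groups have known unit variance."""
    n_a, n_b = len(sample_a), len(sample_b)
    mean_diff = sum(sample_a) / n_a - sum(sample_b) / n_b
    z = mean_diff / math.sqrt(1.0 / n_a + 1.0 / n_b)
    # Two-sided tail probability of a standard normal distribution.
    return math.erfc(abs(z) / math.sqrt(2.0))


def count_false_positives(n_tests=2000, n_samples=30, alpha=0.05, seed=0):
    """Run n_tests comparisons where no real effect exists and count 'discoveries'."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_tests):
        # Both groups come from the same distribution, so the null is true.
        group_a = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
        group_b = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
        if two_sided_p_value(group_a, group_b) < alpha:
            hits += 1
    return hits


if __name__ == "__main__":
    total = 2000
    hits = count_false_positives(n_tests=total)
    print(f"{hits} of {total} null comparisons had p < 0.05")
```

Roughly one comparison in twenty should clear the 0.05 bar even though nothing is going on, which is why testing enough hypotheses and reporting only the winners tends to produce a "finding".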
2. The Data Bro
Ready for the new hotness of the most popular statistical library? The data bro typically does not have a heavy math background, but they know how to code by Google. The easiest way to spot a data bro is by their flashy presentations, fancy visualizations, and blind use of open-source statistics libraries. The data bro is fantastic at presentations and, under the right circumstances, can quickly turn around a beautifully intuitive data product. Unfortunately, the data bro is the most susceptible to using faulty online libraries developed by non-statisticians, or to misapplying statistical tests because the graph just looked too cool.
3. The Novice
When a statistician is new to the game, you can tell by their obsession with doing things by the book and an unwillingness to take guesses. The novice will spend a week researching the best performance metric to determine if their analysis is working. With no experience or intuition, their days are spent trying multiple new methods on one data set and writing a thesis until all possible options have been exhausted. They won’t be novices for long but will never be sure of themselves until they have more experience.
4. The Reacher
Ever want to design a neural network for a simple regression problem? The Reacher has. The Reacher will use the most complicated method possible for the sake of trying something cool. The Reacher is a high-risk, medium-to-low-reward analyst who will take way too long to create a simple data product, providing analysis that nobody asked for and that only a niche audience will appreciate or understand.
5. The One-Trick Pony
Some tools are so great that, with a little bit of creativity, you can solve just about any problem you may encounter. However, just because anything can be solved with a nonlinear algorithm doesn't mean that it should be. The one-trick pony has learned one of these adaptive methods very well. Despite having a complete understanding of the method, the one-trick pony may forget to check assumptions and misapply their favorite method. A screwdriver may not be meant to hammer a nail, but it can with enough force.
6. The Philosopher
If you ever ask a statistician what one of their metrics or results means only to receive a sermon on the different ways of interpreting results, you’re dealing with a philosopher. For the philosopher, everything is a figment of their imagination, and nothing is real. You can debate the meaning of one word with the philosopher for an hour only to discover that they don’t believe in words or that the word meant something else 200 years ago, and that’s the definition they use. None of their models are correct, but some of them are ‘useful’ as long as you understand the technical risk. The philosopher is similar to the novice in that they will never feel like they have confidently answered the question.
7. The Fake Bayesian
If you ever hear an analyst mention that the difference between Frequentists and Bayesians is that Bayesians win in Vegas, but you've never heard them talk about priors, you're talking to a fake Bayesian. Nate Silver is their god. These analysts will religiously read FiveThirtyEight and assume that their methods are inherently better if they just use Bayes' theorem, without realizing that nearly every statistical method employs Bayes' theorem. Being a fake Bayesian is a sign of a small amount of experience, but this type is more prone to overconfidence.
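Since priors are the part the fake Bayesian never mentions, here is a minimal sketch of Bayes' theorem for a yes/no hypothesis. The function name and every probability below are invented for illustration. The point is only that the same evidence yields very different conclusions depending on the prior.

```python
def posterior(prior, p_evidence_if_true, p_evidence_if_false):
    """Bayes' theorem for a binary hypothesis: returns P(hypothesis | evidence)."""
    numerator = p_evidence_if_true * prior
    # Total probability of seeing the evidence under either outcome.
    evidence = numerator + p_evidence_if_false * (1.0 - prior)
    return numerator / evidence


if __name__ == "__main__":
    # Same evidence, two different priors (all numbers are made up).
    skeptical = posterior(prior=0.01, p_evidence_if_true=0.9, p_evidence_if_false=0.05)
    coin_flip = posterior(prior=0.50, p_evidence_if_true=0.9, p_evidence_if_false=0.05)
    print(f"skeptical prior: {skeptical:.3f}")
    print(f"coin-flip prior: {coin_flip:.3f}")
```

With these made-up numbers, the skeptical prior leaves the hypothesis at about 0.15 while the coin-flip prior pushes it to about 0.95, so an analyst who never states a prior has left out most of the argument.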
8. The True Bayesian
I don’t know any of these types, so I’m going to have to get back to you. I assume they’re pretty cool people.
Original. Reposted with permission.