Generalists Dominate Data Science
An interesting insight into why small teams of generalists outperform large teams of specialists.
By Russell Jurney.
Analytics products and systems are best built by small teams of generalists. Large teams of specialists become dominated by communication overhead, and the effect of “Chinese whispers” distorts the flow of tasks and stagnates creativity. Data scientists should develop generalist skills to become more efficient members of a data science team.
Data Science: A Team Sport
Building data products takes a team covering a broad and diverse skillset. From the customer representative at one end, to the operations engineer at the other, the spectrum of roles in a product analytics team looks like this:
Large companies often fill each role with a pair of shoes, resulting in a twelve-person team like the one below. The problem with this setup is that it becomes much harder to reach consensus and to perform any task that spans roles. And in data science, most tasks span roles.
Adding a Chart to a Data Product
To look at a particular example, let's focus on the creation of a chart as part of a data product. To begin, a product manager creates a specification. An interaction designer then mocks up the chart and hands it off to a data scientist to fill with data (and hopefully to explore the data and find a chart worth producing). Next, a back-end engineer sets up an API to serve that data, a front-end web developer builds a web page that uses the data and matches the mock, and an experience designer makes sure the entire thing feels right and makes sense.
Charts take iteration, so this cycle of communication could happen repeatedly for each chart. You can see how communication overhead starts to predominate. A meeting of six people is a full-blown, formal meeting. It is hard to get things done in formal meetings.
In the next figure, we see how a data product team might be composed of four generalists: a data engineer, a data scientist/back-end developer, a designer who can build front ends, and a product manager who can write marketing copy and cut deals. This is how a startup team would span the skill spectrum, and you can probably see how this makes them more efficient.
Revisiting the chart example, creating a chart becomes a collaboration between the product manager, a designer who codes, and a data scientist. This is the kind of ad hoc meeting of 2–3 people where “shit gets done” efficiently. This group will be more efficient than the six-person group. Put another way: this small team will kick the large team’s ass.
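One rough way to put numbers on the communication overhead is to count the possible one-to-one conversations in a team, which grows as n(n-1)/2 for n people. The sketch below is an illustration, not part of the original argument: it assumes overhead tracks the number of pairwise links, and the function name `pairwise_channels` is made up for the example.

```python
def pairwise_channels(team_size: int) -> int:
    """Count the distinct person-to-person links in a team.

    Assumption for illustration only: communication overhead is
    roughly proportional to this count.
    """
    if team_size < 0:
        raise ValueError("team_size must be non-negative")
    return team_size * (team_size - 1) // 2


# Team sizes taken from the article's examples.
teams = [
    ("twelve specialists", 12),
    ("four generalists", 4),
    ("six-person chart meeting", 6),
    ("three-person chart meeting", 3),
]

for label, size in teams:
    print(f"{label}: {pairwise_channels(size)} possible links")
```

Under that assumption, the twelve-person team has 66 possible links against 6 for the four generalists, and the six-person chart meeting has 15 against 3 for the three-person one.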
In the big company system, sometimes the only way to get anything done efficiently is to go “guerilla generalist” and work with other generalists to cut people out of the chain. This is bad politically, and is part of what drives effective people from big companies.
We’ve seen how small teams of generalists outperform large teams of specialists. In fact, generalist skills are something every data scientist should work to develop. That doesn’t mean you can’t specialize, but you should combine specialization with generalization in order to develop “T-shaped skills.” The T-shaped employee is one who can lend deep expertise across projects while fulfilling multiple roles in their own.
It takes time to develop general skills, and that is why the path to becoming a data scientist is not a six-month bootcamp, but a ten-year journey. Along this path, remember to try to be T-shaped!
Need help building an analytics product or platform? The Data Syndrome team of data scientists and data engineers is available to build your data products and systems as a service. We also offer training in Agile Data Science for all members of data science teams.
Bio: Russell Jurney is a Principal Consultant at Data Syndrome, a full stack data product hacker, and data science team leader.
Original. Reposted with permission.