Six Tips on Building a Data Science Team at a Small Company

When a company decides to start leveraging its data for the first time, the task can be daunting. Many businesses aren’t fully aware of everything that goes into building a data science department. If you're the data scientist hired to make this happen, we have some tips to help you face the task head-on.


By Zoe Zbar, Marketing Fellow at NYCDSA & Raul Vallejo, Director of Credit at ION

Tip #1: Break down the most important deliverables in the company

Being the only data scientist at a company is tricky. You may be expected to be the expert on everything related to data or code. A good starting point is to break down the most important deliverables in the company. Deconstruct each one until you can lay out the data sources and processing steps it depends on. That breakdown shows you exactly what needs to get done for the company.
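As a rough illustration of that breakdown, the sketch below records a few deliverables along with their data sources and processing steps, then counts how many deliverables rely on each source. The deliverable names, source names, and the `rank_sources` helper are all hypothetical, made up for this example; your own list will look different.

```python
from collections import Counter

# Hypothetical deliverables, each broken down into the data sources
# it needs and the processing steps required to produce it.
deliverables = {
    "monthly revenue report": {
        "sources": ["billing_db", "crm"],
        "steps": ["extract", "deduplicate", "aggregate by month"],
    },
    "customer churn dashboard": {
        "sources": ["crm", "web_logs"],
        "steps": ["extract", "join on customer id", "compute churn rate"],
    },
    "credit risk summary": {
        "sources": ["billing_db", "crm", "credit_bureau_feed"],
        "steps": ["extract", "validate", "score"],
    },
}


def rank_sources(breakdown):
    """Count how many deliverables depend on each data source."""
    counts = Counter(
        source
        for item in breakdown.values()
        for source in item["sources"]
    )
    # Most widely used sources first.
    return counts.most_common()


source_ranking = rank_sources(deliverables)
for source, used_by in source_ranking:
    print(f"{source}: needed by {used_by} deliverable(s)")
```

A source that many deliverables share is a reasonable candidate to make reliable first, although the real priority also depends on how much each deliverable matters to the business.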

Tip #2: Utilize project planning practices

Staying organized is one of the most important aspects of building a successful team, but you don’t have to reinvent the wheel. There are many project planning practices that can help provide structure to your data processes. For example, The Data Science Hierarchy of Needs is a great resource for staying on track and organized during this planning process.

It would be nice to deliver AI solutions for your company right away, but there are many foundations that need to be set in place before that can realistically happen. The Data Science Hierarchy of Needs, and other project planning tools like it, can help you structure a sound, sustainable path to your company’s data science goals.

Tip #3: Report wins along the way

As the first data scientist, you can realistically expect that your non-technical colleagues will not understand your work and all the effort that goes into it. It will therefore be on you to report wins along the way towards deploying your first data model. Doing so keeps your company up to date with your progress and builds trust in your ability to build and deliver.

For example, a reliable data flow will be the cornerstone of the productivity of any data team. It’s a foundational layer of the Data Science Hierarchy of Needs pyramid, and it will empower you to swiftly tackle a variety of problems. The non-technical decision makers at your company will mostly be concerned with the analytical results that you eventually derive from that data flow. Even so, setting up the flow is no small feat, and it is a huge step on the path to getting those results. Take the time to report that step to your team, and help them understand its importance in the larger process.

In doing so, you’ll prove to your team that you can consistently make progress towards your goals.

Tip #4: Utilize data visualization methods

Data visualization is often overlooked, but it will prove to be one of the most important tools in your data science toolkit. Good data visualization is all about practice.

A useful exercise before a meeting with stakeholders is to plot something and ask yourself the questions your audience is likely to raise. Then adjust the plot and ask yourself those questions again to see how well the graph answers them.

This seems simple and straightforward but it is often a forgotten step. It is important in the preparation process and your bosses will be impressed when you have solid answers to all their questions.

Think of data visualization as a tool to communicate the value of your work. It makes a huge difference for non-technical people to understand all you’re trying to convey. Ultimately, it is key for communicating and selling your work outside of your team.

Tip #5: Start your machine learning with a stupid model

When it comes to machine learning, which may not be the priority at the beginning, always start with a naive model. By “naive model” we mean a simple model, just to get something that works end-to-end.

From there, you can work on tuning and improving, which will be a much easier process once you already have something that works.
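As one way to picture a naive starting point, here is a minimal sketch of a majority-class baseline: it always predicts whichever label was most common in the training data. The toy churn data and the function names are assumptions made for this example, not something from the webinar. The point is only that the whole pipeline (fit, predict, evaluate) runs end-to-end before any tuning.

```python
from collections import Counter


def fit_majority_baseline(labels):
    """Return the most common label; that label is the entire 'model'."""
    return Counter(labels).most_common(1)[0][0]


def predict(model_label, rows):
    """Predict the same label for every row, ignoring the features."""
    return [model_label for _ in rows]


def accuracy(predicted, actual):
    """Fraction of predictions that match the true labels."""
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)


# Hypothetical toy data: (monthly_visits, outcome) pairs.
train = [(3, "stay"), (1, "churn"), (8, "stay"),
         (5, "stay"), (0, "churn"), (7, "stay")]
test = [(2, "churn"), (6, "stay"), (9, "stay"), (4, "stay")]

train_labels = [label for _, label in train]
test_rows = [visits for visits, _ in test]
test_labels = [label for _, label in test]

baseline = fit_majority_baseline(train_labels)
baseline_accuracy = accuracy(predict(baseline, test_rows), test_labels)
print(f"baseline predicts '{baseline}', accuracy = {baseline_accuracy:.2f}")
```

Any later model can then be judged against this number: if a more elaborate approach cannot beat the baseline, the extra complexity is not yet paying for itself.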

You will find that most solutions can be deployed once 80-90% of the problem has been solved. Whether to spend time and resources chasing the last 10% is a management problem rather than a data science problem.

Tip #6: Manage expectations like a magician

Many people think data science is like magic and you are the magician. It’s important for you to manage these high expectations so that you can deliver on time, and avoid drowning in your workload and dragging out deadlines.

You can manage expectations by planning ahead, staying on task, and always having the end goal in mind. Doing this and staying organized will make sure that your higher-ups are always impressed with the work you’re doing.

Starting a data science department is a big task, but as large as it is, it is also rewarding and fulfilling.

In a recent webinar, NYCDSA Bootcamp alumnus Raul Vallejo expands on how he built a data science department at a small company. He gives insightful advice from first-hand experience and answers audience questions.

Zoe Zbar is a Marketing Fellow at NYCDSA.

Raul Vallejo is Director of Credit at ION.

Original. Reposted with permission.
