The Most Important Tool for Data Engineers

And it has nothing to do with Python or SQL

By Leo Godin, Senior Data Engineer

Neon question mark in a concrete room full of graffiti.
Photo by Emily Morter on Unsplash


The best technologists solving the wrong problems are doomed to frustration and failure. Yet we so often see great Python developers and SQL experts producing fantastic technology that adds little value to the business. In some cases, it is far worse: rather than offering debatable value, the solutions drain resources and muddle business processes. As data engineers, it is our responsibility to fully understand the business processes our solutions support.

As senior data engineers, we should understand the business so well that we can recommend efficiencies and enhancements to how work is done. A bold statement, but I’ll die on this sword comfortably and fight anyone who disagrees. Figuratively, of course, since I don’t have a sword and am more of a lover than a fighter. The point is, we need to understand the business, and there is one essential tool that helps us get there.

Before we get to that, read this fantastic quote from Julien Kervizic that succinctly identifies the problem:

“Shaping the data by developing an understanding of the underlying data and the business process going along with it doesn’t seem nearly as important these days as the ability to move data around.”

What he’s saying here is that we are so consumed with moving data from there to here, and with all the cool tools we can use to do it, that we’ve forgotten why we do it in the first place. Data engineers collect raw data from multiple sources and create consumable packages that humans and machines can use effectively. Everything in between is a black box to our consumers. Why is it that we spend most of our time and energy on the black box instead of the consumable packages?

The cynical view would say it is because the black box is the fun part. While that may factor into the equation, I believe many of us just don’t understand business processes well enough to effectively spend time on improving the consumable packages. Let me be clear. It is your job and your responsibility to better understand the business. It’s not easy. In a perfect world, we’d have great documentation to rely on, but… well… you know. This is where we get to the most important tool in our data-engineering toolbox.


What Is This Essential Tool?

Questions. There it is. Questions. Lots of them. Good ones. Bad ones. Embarrassing ones. All THE QUESTIONS! Is that enough emphasis for you? Do you want to go from good to great? Ask questions and fully understand the business processes you support. I can’t emphasize enough how frustrating it is talking to a data engineer who is only concerned with technology, and I am a data engineer. Imagine being a finance analyst, HR lead, or someone in sales. They need consumable packages of data but may not understand the technical jargon. They may have very little understanding of technology beyond the specific tools they use.

Therefore, it is not good enough to just ask questions. Instead, we need to ask the right questions in a language the business understands. Forget about tables, data sources, and primary keys. These things come later and are often determined by even more questions to even more people. Instead, ask about what people do in their day-to-day work. Ask what the business goals are. Ask how work flows through various systems. Ask until you fully understand the business processes the company uses. Then document it.

Write the business documentation. Sure, it is their job to do it, but you are the one who needs it. Create flowcharts including any tools the business uses. Include where people interact with the processes. Then review it with the business and ask more questions. You’ll likely find that no single person understands everything, so you’ll talk to several people and end up unifying the business processes. The documentation you write will become a valuable artifact for the business. Bam! You have just become invaluable to the company. Dare I say, you just became a senior data engineer?


Wrap It Up Already

As data engineers, it is our responsibility to understand the business processes our solutions support. Without fully understanding these processes, we are doomed to frustration and failure. The imperfect world we live in is usually poorly documented, and we data engineers are the ones who need to figure it all out. By asking lots and lots of questions, we better understand the business processes our solutions support, and this enables us to continually improve the impact of our work. So, get to it. Question everything!

To see how these questions lead to business documentation, read about Sue’s adventure at Krispy Krabcakes. She had tons of fun and provided a great primer on business analysis and architecture.

So You Want to be a Data Engineer
You Better Understand Architecture and Business Analysis

Bio: Leo Godin is a Senior Data Engineer at Capital Markets Gateway. He loves talking about data and career development.

Original. Reposted with permission.