Data Mesh Architecture: Reimagining Data Management

The objective of data mesh is to establish coherence between data coming from different domains across an enterprise. The domains are handled autonomously to eliminate the challenges of data availability and accessibility for cross-functional teams.



Connect the dots vector created by GarryKillian - www.freepik.com

 

Introduction

 

Data is widely viewed as a driver of business innovation. Enterprises are therefore constantly exploring the potential of data to make business processes more intuitive and to bring a hyper-personalized experience to their customers. As data-driven enterprises see success in today’s digital world, investments in the data analytics market are set to reach $103 billion by 2023. As businesses try to extract more valuable and actionable insights from their data, the amount of data and the number of data sources are growing rapidly. It is becoming increasingly complex to continue the data management strategy of integrating data from disparate sources into a centralized location (i.e., a data lake or data warehouse), because centralized data requires analysis by a specialized team.

Data mesh is a new strategy that decentralizes data and gives ownership of it to each business domain, such as sales or customer support.


 

Problems Data Mesh Will Fix

 

Data Mesh addresses the following concerns that appear as glaring gaps in the traditional approach to big data management:

 

Supports Scalability

 

Market experts state that the traditional approach to data management, including data warehouse and data lake models, does not scale well. As the quantity of data increases, so does the complexity of managing it. In a traditional data lake architecture, data coming from varied sources becomes difficult for downstream data consumers to interpret. The consumers must then go back to the data producer to try to understand the data. As more platforms are integrated, the lack of structure and ownership of data becomes a big obstacle in the data lake architecture over time.

The data mesh architecture advocates that each team own the data it creates, handles, and stores. This lets other domains avoid the bottleneck of a single, central, enterprise-wide data warehouse or data lake. Decoupling data from a central store makes enterprise-grade scaling possible: teams serve their own data needs at speed and scale, which brings innovation across the domains quickly.
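To make the idea of domain ownership concrete, here is a minimal Python sketch. It assumes a hypothetical DataProduct record and a hypothetical MeshCatalog class, neither of which comes from any specific data mesh platform. The catalog only indexes products for discovery; each product stays with the domain that owns it.

```python
from dataclasses import dataclass, field


@dataclass
class DataProduct:
    """A dataset that one domain team owns and serves to others (hypothetical model)."""
    name: str
    owner_domain: str
    rows: list = field(default_factory=list)


class MeshCatalog:
    """Shared index used for discovery only; the data stays with the owning domain."""

    def __init__(self):
        self._products = {}

    def register(self, product: DataProduct) -> None:
        # Products are keyed by (domain, name), so two domains can reuse a name.
        key = (product.owner_domain, product.name)
        if key in self._products:
            raise ValueError(f"{key} is already registered")
        self._products[key] = product

    def find(self, owner_domain: str, name: str) -> DataProduct:
        return self._products[(owner_domain, name)]

    def domains(self) -> list:
        return sorted({domain for domain, _ in self._products})


catalog = MeshCatalog()
catalog.register(DataProduct("orders", "sales", [{"order_id": 1, "amount": 120.0}]))
catalog.register(DataProduct("tickets", "customer_support", [{"ticket_id": 7, "status": "open"}]))

orders = catalog.find("sales", "orders")
print(orders.owner_domain, len(orders.rows))
print(catalog.domains())
```

In this sketch a consumer asks the catalog where a product lives and then reads it from the owning domain, so no central team has to load or reshape the data first.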

 

Improves Data Quality

 

A central pipeline gives teams less control over quality as data volumes increase. Because the data mesh architecture allows each team to use its own warehouses and lakes and to create and manage its domain data, teams have more incentive and ownership to ensure the quality of their data products before further distribution. This architecture brings more accountability and collaboration among different teams within an enterprise.
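One way a domain team might enforce quality before distribution is a simple gate in its own pipeline. The sketch below is illustrative only: the function names check_quality and publish, the required-field rule, and the sample rows are all assumptions, and publishing is simulated by returning a summary.

```python
def check_quality(rows, required_fields):
    """Return a list of problems found; an empty list means the batch passes."""
    problems = []
    for index, row in enumerate(rows):
        for field_name in required_fields:
            if row.get(field_name) is None:
                problems.append(f"row {index}: missing {field_name}")
    return problems


def publish(rows, required_fields):
    """Publish only if the owning team's checks pass (publishing is simulated here)."""
    problems = check_quality(rows, required_fields)
    if problems:
        raise ValueError("; ".join(problems))
    return {"status": "published", "row_count": len(rows)}


good_batch = [{"order_id": 1, "amount": 120.0}, {"order_id": 2, "amount": 75.5}]
bad_batch = [{"order_id": 3, "amount": None}]

print(publish(good_batch, ["order_id", "amount"]))
print(check_quality(bad_batch, ["order_id", "amount"]))
```

Because the check runs inside the owning domain, the team that understands the data decides what counts as a valid record, rather than a central team guessing downstream.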

 

Focuses More on Organizational Change

 

The centralized, monolithic data repository also provides an architecture for accessing data across multiple technologies and platforms, but it is more technology-centric, while a data mesh focuses on organizational change. The data mesh ecosystem builds knowledge into domain teams, encouraging them to deliver optimal business value within their own areas of expertise. This type of ecosystem breaks the common myth that data must be centralized to be useful. When data from varied sources is centralized, its original meaning at the source is sometimes altered. The data mesh can resolve this issue, because domain teams treat data as a product and handle their own data pipelines. They can then supply data products themselves and offer them centrally.
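Treating data as a product implies that its meaning travels with it. The following sketch shows one possible way to express that, assuming a hypothetical DataContract that the owning domain publishes alongside its data. The class names, column definitions, and sample rows are invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Column:
    name: str
    dtype: type
    meaning: str  # plain-language definition written by the owning domain


@dataclass(frozen=True)
class DataContract:
    product: str
    owner_domain: str
    columns: tuple

    def describe(self, column_name: str) -> str:
        """Look up what a column means without asking the producer."""
        for column in self.columns:
            if column.name == column_name:
                return column.meaning
        raise KeyError(column_name)

    def conforms(self, row: dict) -> bool:
        """Check that a row has exactly the declared columns and types."""
        expected = {column.name: column.dtype for column in self.columns}
        if set(row) != set(expected):
            return False
        return all(isinstance(row[name], dtype) for name, dtype in expected.items())


orders_contract = DataContract(
    product="orders",
    owner_domain="sales",
    columns=(
        Column("order_id", int, "Unique identifier assigned when the order is placed"),
        Column("amount", float, "Order total in US dollars, including tax"),
    ),
)

print(orders_contract.describe("amount"))
print(orders_contract.conforms({"order_id": 1, "amount": 120.0}))
print(orders_contract.conforms({"order_id": 1, "amount": "120"}))
```

With a contract like this, a consumer in another domain can read the definition of a field directly, which addresses the earlier problem of having to go back to the producer to understand the data.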

 
Let’s look at a few key considerations for enterprises before adopting the data mesh architecture:

 

Size and business requirements

 

As the volumes and types of data grow, data teams become overwhelmed, and businesses get less and less value from their data investments. Hence, data mesh architecture is best suited to larger enterprises with data flowing from multiple sources that are diverse and changeable. Business initiatives should be aligned closely with domain teams to extract valuable insights from the domain-specific data. Such alignment helps domain teams create quality data products that deliver real business value.

 

Data management and required expertise

 

The strategy of each domain handling data also requires enterprise-wide coordination and governance. Modern tools can help enterprises to get started with data management strategies. The selection of tools still requires thorough oversight from experts.

Platforms like Cuelogic, Data Product Platform, IBM, etc. offer services for implementing data mesh architecture. As in traditional data management architecture, these platforms integrate data from all sources, and they turn it into any number of data products to create a data mesh. Data management platforms offer secure distribution of data among any number of domains and provide data quality control, privacy, and access control at any level of federation.

Such platforms are also useful in guiding enterprises in the journey of transitioning to a new way of handling their data and shielding them from the data complexities found in the underlying systems.
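The combination of enterprise-wide governance and domain-level control described in this section can be sketched in a few lines of Python. This is an illustration under stated assumptions, not how any of the platforms named above work: the masked field names, the DOMAIN_READERS table, and the read_product function are all hypothetical.

```python
# Enterprise-wide rule (assumed for illustration): these fields are always masked.
GLOBAL_MASKED_FIELDS = {"email", "phone"}

# Domain-level rule (assumed): each owning domain lists which domains may read its data.
DOMAIN_READERS = {
    "sales": {"sales", "finance"},
    "customer_support": {"customer_support", "sales"},
}


def read_product(owner_domain, requesting_domain, rows):
    """Apply the owning domain's access list first, then the global masking rule."""
    if requesting_domain not in DOMAIN_READERS.get(owner_domain, set()):
        raise PermissionError(f"{requesting_domain} may not read {owner_domain} data")
    return [
        {key: ("***" if key in GLOBAL_MASKED_FIELDS else value) for key, value in row.items()}
        for row in rows
    ]


tickets = [{"ticket_id": 7, "email": "a@example.com", "status": "open"}]
print(read_product("customer_support", "sales", tickets))
```

The split mirrors the federated model: the masking rule is set once for the whole enterprise, while each domain keeps control over who can read its own products.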
 
 

Conclusion

 

Data mesh architecture brings a paradigm shift to existing data management architecture. It provides the ability to handle diverse and large volumes of data, which makes it a better approach compared to other data architectures. The domain-specific structures in the data mesh help generate valuable insights from mounting volumes of data. Giving teams ownership of their data translates into greater scope for data experimentation and innovation. Netflix embraced data mesh architecture to integrate and manage data across hundreds of different data stores. Refer to this YouTube video to understand Netflix Data Mesh. This new way of creating a network of domain teams that own their data and handle it as a product helps enterprises get more value out of their data operations.

 
 
Yash Mehta is an IoT and Big Data enthusiast who has contributed many articles to publications such as IDG, IEEE, and Entrepreneur. He co-developed platforms like Getlua, which lets users easily merge multiple files together. He also founded a research platform that generates actionable insights from experts.