How Data Mesh Platforms Connect Data Producers and Consumers

A challenge that companies often face when exploiting their data in data warehouses or data lakes is that ownership of analytical data is weak or non-existent, and quality can suffer as a result. A data mesh is an organizational paradigm shift in how companies create value from data: responsibility for the data goes back into the hands of its producers and consumers.

Matthias Patzak gave a talk about data mesh platforms at FlowCon France.

One of the biggest challenges that companies face when they want to exploit their data and become data-driven is the quality of the data they collect, as Patzak explained:

Have you ever heard the phrase "data is the new oil"? In the late 2000s, it was argued that all data should be stored because it is a valuable resource. But who trusts a 5-year-old S3 bucket when you don’t really know who stored what data and why?

Patzak argued that data is more like wine. Some data, like wine, must be consumed quickly or it will go bad. Other data, if stored and handled properly, can age very well and even increase in value and quality with age, he said.

The fundamental problem, Patzak mentioned, is that ownership of analytical data is often weak or non-existent, and quality suffers as a result. Analytical data is generated by transactional systems. However, the people who know and own these systems and the underlying processes are not responsible for the analytical application of their data, Patzak said. It is typically extracted, transformed, loaded into data warehouses or data lakes, and made available for use by a centralized, highly specialized department. These specialists often don’t have a real sense of ownership, either, he added.

A data mesh is a distributed data infrastructure that puts the responsibility for using and creating value from data back in the hands of the producers and consumers of that data, Patzak said. It eliminates the specialized data organization as a proxy and bottleneck in the communication between producers and consumers. At the heart of this distributed data infrastructure are data products that create tangible business value in their own right.

To build a data mesh, you’d create a domain-oriented architecture in which each business unit manages its data as a product, using a self-service infrastructure and tools for cataloging, sharing, and governance, as Patzak explained:

This self-service infrastructure is built by a data mesh platform and includes cloud services, data orchestration tools, and CI/CD pipelines, supported by federated governance policies for security and quality, and observability systems for monitoring.

Patzak mentioned that access is controlled by robust security mechanisms, and the entire data infrastructure is automated and maintained through Infrastructure as Code practices. Crucially, domain teams are equipped with the necessary skills through targeted enablement and training programs provided by the platform teams, ensuring that the technical setup facilitates a culture of autonomy, quality and collaboration, he added.
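As a rough illustration of how a self-service catalog and federated governance could fit together, here is a minimal Python sketch. Everything in it is an assumption made for illustration and does not come from the talk: the `DataProduct` fields, the policy rules in `governance_violations`, and the in-memory `Catalog` are hypothetical stand-ins for whatever a real platform would provide.

```python
from dataclasses import dataclass, field


@dataclass
class DataProduct:
    """Hypothetical descriptor a domain team publishes for its data."""
    name: str
    domain: str
    owner: str              # accountable producing team; empty means unowned
    schema: dict            # column name -> type name
    max_age_days: int       # how long the data is considered trustworthy
    contains_pii: bool = False
    access_roles: tuple = ()


def governance_violations(product):
    """Return the illustrative company-wide policy rules a product breaks."""
    problems = []
    if not product.owner:
        problems.append("missing owner")
    if not product.schema:
        problems.append("missing schema")
    if product.max_age_days <= 0:
        problems.append("no freshness guarantee")
    if product.contains_pii and not product.access_roles:
        problems.append("PII without access roles")
    return problems


@dataclass
class Catalog:
    """In-memory stand-in for a self-service data product catalog."""
    products: dict = field(default_factory=dict)

    def register(self, product):
        # Producers self-register; the shared policies are enforced here.
        problems = governance_violations(product)
        if problems:
            raise ValueError(f"{product.name}: " + ", ".join(problems))
        self.products[product.name] = product

    def find(self, domain):
        # Consumers discover products by domain without a central data team.
        return [p for p in self.products.values() if p.domain == domain]
```

In this sketch the producing team registers its own product and the consuming team looks it up directly, while the platform only enforces the shared rules. A product without a named owner is rejected at registration, which mirrors the ownership problem described above.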

The benefits of a data mesh are faster implementation times and lower cognitive load for producers and consumers, along with consistent tools and standards across the company, Patzak concluded.

InfoQ interviewed Matthias Patzak about creating a data mesh platform.

InfoQ: What’s needed to create a data mesh platform and what benefits does a platform bring?

Matthias Patzak: From a technical point of view, everything is available to build the core services of a data mesh platform. This is just busy work. As with any platform, the challenge lies in ensuring that the platform services are accepted and used by the users. This is achieved by letting the platform users prioritize the platform backlog and by involving developers from the user teams in the development of platform services by means of job rotation.

InfoQ: What’s your advice to organizations that want to exploit their data using a data mesh?

Patzak: Don’t boil the ocean! Start small with a specific use case and a pair of open-minded producers and consumers, and leverage the decentralised approach of data mesh. Even start before you are ready and become ready by starting. Finally, develop the platform in parallel with specific use cases.
