From Geometry to Governance: What Mensuration Taught Me About Data Mesh
"Where Shapes Meet Systems, Data Finds Flow."
TL;DR
Mensuration, the branch of geometry concerned with measuring lengths, areas, and volumes, provides a surprisingly fitting framework for understanding modern data systems like Data Mesh. Classic textbook problems involving bees, tanks, and flow mirror the key challenges in data governance: observability, scale, and compliance. Structuring data effectively isn’t just a technical hurdle; it’s a mindset that shapes everything from domain ownership to quality control.
What is Mensuration?
Remember those high school geometry problems about how long it takes to fill a tank, or how far a bee travels to pollinate flowers? That was the world of mensuration, the branch of mathematics concerned with measuring lengths, areas, and volumes, usually taught alongside rate and flow problems. While this may seem distant from the world of data, these principles offer valuable insights into modern data systems.
So, what if I told you that the same logic used in spatial measurements could be applied to data ecosystems? It might sound strange, but the metaphors work remarkably well.
Bees and Flowers: Mapping Data Domains
Let’s start with a classic problem: “How many bees are needed to pollinate an entire garden without overlap or gaps?”
Sounds familiar, right? This is a metaphor for Data Mesh.
Bees = Data Products
Flowers = Data Domains
Garden = Your enterprise data ecosystem
Here’s why this works. In a Data Mesh, each data product (bee) should be responsible for pollinating its own domain (flowers). With too many bees (redundant or poorly managed data products), duplication and confusion take over. With too few bees (missing data products, often papered over by shadow systems), gaps emerge and essential insights never get gathered.
The key to solving this challenge lies in defining clear domain ownership and establishing data product governance. This ensures each data product (bee) has its designated space (domain), and there’s clarity in the ecosystem.
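The "no overlap, no gaps" idea can be checked mechanically once ownership is written down. Here is a minimal sketch, assuming a plain dictionary that maps each data product to the domain it serves. The function name, the domain names, and the product names are all hypothetical. Overlaps are flagged for review rather than treated as errors, since a domain can legitimately own several products.

```python
from collections import defaultdict


def audit_coverage(domains, product_domains):
    """Find domains with no data product (gaps) or several (overlaps).

    domains: list of known domain names (the flowers).
    product_domains: dict of data product name (the bee) -> domain it serves.
    """
    known = set(domains)
    products_by_domain = defaultdict(list)
    for product, domain in product_domains.items():
        products_by_domain[domain].append(product)

    # Domains that no product covers.
    gaps = sorted(d for d in known if d not in products_by_domain)
    # Domains served by more than one product: candidates for review.
    overlaps = {
        d: sorted(p) for d, p in products_by_domain.items() if len(p) > 1
    }
    # Products pointing at a domain nobody has declared.
    unknown = sorted(d for d in products_by_domain if d not in known)
    return {"gaps": gaps, "overlaps": overlaps, "unknown_domains": unknown}


# Hypothetical example data.
domains = ["sales", "billing", "logistics"]
product_domains = {
    "orders_daily": "sales",
    "orders_v2": "sales",
    "invoices": "billing",
    "legacy_extract": "marketing",
}
report = audit_coverage(domains, product_domains)
print(report)
```

In this made-up garden, "logistics" has no bee, "sales" has two, and one bee is visiting a flower bed ("marketing") that nobody has declared.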
Within this metaphor, we can break down a few key concepts:
Lineage: Where did this data come from? Just as bees follow a clear path to flowers, data needs traceability.
Classification: What type of data is it? Is it personal, financial, or sensitive? Think of this as identifying the flowers that bees are pollinating.
Regional Compliance: Think of the garden being divided into regions, each with specific compliance regulations (like GDPR in the EU). Are you ensuring regional governance for data products?
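These three concepts can travel with the data product as metadata. The sketch below is only an illustration: the `DataProduct` fields, the `CLEARED` policy table, and the example products are assumptions, and real regional rules such as GDPR are far more nuanced than a lookup table.

```python
from dataclasses import dataclass, field

# Hypothetical policy table (an assumption for illustration): classifications
# each region accepts without further review.
CLEARED = {
    "EU": {"public", "financial"},
    "US": {"public", "financial", "personal"},
}


@dataclass
class DataProduct:
    name: str
    classification: str  # e.g. "personal", "financial", "public"
    region: str          # where the data is stored or served
    upstream: list = field(default_factory=list)  # lineage: direct sources


def review_reasons(product):
    """Return the reasons a product should be flagged for governance review."""
    reasons = []
    if not product.upstream:
        reasons.append("no lineage recorded")
    if product.classification not in CLEARED.get(product.region, set()):
        reasons.append(
            f"{product.classification} data needs review in {product.region}"
        )
    return reasons


# Hypothetical products.
profiles = DataProduct("customer_profiles", "personal", "EU")
ledger = DataProduct("ledger", "financial", "EU", upstream=["erp_export"])
print(review_reasons(profiles))
print(review_reasons(ledger))
```

The first product is flagged twice (no recorded lineage, and a classification its region has not cleared), while the second passes both checks.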
Leaky Tanks: Observability and Data Quality
Now, let’s consider the classic flow problem: “A tank is being filled at a rate of X liters per minute, but there’s a leak. How long will it take to fill the tank?”
This is where data observability comes in.
Water = Data
Inflow = Data from producers
Leak = Quality issues, broken lineage, security gaps
Tank = The data product or system
Just as a leaking tank wastes water and takes longer to fill, unmonitored data pipelines let quality drain away, introducing inconsistencies and inaccuracies. Observability in Data Mesh acts as the alert system, warning you about these “leaks” (errors, missing data, security holes) before they become big problems.
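For reference, the tank problem itself reduces to one division: the tank fills at the net rate (inflow minus leak), so the fill time is capacity divided by that net rate. Here is a small sketch, assuming constant rates and a tank that starts empty; the function name and the numbers are illustrative.

```python
def minutes_to_fill(capacity_l, inflow_l_per_min, leak_l_per_min):
    """Minutes to fill an empty tank, or None if it can never fill."""
    net_rate = inflow_l_per_min - leak_l_per_min
    if net_rate <= 0:
        # The leak matches or outpaces the inflow: the tank never fills.
        return None
    return capacity_l / net_rate


print(minutes_to_fill(1000, 25, 5))   # 1000 / (25 - 5) = 50.0
print(minutes_to_fill(1000, 25, 0))   # no leak: 40.0
print(minutes_to_fill(1000, 5, 5))    # leak equals inflow: None
```

The data analogy follows directly: a 5 L/min leak against a 25 L/min inflow turns a 40-minute job into a 50-minute one, and once the leak matches the inflow, the tank never fills no matter how long you wait.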
Here’s how Data Mesh addresses this:
Data Observability: Monitor the flow of data through pipelines, with alerts for issues like metric drift or pipeline failures.
Compliance Scorecards: How healthy is your data product? Like checking the integrity of a tank, data products need regular audits for quality and compliance.
Security and Classification: Are there “leaks” exposing sensitive data? Just like you wouldn’t want a tank to be accessible to unauthorized users, security gaps must be addressed.
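As one concrete example of a leak detector, here is a minimal metric drift check. It is a sketch, not a recommendation of any particular tool: it assumes you already collect a numeric metric per pipeline run (daily row counts in this made-up example) and flags a run that sits more than a chosen number of standard deviations from the historical mean. The function name and the threshold of 3 are assumptions.

```python
from statistics import mean, pstdev


def drift_alert(history, latest, threshold=3.0):
    """Return True if `latest` deviates sharply from past values.

    history: past values of a metric, e.g. daily row counts.
    latest: the newest observation.
    threshold: allowed distance from the mean, in standard deviations.
    """
    mu = mean(history)
    sigma = pstdev(history)
    if sigma == 0:
        # A perfectly flat history: any change at all counts as drift.
        return latest != mu
    return abs(latest - mu) / sigma > threshold


# Hypothetical row counts from the last five runs.
row_counts = [100, 102, 98, 101, 99]
print(drift_alert(row_counts, 100))  # a typical day
print(drift_alert(row_counts, 60))   # a sudden drop, likely a leak upstream
```

A check like this will not tell you where the leak is, only that the water level looks wrong, which is usually the prompt to go and inspect lineage.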
Roles in the Mesh: Who’s Doing What?
Just as every quantity in a geometry problem plays a defined part, every participant in a Data Mesh has a clear role.
Data Producers ensure data is accurate, timely, and classified correctly, like making sure the tank is filled with the right type of water.
Data Consumers rely on clear, trustworthy data with accessible lineage, like bees needing clear paths to flowers.
Data Product Owners are the stewards of data products: they enforce governance, monitor usage, optimize design, and ensure proper coverage, much like making sure the garden is pollinated by the right number of bees.
The success of Data Mesh depends on how well these roles synchronize. When everyone’s aligned, like the precise processes in mensuration, the system thrives.
Closing the Loop: If You Can Measure It…
So, what lessons from mensuration apply to Data Mesh?
Define Boundaries: Knowing where your data domain begins and ends is crucial, much like defining the limits of a shape.
Track Flow: Understanding how data moves through the system and monitoring its quality is essential, just like knowing how water flows through a tank.
Plan for Capacity: Mensuration teaches us to account for capacity, whether that is the volume of a tank or the load your data system can handle.
Data Mesh applies this same logic across people, pipelines, and policies. The more you measure and govern, the more scalable and trustworthy your data system becomes. If you can measure it, you can govern it. If you can observe it, you can trust it. If you can map it, you can scale it.
Next time you review your data strategy, ask yourself: are we approaching this with a calculated mindset, or are we letting it become mayhem?