Don't call it Data Governance
Idea: What if we stop calling it Data Governance?
Data Governance elicits feelings of boredom and numbness in my brain. Governance rhymes with Compliance. When was the last time you got excited about Compliance? Yeah, I didn’t think so.
What is Data Governance? Let’s start with what it is not. It is not the tech side of things. We got our cloud infrastructure, raw data sources, transformation pipelines, specific data models, and dashboards. We got wrangled data sets for machine learning; we got regression and deep learning models. There’s a lot of SQL, Python, R code moving all the 0s and 1s around. That’s the tech side.
The complement to the tech side is the context. It’s the subject matter, the meaning, the business logic, the why, the how.
Imagine we are trying to calculate the lifetime value (LTV) of a healthcare system patient. I know, I am crazy to apply a standard marketing metric to healthcare; indulge me. We might have the best data scientists east of the Mississippi; they can build models in their sleep. But our brilliant developers have no idea about the ins and outs of healthcare patient revenue. Spoiler alert: it’s loaded with complexity.
They don’t know that some patients' LTV is based on their Medicare Advantage risk-adjusted capitated payments (fee-for-performance). For those patients, we get revenue based on membership, not on services provided. Then other patients just come in when they need a flu shot. The healthcare system gets paid every time they visit (fee-for-service). And then there are denials and write-offs to factor in; the healthcare revenue cycle is a beast.
To get to our patients’ LTV, we need to understand all these subtleties and carefully define the metric calculation for different patient tranches. We need to work together with the people who know the little details inside and out. We need to write it down; we don’t want anyone else starting from scratch (templates, business glossary). We need to check that our business definition matches our code (data validation, data integrity). We need someone on the business side to be our partner; they’ll help us validate, they’ll tell us what’s working and what’s useless, they’ll answer our questions, even the stupid ones (data stewards). We need a way to keep track of all the code, data models, reports, and dashboards that are related to this metric (data lineage, data dictionary).
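To make that concrete, here is a rough Python sketch of what "write the definition down and check it against the code" could look like. Everything in it is made up for illustration: the `GLOSSARY` entries, the `Patient` fields, and the `net_revenue` function are hypothetical, and real healthcare revenue logic is far messier. It also only adds up revenue to date; a true LTV would project forward.

```python
from dataclasses import dataclass, field

# Hypothetical business glossary: one plain-language definition per patient tranche.
GLOSSARY = {
    "capitated": "Revenue = monthly capitated payment x months of membership.",
    "fee_for_service": "Revenue = sum of the payments received for each visit.",
}


@dataclass
class Patient:
    patient_id: str
    payment_model: str             # expected to be a key in GLOSSARY
    monthly_payment: float = 0.0   # capitated patients only
    months_enrolled: int = 0       # capitated patients only
    visit_payments: list = field(default_factory=list)  # fee-for-service only
    denials: float = 0.0           # revenue lost to denied claims
    write_offs: float = 0.0        # revenue written off


def net_revenue(p: Patient) -> float:
    """Revenue to date for one patient, following the glossary definitions."""
    # Validation: refuse to compute a metric for a tranche nobody has defined.
    if p.payment_model not in GLOSSARY:
        raise ValueError(f"No business definition for '{p.payment_model}'")
    if p.payment_model == "capitated":
        gross = p.monthly_payment * p.months_enrolled
    else:
        gross = sum(p.visit_payments)
    return gross - p.denials - p.write_offs


medicare_advantage = Patient("A1", "capitated", monthly_payment=900.0, months_enrolled=12)
flu_shot_visitor = Patient("B2", "fee_for_service", visit_payments=[40.0, 40.0], write_offs=15.0)

print(net_revenue(medicare_advantage))
print(net_revenue(flu_shot_visitor))
```

The point is not the arithmetic. The point is that the glossary and the code sit side by side, so a data steward can read the definition and a developer can check that the code follows it.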
We need to understand, organize, and keep track of the context that sits on top of our technology. This is data governance. But when I describe it as above, it doesn’t sound dull or scary. It’s all the other stuff around your code that makes what we are doing valuable to the business. It’s the fun stuff - it’s where the impact happens.
So what if we stopped calling it Data Governance and started calling it Data Context instead? My eyes would glaze over less.
#datacontext by #datapavel