The ‘WHY’ of Data Lineage – The MRI of Trusted Data.

This week, I'd like to expand our conversation on the 'Why' of some of the Data Governance activities, adding some color on how they all fit together to realize the common purpose of creating a trusted, governed data environment. Our focus this week is the why of Data Lineage.

The moment an organization kicks off its Data Governance adoption, you can be sure it will start having conversations around a few of the governance activities, such as Ownership, Metadata (Data Dictionaries/Business Glossary), Lineage, Data Quality Rules Monitoring, and Issue Management.

However, I've come to realize that most organizations kick off these activities without knowing the why, or the purpose each one serves in realizing their desired governance success. The race to start building business glossaries and carving out Lineage like everyone else often deprives them of the opportunity to engage these activities properly to fuel and optimize their data governance adoption. The sad truth is that many simply have no idea why these activities are important to the realization of a trusted data environment.

Could this be one of the reasons why the business case and ROI narratives for Data Governance fail to get senior executives' buy-in in most organizations? Could this be why a lot of organizations would rather keep investing in data cleansing and fixing projects to realize their ambitions in AI/ML and other analytics innovations?

The reality is that a lot of organizations simply can't articulate the why of their Data Governance activities because no one has painted, in a compelling way, the true picture of how the different activities are engaged to realize the organization's strategic goals.

I think our narratives on Data Governance over the years have done little justice to the discipline. Hence the widespread misunderstanding that Data Governance is all about creating Lineage and building Data Dictionaries and Glossaries. This, of course, has created a good market for product developers who build automated lineage tools and metadata hubs and simply label them 'The Data Governance Magic Tool'.

This has been quite misleading over the years, and we need to start changing the narrative on why we do what we do in our journey towards a governed data environment.

Simply put, Data Governance is about cultural transformation to create trust in our data asset. Every governance activity has a purpose in helping realize the needed trust in our data, and those purposes need to be clearly articulated to our stakeholders as we go through the journey.

To this effect, I'd like to open up the conversation around Data Lineage.

What is Data Lineage?

Data Lineage is simply a graphical representation of the journey data makes through its value chain. It is the 'GPS' of how data travels from creation to consumption, highlighting the stops and engagement points along the way.

So, what is the 'WHY' of Data Lineage?

I think it makes sense for us to step back and remind ourselves what we're trying to achieve with Data Governance before answering this question.

Simply put, the goal of Data Governance adoption is to address the root cause of the poor-quality data we have in our organizations by instilling the needed 'Standards of Care' to promote good-quality data. It is a cultural transformation of ethics that builds accountability and due diligence among all data citizens (both producers and consumers).

Poor-quality data continues to cost organizations millions of dollars every year, and there seems to be no end in sight to the continuous investment in fixing the same data quality issues over and over again.

Data Governance attempts to instill the needed cultural transformation and ethical discipline in how we create, engage, access, and consume our data through proper accountability and stewardship, thereby preventing and reducing the introduction of poor-quality data into our ecosystem.

To this effect, one of the primary activities to kick off Data Governance is Data Lineage. With Lineage, our primary goal is to understand the footprint of our data asset, and we do this by mapping out how the data flows.

There are two types of Data Lineage often discussed – Technical Lineage and Functional Lineage.

Technical Lineage is captured as part of the technical activity of moving data from one platform to another, typically in the form of ETL, while Functional Lineage is a business view of the flow or journey of data along its value chain.

For our discussion here, I'd like us to focus on Functional Lineage. This is what provides the needed value-add for our business stakeholders.

Here are a few of the reasons we engage functional lineage in our Data Governance adoption:

·  To provide a visual representation of the journey of our data from creation to consumption, thereby providing needed transparency around our data asset.

·  To understand the number of engagement points of the data from its origination to consumption and validate activities along the path of the same data.

·  To provide insight into the impact of each engagement along the same data value chain.

·  To identify points of potential risks and opportunities along the value chain.

·  To recognize the players along the way, i.e., owners, providers, and consumers.

·  To confirm the presence or lack of controls, and any leakages, between each point as the data moves hop to hop along its journey. The more hops data has along its value chain, the greater the tendency for leakages and the introduction of compromised quality.

·  To validate the presence or lack of a formalized handshake between each actor (provider/consumer) along the value chain.

·  To quickly locate causes of data issues needing immediate remediation.
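
To make a few of these reasons concrete, here is a minimal Python sketch of a functional lineage treated as a list of hops. It is only an illustration under stated assumptions: the systems ("CRM", "Staging", "Warehouse", "Revenue Report"), the owners, and the helper functions are all hypothetical and not taken from any lineage tool. It shows how a mapped lineage lets you spot hops with no control, hops with no accountable owner, and everything upstream of a report that has a data issue.

```python
from collections import deque
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Hop:
    """One hop in the value chain: data moving from a provider to a consumer."""
    source: str
    target: str
    owner: Optional[str]   # accountable owner of the hop, if one is assigned
    has_control: bool      # is there a documented control/handshake on this hop?


# Hypothetical value chain, from creation (CRM) to consumption (Revenue Report).
LINEAGE = [
    Hop("CRM", "Staging", owner="Sales Ops", has_control=True),
    Hop("Staging", "Warehouse", owner="Data Engineering", has_control=False),
    Hop("Warehouse", "Revenue Report", owner=None, has_control=True),
]


def uncontrolled_hops(hops: List[Hop]) -> List[Hop]:
    """Hops with no control in place: candidate points of leakage."""
    return [h for h in hops if not h.has_control]


def unowned_hops(hops: List[Hop]) -> List[Hop]:
    """Hops with no accountable owner: candidate stewardship gaps."""
    return [h for h in hops if h.owner is None]


def upstream(hops: List[Hop], node: str) -> List[str]:
    """Every point that feeds `node`, nearest first (breadth-first walk)."""
    parents = {}
    for h in hops:
        parents.setdefault(h.target, []).append(h.source)
    seen, order, queue = {node}, [], deque([node])
    while queue:
        current = queue.popleft()
        for parent in parents.get(current, []):
            if parent not in seen:
                seen.add(parent)
                order.append(parent)
                queue.append(parent)
    return order


if __name__ == "__main__":
    print("Hops without a control:", [(h.source, h.target) for h in uncontrolled_hops(LINEAGE)])
    print("Hops without an owner:", [(h.source, h.target) for h in unowned_hops(LINEAGE)])
    print("Where to look for a Revenue Report issue:", upstream(LINEAGE, "Revenue Report"))
```

In this made-up chain, an issue in the Revenue Report would send you first to the Warehouse, then to Staging, then to the CRM, and the Staging-to-Warehouse hop stands out as the one with no control in place. A real functional lineage would of course carry far more business context than two flags per hop.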

In a nutshell, Data Lineage is simply the MRI that exposes the health of our data. It's the 'blood work' that illuminates areas of potential opportunity for governance and stewardship.

When properly engaged in your Data Governance adoption journey, it gets you closer to realizing the desired value of Data Governance. It positions you for the needed cultural transformation to realize the optimal ROI from your data asset. It provides the visibility you need to build governance around the areas of opportunity in your data.

For more detail and practical help activating effective Data Governance & Stewardship around your data value chain, book a free call with me to discuss your challenges, and we can explore simple strategies to actualize your governance success.

https://calendly.com/lara-gureje/30min

Lara Gureje