Data Quality & Governance

What is Data Lineage?

Data Lineage is The tracking of data's origins, transformations, and movements through your systems.

Definition

Data lineage documents where data came from, how it was transformed, and where it went. It's the audit trail for your data infrastructure. When a number in a dashboard looks wrong, lineage helps you trace it back to the source. Lineage tools visualize data flows and flag when changes might break downstream dependencies.

Why It Matters

Without lineage, debugging data issues is detective work. A bad number in a report could come from any of dozens of sources or transformations. Lineage makes the data supply chain visible. It's especially important for regulated industries where you need to prove where data came from.

Example

A finance team questions a revenue number in a dashboard. Data lineage shows the number originated in Stripe, was transformed by a dbt model, and aggregated in Looker. They trace the issue to a currency conversion bug in the dbt model.

Tools for Data Lineage

Find the Right Data Lineage Tool

Not sure which tool fits your needs? Check out our curated recommendations:

Related Terms