Lineage
lineage table
The core lineage data table that consolidates and maintains data lineage relationships.
Description
This is an alpha version of the core lineage data table for Masthead customers. It joins edge and node information from the Marcia lineage system and exposes data flow connections, target types, and last-update timestamps.
Schema
source (STRING): Name of the source data object
target (STRING): Name of the target data object
updated_at (TIMESTAMP): Timestamp of the last update for this lineage relationship
Target types
VIEW - logical or materialized view
TABLE - native or external table
ANONYMOUS_TABLE - query run without a destination defined
URI - import source or export destination
SERVICE_ACCOUNT - email address of the user who ran the job
PROJECT - Google Cloud project ID
Table Usage Examples
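As a minimal sketch of the row shape described in the schema, the Python snippet below models a few lineage rows in memory and looks up the direct targets of one source. The sample rows, the `LineageRow` type, and the `direct_targets` helper are hypothetical and only illustrate the data; they are not part of Masthead's interface.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class LineageRow:
    """One lineage relationship, mirroring the table's three columns."""
    source: str
    target: str
    updated_at: datetime


# Hypothetical sample rows; real object names will differ.
ROWS = [
    LineageRow("proj.raw.orders", "proj.staging.orders",
               datetime(2024, 6, 1, tzinfo=timezone.utc)),
    LineageRow("proj.raw.orders", "proj.marts.revenue",
               datetime(2024, 6, 3, tzinfo=timezone.utc)),
    LineageRow("proj.raw.customers", "proj.marts.revenue",
               datetime(2024, 6, 2, tzinfo=timezone.utc)),
]


def direct_targets(rows, source):
    """Return the targets fed directly by `source`, most recently updated first."""
    matches = [row for row in rows if row.source == source]
    matches.sort(key=lambda row: row.updated_at, reverse=True)
    return [row.target for row in matches]


print(direct_targets(ROWS, "proj.raw.orders"))
```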
list_lineage procedure
A stored procedure for recursive lineage exploration both upstream and downstream from any data object.
Procedure Description
This is an alpha version of programmatic lineage exploration for Masthead customers. The stored procedure recursively traverses lineage relationships both upstream and downstream from a given origin reference and returns the data dependencies it finds, along with their distance from the origin.
Procedure Signature
Parameters
origin_ref (STRING): The reference of the data object to start lineage exploration from
Output Schema
origin (STRING): The reference of the starting point, e.g. project_id.dataset_name.table_name
direction (STRING): UPSTREAM or DOWNSTREAM
depth (INTEGER): Distance from origin (1, 2, 3, etc.)
source (STRING): Source object reference
target (STRING): Target object reference
target_type (STRING): Type of target object
updated_at (TIMESTAMP): Last observed relationship timestamp
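To make the meaning of `direction` and `depth` concrete, here is a rough Python sketch of a breadth-first traversal over an in-memory edge list. It is not the procedure's actual implementation, which is not shown in this document. The edge data and the `explore` function are hypothetical, and `target_type` and `updated_at` are left out for brevity.

```python
from collections import deque

# Hypothetical (source, target) edges, including one cycle.
EDGES = [
    ("proj.raw.orders", "proj.staging.orders"),
    ("proj.staging.orders", "proj.marts.revenue"),
    ("proj.marts.revenue", "proj.staging.orders"),  # cycle back to staging
    ("proj.raw.customers", "proj.marts.revenue"),
]


def explore(edges, origin_ref):
    """Walk edges both ways from origin_ref, recording depth for each edge found."""
    rows = []
    for direction in ("UPSTREAM", "DOWNSTREAM"):
        visited = {origin_ref}  # simple cycle guard: expand each node once
        queue = deque([(origin_ref, 0)])
        while queue:
            node, depth = queue.popleft()
            for source, target in edges:
                # Downstream follows source -> target; upstream follows it backwards.
                if direction == "DOWNSTREAM":
                    here, neighbor = source, target
                else:
                    here, neighbor = target, source
                if here != node:
                    continue
                rows.append({
                    "origin": origin_ref,
                    "direction": direction,
                    "depth": depth + 1,
                    "source": source,
                    "target": target,
                })
                if neighbor not in visited:
                    visited.add(neighbor)
                    queue.append((neighbor, depth + 1))
    return rows


for row in explore(EDGES, "proj.staging.orders"):
    print(row["direction"], row["depth"], row["source"], "->", row["target"])
```

The `visited` set is one simple way to keep the walk from looping on cyclic lineage: each object is expanded at most once per direction, so every edge is reported at most once per direction.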
Procedure Usage Examples
Basic Usage
Filtering Results
Further Analysis
Limitations
Performance: Large lineage graphs may have slower query performance.
Cycle Detection: The recursive queries include basic cycle prevention, but complex cycles may still cause issues.
Data Freshness: Lineage is updated daily.
Cross-Project Dependencies: External project references may have limited detail.
Data Retention: A relationship is included in the lineage data only if it was updated within the account's configured lookback window (30 days by default). This keeps the lineage data current and relevant.
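The lookback rule in the Data Retention item can be pictured with a small Python sketch. The `within_lookback` helper is hypothetical, and treating the window boundary as inclusive is an assumption; the document only states the 30-day default.

```python
from datetime import datetime, timedelta, timezone


def within_lookback(updated_at, now, lookback_days=30):
    """Return True if updated_at falls inside the lookback window ending at `now`.

    Assumption: the boundary is inclusive; the actual rule may differ.
    """
    return updated_at >= now - timedelta(days=lookback_days)


now = datetime(2024, 6, 30, tzinfo=timezone.utc)
recent = datetime(2024, 6, 10, tzinfo=timezone.utc)  # 20 days before `now`
stale = datetime(2024, 5, 1, tzinfo=timezone.utc)    # 60 days before `now`

print(within_lookback(recent, now))
print(within_lookback(stale, now))
```

Under this reading, a relationship that has not been observed for longer than the window drops out of the lineage data until it is updated again.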