lineage table and procedures
lineage table
Section titled “lineage table”The core lineage data table that consolidates and maintains data lineage relationships.
Description
Section titled “Description”An alpha version of the core lineage data table for Masthead’s customers. This table consolidates and maintains data lineage relationships by joining edge and node information from the Marcia lineage system, providing a clean interface for tracking data flow connections, target types, and last update timestamps.
Schema
Section titled “Schema”| Column name | Data type | Value |
|---|---|---|
source | STRING | Name of the source data object |
target | STRING | Name of the target data object |
target_type | STRING | Type of target data object. For details, see the Target types section below. |
updated_at | TIMESTAMP | Timestamp of the last update for this lineage relationship |
Target types
Section titled “Target types”VIEW- logical or materialized viewTABLE- native or external tableANONYMOUS_TABLE- query run without a destination definedURI- import source or export destinationSERVICE_ACCOUNT- email address of the user who ran the jobPROJECT- Google Cloud project ID
Table usage examples
Section titled “Table usage examples”-- View all lineage relationships for an accountSELECT *FROM masthead-prod.DATASET_NAME.lineageORDER BY updated_at DESC;-- Find all tables that depend on a specific sourceSELECT target, target_type, updated_atFROM masthead-prod.DATASET_NAME.lineageWHERE source = 'project_id.dataset_name.table_name'AND target_type = 'TABLE';-- Get recent lineage changes in the last 7 daysSELECT source, target, target_type, updated_atFROM masthead-prod.DATASET_NAME.lineageWHERE DATE(updated_at) >= DATE_SUB(CURRENT_DATE(), INTERVAL 7 DAY)ORDER BY updated_at DESC;list_lineage procedure
Section titled “list_lineage procedure”A stored procedure for recursive lineage exploration both upstream and downstream from any data object.
Procedure description
Section titled “Procedure description”An alpha version of programmatic lineage exploration for Masthead’s customers. This stored procedure enables recursive traversal of data lineage relationships both upstream and downstream from a given origin reference, providing a comprehensive view of data dependencies and their hierarchical relationships within the account’s data ecosystem.
Procedure signature
Section titled “Procedure signature”CALL `masthead-prod`.DATASET_NAME.list_lineage(origin_ref STRING)Parameters
Section titled “Parameters”origin_ref: ASTRINGrepresenting the reference of the data object to start lineage exploration from
Output schema
Section titled “Output schema”| Column name | Data type | Description |
|---|---|---|
origin | STRING | The reference of the starting point, for example, project_id.dataset_name.table_name |
direction | STRING | UPSTREAM or DOWNSTREAM |
depth | INTEGER | Distance from origin, starting at 1 |
source | STRING | Source object reference |
target | STRING | Target object reference |
target_type | STRING | Type of target object |
updated_at | TIMESTAMP | Last observed relationship timestamp |
Procedure usage examples
Section titled “Procedure usage examples”Basic usage
Section titled “Basic usage”-- Explore lineage for a specific tableCALL `masthead-prod`.DATASET_NAME.list_lineage('project_id.dataset_name.table_name');Filtering results
Section titled “Filtering results”-- Get only upstream dependenciesCALL `masthead-prod`.DATASET_NAME.list_lineage('project_id.dataset_name.table_name');-- Then filter in your application or with a view:-- WHERE direction = 'UPSTREAM'-- Get only direct dependencies (depth 1)CALL `masthead-prod`.DATASET_NAME.list_lineage('project_id.dataset_name.table_name');-- Then filter: WHERE depth = 1-- Get only downstream tablesCALL `masthead-prod`.DATASET_NAME.list_lineage('project_id.dataset_name.table_name');-- Then filter: WHERE direction = 'DOWNSTREAM' AND target_type = 'TABLE'Further analysis
Section titled “Further analysis”-- Find the deepest dependency chainCALL `masthead-prod`.DATASET_NAME.list_lineage('project_id.dataset_name.table_name');-- Then analyze: SELECT MAX(depth) as max_depth FROM results-- Identify critical upstream dependenciesCALL `masthead-prod`.DATASET_NAME.list_lineage('project_id.dataset_name.table_name');-- Focus on: WHERE direction = 'UPSTREAM' AND depth <= 2AI agent skills
Section titled “AI agent skills”To automate the traversal and analysis of data lineage programmatically using AI agents, such as Google Antigravity or Claude Code, you can configure pre-built agent skills. These skills help agents query the lineage table and call the list_lineage stored procedure to determine downstream dependencies and evaluate the blast radius of anomalies.
For details, see the AI agent skills guide.
Limitations
Section titled “Limitations”- Performance: Large lineage graphs may have slower query performance.
- Cycle Detection: The recursive queries include basic cycle prevention but complex cycles may still cause issues.
- Data Freshness: Masthead updates lineage daily.
- Cross-Project Dependencies: External project references may have limited detail.
- Data Retention: Lineage includes relationships only if they changed within the account’s configured lookback window, which defaults to 30 days. This ensures lineage data remains current and relevant.