Skip to content

lineage table and procedures

The core lineage data table that consolidates and maintains data lineage relationships.

An alpha version of the core lineage data table for Masthead’s customers. This table consolidates and maintains data lineage relationships by joining edge and node information from the Marcia lineage system, providing a clean interface for tracking data flow connections, target types, and last update timestamps.

Column nameData typeValue
sourceSTRINGName of the source data object
targetSTRINGName of the target data object
target_typeSTRINGType of target data object. For details, see the Target types section below.
updated_atTIMESTAMPTimestamp of the last update for this lineage relationship
-- View all lineage relationships for an account
SELECT *
FROM masthead-prod.DATASET_NAME.lineage
ORDER BY updated_at DESC;
-- Find all tables that depend on a specific source
SELECT target, target_type, updated_at
FROM masthead-prod.DATASET_NAME.lineage
WHERE source = 'project_id.dataset_name.table_name'
AND target_type = 'TABLE';
-- Get recent lineage changes in the last 7 days
SELECT source, target, target_type, updated_at
FROM masthead-prod.DATASET_NAME.lineage
WHERE DATE(updated_at) >= DATE_SUB(CURRENT_DATE(), INTERVAL 7 DAY)
ORDER BY updated_at DESC;

A stored procedure for recursive lineage exploration both upstream and downstream from any data object.

An alpha version of programmatic lineage exploration for Masthead’s customers. This stored procedure enables recursive traversal of data lineage relationships both upstream and downstream from a given origin reference, providing a comprehensive view of data dependencies and their hierarchical relationships within the account’s data ecosystem.

CALL `masthead-prod`.DATASET_NAME.list_lineage(origin_ref STRING)
  • origin_ref: A STRING representing the reference of the data object to start lineage exploration from
Column nameData typeDescription
originSTRINGThe reference of the starting point, for example, project_id.dataset_name.table_name
directionSTRINGUPSTREAM or DOWNSTREAM
depthINTEGERDistance from origin, starting at 1
sourceSTRINGSource object reference
targetSTRINGTarget object reference
target_typeSTRINGType of target object
updated_atTIMESTAMPLast observed relationship timestamp
-- Explore lineage for a specific table
CALL `masthead-prod`.DATASET_NAME.list_lineage('project_id.dataset_name.table_name');
-- Get only upstream dependencies
CALL `masthead-prod`.DATASET_NAME.list_lineage('project_id.dataset_name.table_name');
-- Then filter in your application or with a view:
-- WHERE direction = 'UPSTREAM'
-- Get only direct dependencies (depth 1)
CALL `masthead-prod`.DATASET_NAME.list_lineage('project_id.dataset_name.table_name');
-- Then filter: WHERE depth = 1
-- Get only downstream tables
CALL `masthead-prod`.DATASET_NAME.list_lineage('project_id.dataset_name.table_name');
-- Then filter: WHERE direction = 'DOWNSTREAM' AND target_type = 'TABLE'
-- Find the deepest dependency chain
CALL `masthead-prod`.DATASET_NAME.list_lineage('project_id.dataset_name.table_name');
-- Then analyze: SELECT MAX(depth) as max_depth FROM results
-- Identify critical upstream dependencies
CALL `masthead-prod`.DATASET_NAME.list_lineage('project_id.dataset_name.table_name');
-- Focus on: WHERE direction = 'UPSTREAM' AND depth <= 2

To automate the traversal and analysis of data lineage programmatically using AI agents, such as Google Antigravity or Claude Code, you can configure pre-built agent skills. These skills help agents query the lineage table and call the list_lineage stored procedure to determine downstream dependencies and evaluate the blast radius of anomalies.

For details, see the AI agent skills guide.

  • Performance: Large lineage graphs may have slower query performance.
  • Cycle Detection: The recursive queries include basic cycle prevention but complex cycles may still cause issues.
  • Data Freshness: Masthead updates lineage daily.
  • Cross-Project Dependencies: External project references may have limited detail.
  • Data Retention: Lineage includes relationships only if they changed within the account’s configured lookback window, which defaults to 30 days. This ensures lineage data remains current and relevant.