Skip to content

insights table

The core insights data table that provides actionable recommendations for cost optimization and data management.

The core insights data table for Masthead’s customers consolidates recommendations for cost optimization by identifying unused and dead-end tables, along with potential actions to reduce storage costs and improve data management efficiency.

Column nameData typeValue
categorySTRINGCategory of the insight. Currently, this field supports only ‘Cost’. Future releases support additional categories.
typeSTRINGType of insight; valid values include ‘Dead end’, ‘Unused’, etc.
subtypeSTRINGSpecific subtype of the insight, such as ‘Dead end table’ or ‘Unused table’
tenantSTRINGAccount/tenant identifier
project_idSTRINGGoogle Cloud project ID
target_resourceSTRINGName of the target resource
last_updated_timeTIMESTAMPWhen this insight was last updated
overviewJSONJSON object with insight details. For details, see the insight types section.
operationsJSONArray of recommended actions to address the insight. For details, see the insight types section.
  • Leaf dead end table - Tables that aren’t used by any downstream processes and are at the end of the data lineage
  • Dead end table - Tables that have no downstream dependencies and may be candidates for removal
JSON columns schema

Overview:

{
"resource_type": "TABLE|VIEW",
"resource_name": "project.dataset.table",
"num_bytes": 123456,
"creation_time": "2025-01-01T00:00:00",
"last_modified_time": "2025-01-15T00:00:00",
"cost_30d": 10.5,
"savings_30d": 10.5
}

Operations:

[
{
"action": "drop",
"resource_type": "TABLE|VIEW",
"resource_name": "project.dataset.table"
}
]
  • Dead end pipeline - Pipelines that don’t produce any useful output and that downstream processes don’t use. This feature isn’t yet implemented.
  • Unused table - Tables that users haven’t accessed or queried within the configured time window
JSON columns schema

Overview:

{
"resource_type": "TABLE|VIEW",
"resource_name": "project.dataset.table",
"num_bytes": 123456,
"creation_time": "2025-01-01T00:00:00",
"last_modified_time": "2025-01-15T00:00:00",
"cost_30d": 10.5,
"savings_30d": 10.5
}

Operations:

[
{
"action": "drop",
"resource_type": "TABLE|VIEW",
"resource_name": "project.dataset.table"
}
]
  • Storage billing model - Datasets that you can move to a more cost-effective storage billing model based on their update patterns and data compression.
JSON columns schema

Overview:

{
"resource_type": "dataset",
"resource_name": "project.dataset",
"storage_billing_model": "LOGICAL",
"location": "us-central1",
"cost_30d": 100.0,
"savings_30d": 20.0
}

Operations:

[
{
"action": "alter",
"resource_type": "dataset",
"resource_name": "project.dataset",
"recommended_storage_billing_model": "PHYSICAL"
}
]
  • Compute model per pipeline - Recommendations for adjusting compute model assignment based on resource usage. This feature isn’t yet implemented.
-- Get all dead end tables with their details
SELECT
subtype,
project_id,
target_resource,
INT64(overview.num_bytes) AS num_bytes,
STRING(overview.resource_type) AS resource_type,
last_updated_time
FROM `masthead-prod`.DATASET_NAME.insights
WHERE category = 'Cost'
AND type = 'Dead end'
ORDER BY num_bytes DESC;
-- Get all dead end tables with their details
SELECT
subtype,
project_id,
target_resource,
INT64(overview.num_bytes) AS num_bytes,
STRING(overview.resource_type) AS resource_type,
last_updated_time
FROM `masthead-prod`.DATASET_NAME.insights
WHERE category = 'Cost'
AND type = 'Dead end'
ORDER BY num_bytes DESC;
-- Get all unused tables sorted by size
SELECT
project_id,
target_resource,
INT64(overview.num_bytes) AS num_bytes,
STRING(overview.resource_type) AS resource_type,
last_updated_time
FROM `masthead-prod`.DATASET_NAME.insights
WHERE category = 'Cost'
AND type = 'Unused'
ORDER BY num_bytes DESC;
-- Calculate total storage used by dead end and unused tables
SELECT
type,
COUNT(*) AS resource_count,
SUM(SAFE.INT64(overview.num_bytes)) AS total_bytes,
ROUND(SUM(SAFE.INT64(overview.num_bytes)) / POW(1024, 3), 2) AS total_gb,
ROUND(SUM(overview.cost_30d), 2) AS monthly_cost_usd,
ROUND(SUM(overview.savings_30d), 2) AS monthly_savings_usd
FROM `masthead-prod`.DATASET_NAME.insights
WHERE category = 'Cost'
AND subtype IN ('Dead end table', 'Unused table')
AND overview.num_bytes IS NOT NULL
GROUP BY type
ORDER BY total_bytes DESC;
-- View insights grouped by project
SELECT
project_id,
type,
COUNT(*) AS insight_count,
SUM(SAFE.INT64(overview.num_bytes)) AS total_bytes
FROM `masthead-prod`.DATASET_NAME.insights
WHERE category = 'Cost'
AND subtype IN ('Dead end table', 'Unused table')
AND overview.num_bytes IS NOT NULL
GROUP BY project_id, type
ORDER BY total_bytes DESC;
-- Get a DDL statement to alter the storage billing model for recommended datasets
SELECT
*,
CONCAT('ALTER SCHEMA `', SAFE.STRING(operations[0].resource_name), '` SET OPTIONS(storage_billing_model = "', SAFE.STRING(operations[0].recommended_storage_billing_model),'");') AS ddl
FROM `masthead-prod`.DATASET_NAME.insights
WHERE type = 'Storage billing model'
ORDER BY
FLOAT64(overview.savings_30d) DESC

To automate the analysis of cost optimization recommendations programmatically using AI agents, such as Google Antigravity or Claude Code, you can configure pre-built agent skills. These skills help agents query and interpret this insights table to recommend standard or enterprise compute models, alter dataset billing options, or identify dead-end tables.

For a list of available FinOps agent skills, see the AI agent skills guide.

  • Preview Version: This is an early preview version and the schema may change based on feedback.
  • Data Freshness: Masthead updates insights daily based on usage patterns and lineage analysis.
  • Data Retention: Insights include the most recent data from the account’s configured lookback window, which defaults to 30 days.
  • False Positives: see details in Dead-end and Unused Tables section.
  1. Review Before Action: Always manually verify insights before taking action on production data.
  2. Test Removals: Consider moving tables to a staging area before permanent deletion.
  3. Check Dependencies: Cross-reference with the lineage table to understand data dependencies.
  4. Monitor Trends: Regular review of insights can help identify data management patterns and opportunities.
  5. Coordinate with Teams: Ensure data owners are aware before removing resources they may depend on.