Masthead Data
Storage Costs

Last updated 22 days ago

Storage Costs Insights

Masthead collects and analyzes BigQuery resource metadata and usage logs to estimate the cost of your data assets and to recommend ways to optimize spending. It reports two parameters:

  • storage size - the most recent data point,

  • storage cost - an estimate based on storage usage over the last 30 days.

Masthead analyzes each of these parameters and offers a set of recommendations for reducing the total storage bill.

Alternative Storage Billing Model for a Dataset

BigQuery provides two storage billing models, logical and physical, which let you choose how storage is charged based on your data properties and operation patterns. Masthead analyzes each dataset's metadata and storage usage to estimate the cost under each billing model.

Google Cloud Billing calculates the storage cost in one of the following ways:

  • logical storage cost:

active_logical_storage_size * active_logical_storage_price +
long_term_logical_storage_size * long_term_logical_storage_price

  • or physical storage cost:

active_physical_storage_size * active_physical_storage_price +
long_term_physical_storage_size * long_term_physical_storage_price +
(time_travel_physical_storage_size + fail_safe_physical_storage_size) * active_physical_storage_price
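The two formulas above can be written as a short Python sketch to compare the models for a single dataset. The class names, the field names, and especially the per-GiB prices are assumptions made for illustration; real prices depend on region and should be taken from the current BigQuery price list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StorageUsage:
    """Storage sizes for one dataset, in GiB (hypothetical structure)."""
    active_logical: float
    long_term_logical: float
    active_physical: float
    long_term_physical: float
    time_travel_physical: float
    fail_safe_physical: float


@dataclass(frozen=True)
class StoragePrices:
    """Monthly price per GiB. Placeholder values, not authoritative."""
    active_logical: float = 0.02
    long_term_logical: float = 0.01
    active_physical: float = 0.04
    long_term_physical: float = 0.02


def logical_cost(u: StorageUsage, p: StoragePrices) -> float:
    """Logical model: active and long-term logical bytes only."""
    return (u.active_logical * p.active_logical
            + u.long_term_logical * p.long_term_logical)


def physical_cost(u: StorageUsage, p: StoragePrices) -> float:
    """Physical model: time travel and fail-safe bytes are billed
    at the active physical price."""
    return (u.active_physical * p.active_physical
            + u.long_term_physical * p.long_term_physical
            + (u.time_travel_physical + u.fail_safe_physical)
            * p.active_physical)


if __name__ == "__main__":
    usage = StorageUsage(1000, 4000, 200, 600, 50, 50)
    prices = StoragePrices()
    print("logical:", logical_cost(usage, prices))
    print("physical:", physical_cost(usage, prices))
```

In this made-up example the data compresses well, so the physical model comes out cheaper even though its per-GiB prices are higher. A dataset with heavy time travel or fail-safe usage could tip the comparison the other way.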

Masthead analyzes your storage usage retrospectively and compares the cost under each billing model. It creates a recommendation only where switching models would provide a consistent and confident saving.
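One way to picture a "consistent and confident" saving is a rule that requires the alternative model to be cheaper in every observed period and to clear a minimum overall saving. The sketch below is a hypothetical illustration of that idea, not Masthead's actual algorithm; the function name and the `min_saving_ratio` threshold are assumptions.

```python
from typing import Sequence


def recommend_switch(
    current_costs: Sequence[float],
    alternative_costs: Sequence[float],
    min_saving_ratio: float = 0.1,
) -> bool:
    """Return True if the alternative billing model was cheaper in every
    period and the total saving is at least min_saving_ratio.

    Hypothetical rule for illustration only.
    """
    if not current_costs or len(current_costs) != len(alternative_costs):
        raise ValueError("cost series must be non-empty and equally long")

    # Consistency: the alternative must win in every single period.
    cheaper_every_period = all(
        alt < cur for cur, alt in zip(current_costs, alternative_costs)
    )

    total_current = sum(current_costs)
    if total_current <= 0:
        return False

    # Confidence: the overall saving must clear the threshold.
    saving_ratio = (total_current - sum(alternative_costs)) / total_current
    return cheaper_every_period and saving_ratio >= min_saving_ratio
```

Requiring a win in every period filters out datasets whose cheaper model flips back and forth. That matters because the billing model cannot be changed again for 14 days after a switch.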

Recommendations

The dataset configuration can be adjusted by running a DDL statement:

-- Replace the placeholders; the option value is a string literal.
ALTER SCHEMA {project.dataset} SET OPTIONS(
    storage_billing_model = '{LOGICAL|PHYSICAL}'
);

This dataset configuration has no impact on data processing performance.

When you change a dataset's billing model, it takes 24 hours for the change to take effect.

Once you change a dataset's storage billing model, you must wait 14 days before you can change the storage billing model again.
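When applying recommendations to many datasets, it can help to generate the statement and track the 14-day waiting period in code. The following is a minimal sketch; the helper names are hypothetical, and the project and dataset names are inserted without escaping, so only pass trusted values.

```python
from datetime import date, timedelta

VALID_MODELS = {"LOGICAL", "PHYSICAL"}
# Waiting period between billing model changes, as stated above.
COOLDOWN = timedelta(days=14)


def build_alter_schema(project: str, dataset: str, model: str) -> str:
    """Build the ALTER SCHEMA statement for the given billing model."""
    model = model.upper()
    if model not in VALID_MODELS:
        raise ValueError(f"unknown storage billing model: {model}")
    return (
        f"ALTER SCHEMA `{project}.{dataset}` SET OPTIONS(\n"
        f"    storage_billing_model = '{model}'\n"
        f");"
    )


def next_allowed_change(last_change: date) -> date:
    """Earliest date on which the billing model can be changed again."""
    return last_change + COOLDOWN


if __name__ == "__main__":
    print(build_alter_schema("my-project", "sales", "physical"))
    print(next_allowed_change(date(2024, 1, 1)))
```

The generated statement can then be run in the BigQuery console or through any client you already use.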

The Storage Cost Insights page shows the details of the aggregated estimated metrics per dataset:

Figure: Storage costs insights and saving recommendations

Review storage billing recommendations for the datasets on the Storage Cost Insights page.

Update storage billing model for a dataset to the recommended storage billing model.