Data Products
What is a data product?
In Masthead, a data product is a curated collection of related data assets (like datasets or tables) that are treated as a single, logical unit. It represents a valuable, ready-to-use data resource designed for a specific purpose or audience within your organization. Think of it as packaging data for consumption, complete with metadata, ownership, cost tracking, and quality monitoring.
The key components of a data product are:
Name: A unique identifier for the data product.
Data Assets: The underlying data included. Currently supported types:
datasets
tables
Domain: An optional category or business area the product belongs to. See Domains.
Description: Textual information about the product's purpose and content. Markdown formatting is supported.
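As an illustration only, the components above can be modelled as a small record type. This is a hypothetical in-memory sketch, not Masthead's actual schema or API: the class names `DataAsset` and `DataProduct`, the field names, and the sample values are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Literal, Optional


@dataclass(frozen=True)
class DataAsset:
    """One underlying asset; the supported kinds are datasets and tables."""
    kind: Literal["dataset", "table"]
    name: str


@dataclass
class DataProduct:
    """Hypothetical model of a data product (not Masthead's real schema)."""
    name: str                     # unique identifier for the product
    data_assets: list[DataAsset]  # required: at least one dataset or table
    domain: Optional[str] = None  # optional category or business area
    description: str = ""         # free text; Markdown is supported

    def __post_init__(self) -> None:
        # Mirror the rules from the text: a name and at least one asset.
        if not self.name.strip():
            raise ValueError("a data product needs a name")
        if not self.data_assets:
            raise ValueError("a data product needs at least one data asset")


# Sample values are invented for illustration.
product = DataProduct(
    name="customer_360",
    data_assets=[DataAsset("dataset", "analytics.customers")],
    domain="Marketing",
)
```

Only the name and the data assets are mandatory here; the domain and description default to empty, matching their optional role in the list above.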
Create data product
You can create a data product to formally package and manage your key data assets.
Go to Data Products
Open the Data Products page and click Create. This opens the "Create Data Product" page.
Select data assets
If the toggle buttons are shown, choose whether you are adding Datasets or Tables. Currently, only Datasets may be available.
In the "Data assets" input field, start typing the name of an existing dataset (or table) you want to include.
Select the desired asset(s) from the dropdown list. You can add multiple assets. This field is required.
Assign Domain
Click the "Select a domain" dropdown. Choose an existing domain from the list to categorize your data product. This step is optional.
Alternatively, click "Create new domain" if the domain doesn't exist yet. See more about Domains.
Upon successful creation, you are redirected to the detail page of the new data product. It shows the configuration you defined; the aggregated insights are populated shortly afterwards.
You can explore the associated costs, incidents, assets, or subscribers and dive deeper into the details where applicable. See more in Product Metrics.
Why use data products?
Creating and managing data products in Masthead provides several benefits:
Group logically related datasets and tables under a single, meaningful entity.
Assign clear ownership (via domains) and track consumers (subscribers).
Monitor core data product metrics in an easy overview:
Costs: estimated compute and storage costs of the product's data assets, their upstream assets, and the related pipelines.
Incidents: pipeline and table incidents that affect the data product's health.
Usage: who is using the data product and how often, based on job execution tracking.
Referenced data assets
Data products are designed so that you only need to assign your curated data assets. Masthead uses lineage information to identify referenced data assets up to two levels upstream. Assigned and referenced data assets, along with all related pipelines, are included in the calculation of the cost and incident metrics.

This keeps the management of data product assets simple while still surfacing all important operational information to data product owners and subscribers.
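The two-level upstream lookup described above can be pictured as a depth-limited breadth-first walk over a lineage graph. The sketch below is a minimal illustration under stated assumptions, not Masthead's implementation: the function `referenced_assets`, the `upstream` mapping format, and the sample asset names are all hypothetical.

```python
from collections import deque


def referenced_assets(assigned, upstream, max_depth=2):
    """Return assets reachable upstream of `assigned` within `max_depth` hops.

    `upstream` maps an asset name to the names it reads from directly
    (an assumed format). Assigned assets are excluded from the result.
    """
    assigned = set(assigned)
    seen = set(assigned)
    referenced = set()
    queue = deque((asset, 0) for asset in assigned)
    while queue:
        asset, depth = queue.popleft()
        if depth == max_depth:
            continue  # do not look past the depth limit
        for parent in upstream.get(asset, ()):
            if parent not in seen:
                seen.add(parent)
                referenced.add(parent)
                queue.append((parent, depth + 1))
    return referenced


# Invented lineage: mart <- staging <- raw <- landing
lineage = {
    "mart.orders": ["staging.orders", "staging.customers"],
    "staging.orders": ["raw.orders"],
    "staging.customers": ["raw.customers"],
    "raw.orders": ["landing.orders_export"],
}

print(sorted(referenced_assets(["mart.orders"], lineage)))
# ['raw.customers', 'raw.orders', 'staging.customers', 'staging.orders']
```

In this example `landing.orders_export` sits three levels above the assigned asset, so it falls outside the two-level window and would not contribute to the product's cost or incident metrics.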
Product metrics
Subscribers
Users or service accounts that consume or interact with the data product.
We identify and show all subscribers to make consumption more transparent. See also Usage to analyze the frequency of these interactions.

Usage
This metric measures the number of job executions in which the product's assets are used as a source. It offers insight into the overall consumption frequency across data assets. The usage visualization is aggregated at the dataset level.
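To make the dataset-level aggregation concrete, here is a rough sketch of how such a count could be derived from job execution records. It is an illustration only: the record shape (`source_tables`), the `dataset.table` naming, and the choice to count a job once per dataset are assumptions, not a description of how Masthead computes the metric.

```python
from collections import Counter


def dataset_of(table):
    """Strip the final path segment: 'sales.orders' -> 'sales'."""
    return table.rsplit(".", 1)[0]


def usage_by_dataset(jobs, product_datasets):
    """Count job executions that read from each dataset in the product.

    Assumption: a job counts once per dataset, even if it reads
    several tables from that dataset.
    """
    product_datasets = set(product_datasets)
    counts = Counter()
    for job in jobs:
        touched = {dataset_of(t) for t in job["source_tables"]}
        for dataset in touched & product_datasets:
            counts[dataset] += 1
    return dict(counts)


# Invented job execution records.
jobs = [
    {"id": "job-1", "source_tables": ["sales.orders", "sales.items"]},
    {"id": "job-2", "source_tables": ["sales.orders", "crm.contacts"]},
    {"id": "job-3", "source_tables": ["finance.ledger"]},
]

usage = usage_by_dataset(jobs, {"sales", "crm"})
```

With these sample records, `usage` maps `sales` to 2 and `crm` to 1; `job-3` reads only from a dataset outside the product and is not counted.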
Incidents
Associated table and pipeline issues affecting reliability.

Costs
Aggregated compute and storage costs. To see more details about the compute costs of this product, click the Compute costs panel to go to the Pipeline Costs page.
