Data Products
Last updated
In Masthead, a data product is a curated collection of related data assets (like datasets or tables) that are treated as a single, logical unit. It represents a valuable, ready-to-use data resource designed for a specific purpose or audience within your organization. Think of it as packaging data for consumption, complete with metadata, ownership, cost tracking, and quality monitoring.
The key components of a data product are:
Name: A unique identifier for the data product.
Data Assets: The underlying data included. Currently supported types:
datasets
tables
Domain: An optional category or business area the product belongs to. See .
Description: Textual information about the product's purpose and content. Markdown formatting is supported.
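As an illustration only, the components above can be modelled as a small record type. This is a minimal sketch, not Masthead's API: the class names `DataAsset` and `DataProduct`, the field names, and the validation rules are all hypothetical and simply mirror the list above (a name, one or more dataset or table assets, an optional domain, and a Markdown description).

```python
from dataclasses import dataclass
from typing import List, Optional

# The two asset types this page lists as currently supported.
ASSET_TYPES = {"dataset", "table"}


@dataclass(frozen=True)
class DataAsset:
    """Hypothetical representation of one underlying asset."""
    name: str
    asset_type: str

    def __post_init__(self):
        if self.asset_type not in ASSET_TYPES:
            raise ValueError(f"unsupported asset type: {self.asset_type!r}")


@dataclass
class DataProduct:
    """Hypothetical representation of a data product's configuration."""
    name: str                      # identifier for the product
    assets: List[DataAsset]        # at least one asset is required
    domain: Optional[str] = None   # optional business area
    description: str = ""          # free text, Markdown allowed

    def __post_init__(self):
        if not self.name.strip():
            raise ValueError("a data product needs a name")
        if not self.assets:
            raise ValueError("a data product needs at least one data asset")


# Example values are made up for illustration.
product = DataProduct(
    name="Sales reporting",
    assets=[DataAsset("analytics.sales", "table")],
    domain="Sales",
    description="Curated **sales** tables for reporting.",
)
```

Treating the domain as optional and the asset list as required follows the descriptions on this page; everything else about the shape is an assumption.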
You can create a data product to formally package and manage your key data assets.
Choose whether you are adding Datasets or Tables using the toggle buttons, where available.
In the "Data assets" input field, start typing the name of an existing dataset (or table) you want to include.
Select the desired asset(s) from the dropdown list. You can add multiple assets. This field is required.
Upon successful creation, you are redirected to the detail page of the new data product. There you can review the configuration you defined; the aggregated insights are populated shortly afterwards.
You can explore the associated costs, incidents, assets, or subscribers and dive deeper into the details where applicable. See more in .
Creating and managing data products in Masthead provides several benefits:
Group logically related datasets and tables under a single, meaningful entity.
Monitor core data product metrics in an easy overview:
Estimated compute and storage costs associated with the upstream data assets and pipelines within the product.
Track associated pipeline and table incidents impacting the data product's health.
Understand who is using the data product and how often through job execution tracking.
This keeps data product asset management simple while giving data product owners and subscribers all the important operational information.
Users or service accounts that consume or interact with the data product.
This metric measures the number of job executions in which the product's assets are used as a source. It offers insight into the overall consumption frequency across data assets. The usage visualization is aggregated at the dataset level.
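The counting described above can be sketched as follows. This is an illustrative approximation, not how Masthead computes the metric internally: the job record shape, the `dataset.table` naming scheme, the function names, and the choice to count a job once per dataset it reads from are all assumptions.

```python
from collections import Counter
from typing import Dict, Iterable, List, Set


def dataset_of(table_id: str) -> str:
    """Return the dataset part of a 'dataset.table' id (assumed naming scheme)."""
    return table_id.split(".", 1)[0]


def usage_by_dataset(
    jobs: Iterable[Dict[str, List[str]]],
    product_tables: Set[str],
) -> Dict[str, int]:
    """Count job executions that read product assets, grouped by dataset.

    Each job is a dict with a 'sources' list of table ids (hypothetical shape).
    A job that reads several tables of the same dataset is counted once
    for that dataset (an assumption of this sketch).
    """
    counts: Counter = Counter()
    for job in jobs:
        touched = {dataset_of(t) for t in job["sources"] if t in product_tables}
        for dataset in touched:
            counts[dataset] += 1
    return dict(counts)


# Made-up job executions for illustration.
jobs = [
    {"sources": ["sales.orders", "sales.customers"]},
    {"sources": ["sales.orders", "finance.invoices"]},
    {"sources": ["marketing.clicks"]},  # reads nothing from the product
]
product_tables = {"sales.orders", "sales.customers", "finance.invoices"}

usage = usage_by_dataset(jobs, product_tables)
```

With this sample data, `usage` holds 2 for `sales` and 1 for `finance`; the third job is ignored because none of its sources belong to the product.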
Associated table and pipeline issues affecting reliability.
Assign clear ownership (via ) and track consumers ().
Data products are designed so that you only need to assign your curated data assets. Masthead uses lineage information to identify the referenced data assets up to two levels upstream. Assigned and referenced data assets and all the related pipelines are included in the calculation of the and metrics.
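To make the two-level upstream lookup concrete, here is a minimal sketch using a breadth-first traversal with a depth limit. It assumes lineage is available as a mapping from each asset to its direct upstream sources; that mapping, the function name `upstream_within`, and the asset names are hypothetical and do not reflect Masthead's internal implementation.

```python
from collections import deque
from typing import Dict, Iterable, List, Set


def upstream_within(
    lineage: Dict[str, List[str]],
    assigned: Iterable[str],
    max_depth: int = 2,
) -> Set[str]:
    """Return assets referenced upstream of `assigned`, up to `max_depth` levels.

    `lineage` maps an asset to the assets it reads from directly.
    Assigned assets themselves are not part of the result.
    """
    assigned = set(assigned)
    seen = set(assigned)          # never revisit an asset
    referenced: Set[str] = set()
    queue = deque((asset, 0) for asset in assigned)

    while queue:
        asset, depth = queue.popleft()
        if depth == max_depth:
            continue              # do not look further upstream than max_depth
        for parent in lineage.get(asset, ()):
            if parent not in seen:
                seen.add(parent)
                referenced.add(parent)
                queue.append((parent, depth + 1))
    return referenced


# Made-up lineage chain: mart <- staging <- raw <- landing.
lineage = {
    "mart.revenue": ["staging.orders"],
    "staging.orders": ["raw.orders"],
    "raw.orders": ["landing.orders_export"],
}

referenced = upstream_within(lineage, {"mart.revenue"})
```

In this example `referenced` contains `staging.orders` and `raw.orders`; `landing.orders_export` sits three levels upstream and is left out.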
We identify and show you all the subscribers to help make the consumption more transparent. See also to analyze the frequency of such interactions.
Aggregated compute and storage costs. To see more details about the compute costs of this product, click the Compute costs panel to go to the page.