Data Products
What’s a data product?
In Masthead, a data product is a curated collection of related data assets, such as datasets or tables, that you treat as a single, logical unit. It represents a valuable, ready-to-use data resource designed for a specific purpose or audience within your organization. Think of it as packaging data for consumption, complete with metadata, ownership, cost tracking, and quality monitoring.
Key components of a data product include:
- Name: A unique identifier for the data product.
- Data Assets: The underlying data included. Currently supported types:
  - datasets
  - tables
- Domain: An optional category or business area the product belongs to. See Domains.
- Description: Textual information about the product’s purpose and content. This field supports Markdown formatting.
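The components above can be pictured as a small record. The following is a minimal sketch only: the `DataProduct` class, its field names, and the sample values are hypothetical and are not part of any Masthead API. The validation assumes what the creation form on this page describes: name and data assets are required, while domain and description are optional.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DataProduct:
    """Hypothetical in-memory model of a data product (not a Masthead API)."""

    name: str                      # unique identifier, required
    data_assets: List[str]         # names of the included datasets or tables, required
    asset_type: str = "datasets"   # one of the supported types: "datasets" or "tables"
    domain: Optional[str] = None   # optional category or business area
    description: str = ""          # optional free text, may contain Markdown

    def __post_init__(self) -> None:
        # Mirror the required fields of the creation form.
        if not self.name.strip():
            raise ValueError("name is required")
        if not self.data_assets:
            raise ValueError("at least one data asset is required")
        if self.asset_type not in ("datasets", "tables"):
            raise ValueError("asset_type must be 'datasets' or 'tables'")


# Illustrative values only.
product = DataProduct(
    name="Customer 360",
    data_assets=["customers_curated", "orders_curated"],
    domain="Marketing",
    description="Curated **customer** data for reporting.",
)
```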
Create data product
You can create a data product to formally package and manage your key data assets.
Go to data products
Open the Data Products page and click Create. This opens the “Create Data Product” page.
Enter name
Provide a clear and descriptive Name for your data product. Masthead requires this field.
Select data assets
Choose whether you are adding Datasets or Tables using the toggle buttons. Note that only the Datasets option is currently active.
In the “Data assets” input field, start typing the name of an existing dataset or table you want to include.
Select the desired assets from the dropdown list. You can add multiple assets. Masthead requires this field.
Assign domain
Click the “Select a domain” dropdown. Choose an existing domain from the list to categorize your data product. This step is optional.
Alternatively, click “Create new domain” if the domain doesn’t exist yet. See more about Domains.
Add description
Provide a detailed Description in the text area. Explain the purpose, content, intended use, or any other relevant context for the data product. Use Markdown formatting if needed. This step is optional.
Click create
After you enter all required information and add any optional details, click the Create button. To discard changes, click Cancel.
Upon successful creation, Masthead redirects you to the detail page for the newly created data product. Here you can see the configuration you’ve defined, and the aggregated insights populate shortly.
You can explore the associated costs, incidents, assets, or subscribers and dive deeper into the details where applicable. See more in Product Metrics.
Why use data products?
Creating and managing data products in Masthead provides several benefits:
- Group logically related datasets and tables under a single, meaningful entity.
- Assign clear ownership through domains and track consumers using subscribers.
- Monitor core data product metrics in an easy overview:
  - Costs: estimated compute and storage costs associated with the upstream data assets and pipelines within the product.
  - Incidents: pipeline and table incidents impacting the data product’s health.
  - Usage: who is using the data product and how often, based on job execution tracking.
Referenced data assets
Masthead designs data products so that you only need to assign your curated data assets to the data product. Masthead uses lineage information to identify the referenced data assets at every upstream level. The cost and incident metrics cover both assigned and referenced data assets, along with all related pipelines.
This keeps asset management for a data product simple while still giving data product owners and subscribers all the important operational information.
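To illustrate how referenced assets could be derived from lineage, here is a minimal sketch of an upstream traversal. It is not Masthead's implementation: the `referenced_assets` function, the shape of the `lineage` mapping, and the asset names are all assumptions made for the example.

```python
from collections import deque
from typing import Dict, Iterable, List, Set


def referenced_assets(assigned: Iterable[str],
                      upstream: Dict[str, List[str]]) -> Set[str]:
    """Return every asset upstream of the assigned ones, at any depth."""
    assigned_set = set(assigned)
    seen: Set[str] = set(assigned_set)
    queue = deque(assigned_set)
    while queue:
        asset = queue.popleft()
        for parent in upstream.get(asset, []):
            if parent not in seen:  # guards against cycles and repeated visits
                seen.add(parent)
                queue.append(parent)
    # Referenced assets are the upstream ones that were not assigned directly.
    return seen - assigned_set


# Hypothetical lineage: each asset maps to the assets it reads from.
lineage = {
    "sales_mart": ["sales_staging"],
    "sales_staging": ["raw_orders", "raw_customers"],
    "raw_orders": [],
}

assigned = ["sales_mart"]
referenced = referenced_assets(assigned, lineage)
# Assets that would count toward the cost and incident metrics in this sketch.
in_scope = set(assigned) | referenced
```

In this sketch only `sales_mart` is assigned by hand, yet the staging and raw assets still fall in scope for the metrics, which is the convenience the section describes.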
Product metrics
Subscribers
Users or service accounts that consume or interact with the data product.
Masthead identifies and displays all subscribers to make data consumption more transparent. See also usage to analyze the frequency of these interactions.
Usage
This metric measures the number of job executions that use the product’s assets as a source. It offers insight into the overall consumption frequency across data assets. Masthead aggregates the usage visualization at the dataset level.
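As a rough illustration of the usage and subscriber metrics, the sketch below counts job executions per dataset and collects the distinct users behind them. The job record layout, the `usage_by_dataset` helper, and the rule that a job counts once per dataset it reads are assumptions for this example, not a description of how Masthead computes the metric.

```python
from collections import Counter
from typing import Iterable, Set


def usage_by_dataset(jobs: Iterable[dict], product_datasets: Set[str]) -> Counter:
    """Count job executions that read from each dataset of the product."""
    counts: Counter = Counter()
    for job in jobs:
        # Assumption: a job counts once per dataset, even if it reads several tables.
        datasets_read = {table.split(".")[0] for table in job["source_tables"]}
        for dataset in datasets_read & product_datasets:
            counts[dataset] += 1
    return counts


# Hypothetical job execution records; tables are written as "dataset.table".
jobs = [
    {"user": "analyst@example.com", "source_tables": ["sales.orders", "sales.items"]},
    {"user": "etl-sa@example.com", "source_tables": ["sales.orders", "finance.ledger"]},
    {"user": "analyst@example.com", "source_tables": ["hr.people"]},
]
product_datasets = {"sales", "finance"}

usage = usage_by_dataset(jobs, product_datasets)

# Subscribers: distinct users or service accounts whose jobs read the product's datasets.
subscribers = {
    job["user"]
    for job in jobs
    if any(t.split(".")[0] in product_datasets for t in job["source_tables"])
}
```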
Incidents
Associated table and pipeline issues affecting reliability.
Costs
Aggregated compute and storage costs. To see more details about the compute costs of this product, click the Compute costs panel to go to the Pipeline Costs page.