The cloud-native orchestrator for the whole development lifecycle, with integrated lineage and observability, a declarative programming model, and best-in-class testability.
Dagster powers data platforms for innovative organizations all over the world
Read our users' success stories
Blue Origin, Booking, Discord, Flexport
Manage your data assets with code
Python assets
dbt-native orchestration
Task-based workflows
from dagster import asset
from pandas import DataFrame, read_html, get_dummies
from sklearn.linear_model import LinearRegression


@asset
def country_populations() -> DataFrame:
    # Load the first HTML table on the page and give its columns stable names.
    df = read_html("https://tinyurl.com/mry64ebh")[0]
    df.columns = ["country", "pop2022", "pop2023", "change", "continent", "region"]
    # Strip the percent sign and replace the Unicode minus so the column parses as float.
    df["change"] = df["change"].str.rstrip("%").str.replace("−", "-").astype("float")
    return df


@asset
def continent_change_model(country_populations: DataFrame) -> LinearRegression:
    # Regress population change on the one-hot encoded continent.
    data = country_populations.dropna(subset=["change"])
    return LinearRegression().fit(get_dummies(data[["continent"]]), data["change"])


@asset
def continent_stats(country_populations: DataFrame, continent_change_model: LinearRegression) -> DataFrame:
    # Aggregate per continent and attach the model's coefficient for each one.
    result = country_populations.groupby("continent").sum()
    result["pop_change_factor"] = continent_change_model.coef_
    return result
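In the sample above, each asset names its upstream assets as function parameters, so the dependency graph comes from the code itself. The idea can be sketched in plain Python. This is an illustration only, not Dagster's implementation: `toy_asset`, `materialize_all`, and the three example functions are hypothetical names, and the sketch uses only the standard library (`inspect` and `graphlib`, Python 3.9+).

```python
import inspect
from graphlib import TopologicalSorter

# Hypothetical toy registry for illustration; NOT how Dagster works internally.
_ASSETS = {}


def toy_asset(fn):
    """Register a function under its own name so others can depend on it."""
    _ASSETS[fn.__name__] = fn
    return fn


def materialize_all():
    """Call every registered function once, upstream functions first."""
    # Each parameter name is read as the name of an upstream asset.
    deps = {
        name: set(inspect.signature(fn).parameters)
        for name, fn in _ASSETS.items()
    }
    values = {}
    for name in TopologicalSorter(deps).static_order():
        fn = _ASSETS[name]
        values[name] = fn(**{dep: values[dep] for dep in deps[name]})
    return values


@toy_asset
def raw_numbers():
    return [1, 2, 3]


@toy_asset
def doubled(raw_numbers):
    return [2 * n for n in raw_numbers]


@toy_asset
def summary(raw_numbers, doubled):
    return {"raw_total": sum(raw_numbers), "doubled_total": sum(doubled)}


results = materialize_all()
print(results["summary"])  # {'raw_total': 6, 'doubled_total': 12}
```

Because the graph is derived from parameter names, adding a downstream function requires no separate wiring step. The sketch omits everything a real orchestrator adds on top, such as storage, retries, scheduling, and lineage metadata.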
A single pane of glass for your data platform
Monitor execution
Debug runs
Inspect assets
Explore lineage
Monitor runs across all your jobs in one place with the run timeline view.
Dagster+
From pull request to production. Effortlessly.
The enterprise orchestration platform that puts developer experience first, with fully serverless or hybrid deployments, operational observability, data cataloging, and out-of-the-box CI/CD.
Dagster+ brings you an asset-oriented approach to go way beyond what traditional orchestration delivers.
Data teams from startups to Fortune 500 companies alike are having a blast building pipelines with Dagster
“Dagster has been instrumental in empowering our development team to deliver insights at 20x the velocity compared to the past. From Idea inception to Insight is down to 2 days vs 6+ months before.”
Gu Xie
“Dagster Insights has been an invaluable tool for our team. Being able to easily track Snowflake costs associated with our dbt models has helped us identify optimization opportunities and reduce our Snowflake costs.”
Timothée Vandeput
Data Engineer
“Dagster is the single pane of glass that our team uses to not only launch and monitor jobs, but also to surface visibility into data quality, track asset metadata and lineage, manage testing environments, and even track costs associated with Dagster and the external services that it manages.”
Zachary Romer
“Somebody magically built the thing I had been envisioning and wanted, and now it's there and I can use it.”
David Farnan-Williams
Lead Machine Learning Engineer
“Being able to visualize and test changes using branch deployments has enabled our data team to ship faster”
Aaron Fullerton
“Dagster brings software engineering best practices to a data team that supports a sprawling organization with minimal footprint.”
Emmanuel Fuentes
Integrations
Integrate with the tools you already use and deploy to your infrastructure.