Dagster | Cloud-native orchestration of data pipelines

The cloud-native orchestrator for the whole development lifecycle, with integrated lineage and observability, a declarative programming model, and best-in-class testability.

Try Dagster+

Learn Dagster

Dagster powers data platforms for innovative organizations all over the world

Read our users' success stories

Featured users: Blue Origin, Booking, Discord, Flexport.

Manage your data assets with code

Python assets

dbt-native orchestration

Task-based workflows

from dagster import asset
from pandas import DataFrame, read_html, get_dummies
from sklearn.linear_model import LinearRegression


@asset
def country_populations() -> DataFrame:
    # Read the first HTML table on the page into a DataFrame.
    df = read_html("https://tinyurl.com/mry64ebh")[0]
    df.columns = ["country", "pop2022", "pop2023", "change", "continent", "region"]
    # Strip the "%" suffix and normalize the Unicode minus sign before casting to float.
    df["change"] = df["change"].str.rstrip("%").str.replace("−", "-").astype("float")
    return df


@asset
def continent_change_model(country_populations: DataFrame) -> LinearRegression:
    # Fit population change against one-hot-encoded continents.
    data = country_populations.dropna(subset=["change"])
    return LinearRegression().fit(get_dummies(data[["continent"]]), data["change"])


@asset
def continent_stats(country_populations: DataFrame, continent_change_model: LinearRegression) -> DataFrame:
    # Sum only the numeric columns; text columns such as "country" are skipped.
    result = country_populations.groupby("continent").sum(numeric_only=True)
    # Attach the fitted per-continent coefficients.
    result["pop_change_factor"] = continent_change_model.coef_
    return result

A single pane of glass for your data platform

Monitor execution

Debug runs

Inspect assets

Explore lineage

Monitor runs across all your jobs in one place with the run timeline view.

Dagster+

From pull request to production. Effortlessly.

The enterprise orchestration platform that puts developer experience first, with fully serverless or hybrid deployments, operational observability, data cataloging, and out-of-the-box CI/CD.

Dagster+ brings you an asset-oriented approach to go way beyond what traditional orchestration delivers.

Try it free for 30 days

Data teams from startups to Fortune 500 companies are having a blast building pipelines with Dagster

Join the Slack community

“Dagster has been instrumental in empowering our development team to deliver insights at 20x the velocity compared to the past. From Idea inception to Insight is down to 2 days vs 6+ months before.”

Gu Xie

“Dagster Insights has been an invaluable tool for our team. Being able to easily track Snowflake costs associated with our dbt models has helped us identify optimization opportunities and reduce our Snowflake costs.”

Timothée Vandeput

Data Engineer

“Dagster is the single pane of glass that our team uses to not only launch and monitor jobs, but also to surface visibility into data quality, track asset metadata and lineage, manage testing environments, and even track costs associated with Dagster and the external services that it manages.”

Zachary Romer

“Somebody magically built the thing I had been envisioning and wanted, and now it's there and I can use it.”

David Farnan-Williams

Lead Machine Learning Engineer

“Being able to visualize and test changes using branch deployments has enabled our data team to ship faster.”

Aaron Fullerton

“Dagster brings software engineering best practices to a data team that supports a sprawling organization with minimal footprint.”

Emmanuel Fuentes

Integrations

Integrate with the tools you already use and deploy to your infrastructure.