
Your Platform for AI and Data Pipelines
Dagster is an open-source data orchestration platform built for modern data and AI engineering teams. It takes an asset-centric approach to pipeline development, treating tables, files, ML models, and notebooks as first-class citizens with built-in observability, lineage tracking, and data quality validation.
Models pipelines around the data assets they produce rather than the tasks that run, enabling clearer lineage and dependency management.
Event-driven, condition-based triggers that go beyond cron scheduling to reduce redundant computations.
Unified, searchable catalog of all data assets, workflows, and metadata for cross-team discoverability.
Embeds validation, freshness checks, and automated testing directly into pipelines to catch issues early.
Traces data lineage, captures operational metadata, and monitors run health across every pipeline run.
Exposes per-operation compute costs and resource utilization for optimizing cloud spending.
First-class integrations with dbt, Snowflake, Databricks, Spark, AWS, and Azure.
Build modular, testable extract-transform-load pipelines with full lineage tracking for warehouses like Snowflake or BigQuery.
Orchestrate dbt model runs with observability, scheduling, and dependency management on top of dbt transformations.
Automate data ingestion, feature engineering, model training, and retraining triggers with asset-modeled ML artifacts.
Embed automated data validation, freshness checks, and anomaly detection into pipelines.
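The embedded validation and freshness checks mentioned above can be sketched in a framework-agnostic way (this is not Dagster's asset-check API; `check_rows` and `check_freshness` are invented helper names): a row-level validation rule plus a staleness check against a maximum age.

```python
from datetime import datetime, timedelta, timezone


def check_rows(rows):
    """Validation: every row needs a non-null id and a non-negative amount."""
    failures = [
        row for row in rows
        if row.get("id") is None or row.get("amount", 0) < 0
    ]
    return {"passed": not failures, "failures": failures}


def check_freshness(last_materialized, max_age=timedelta(hours=24)):
    """Freshness: the asset must have been updated within max_age."""
    age = datetime.now(timezone.utc) - last_materialized
    return {"passed": age <= max_age, "age_seconds": age.total_seconds()}


rows = [{"id": 1, "amount": 10.0}, {"id": 2, "amount": -5.0}]
quality = check_rows(rows)  # fails: one row has a negative amount
fresh = check_freshness(datetime.now(timezone.utc) - timedelta(hours=2))
```

Running checks like these inside the pipeline, rather than as a separate monitoring job, is what lets issues be caught at materialization time instead of after downstream consumers have read bad data.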
Best orchestration choice for teams building a production data platform — the most opinionated and powerful approach to managing complex pipeline dependencies and data quality.
Best for data engineering teams that want unified, asset-based orchestration — the deepest technical integration, treating dbt models as true first-class objects rather than just CLI commands.
Migrate from legacy Airflow or Informatica to a modern, asset-aware pipeline foundation.
