This guide brings together the concepts needed to run a robust production data platform on Dagster Cloud. Here, "robust" means that assets and infrastructure are:
- Fault tolerant via replication, resource constraints, retries, parallelization, and run queues and priorities
- Observable via customizable logs and useful alerts

This guide does not cover every aspect of a production data platform. Related topics covered by other resources include:
- Testing and CI/CD to ensure new code behaves as expected without breaking existing assets
- Project Structure to build a code base that can scale across teams and dependencies [Todo: Link to guide]
- Data Expectations to ensure the data flowing through your pipelines is valid and meets your expectations [Todo: refresh guide for assets, add section on conditional behavior]