I had my work laptop closed and couldn't remember the name "Airflow," so I asked ChatGPT. I found the interaction interesting.
To create a dashboard showing the run statuses of AWS Glue jobs, here are the key AWS products you can use:
Service | Purpose |
---|---|
CloudWatch Logs | Glue sends job logs here; you can parse log events for custom insights. |
CloudWatch Metrics | Glue emits default metrics (e.g., glue.job.run.success, glue.job.run.failed) that can be graphed. |
CloudWatch Dashboards | Native AWS dashboards to visualize metrics and logs in a single pane. |
Glue Job API / Boto3 SDK | Direct API access (e.g., get_job_runs) lets you fetch statuses and use them in custom dashboards. |
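
As a rough illustration of the Boto3 row above, here is a minimal sketch that counts a job's recent runs by state. `get_job_runs` and the `JobRunState` field come from the Glue API; the helper names `summarize_runs` and `fetch_recent_runs` are hypothetical and not part of the original answer.

```python
from collections import Counter


def summarize_runs(job_runs):
    """Count runs per JobRunState (e.g. SUCCEEDED, FAILED, RUNNING)."""
    return Counter(run["JobRunState"] for run in job_runs)


def fetch_recent_runs(job_name, max_results=50):
    """Fetch the most recent runs of one Glue job (needs AWS credentials)."""
    # Imported lazily so summarize_runs can be used without boto3 installed.
    import boto3

    glue = boto3.client("glue")
    response = glue.get_job_runs(JobName=job_name, MaxResults=max_results)
    return response["JobRuns"]
```

With credentials configured, `summarize_runs(fetch_recent_runs("my-etl-job"))` would return the per-state counts for a job, where `my-etl-job` is a placeholder name. Pagination via `NextToken` is left out to keep the sketch short.
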
Service | Purpose |
---|---|
Amazon QuickSight | BI dashboard service that can visualize Glue job metadata via Athena or direct query. |
Amazon Athena | Query Glue job logs (if exported to S3) and visualize in QuickSight or other BI tools. |
OpenSearch + OpenSearch Dashboards | For advanced log indexing and real-time dashboards (Glue logs via CloudWatch Logs → OpenSearch). |
Service | Purpose |
---|---|
Lambda Functions | Periodically poll Glue APIs and push status data to CloudWatch or a database. |
EventBridge (CloudWatch Events) | Trigger events on Glue job state changes (success/failure) for real-time notifications or updates. |
SNS | For alerting but can also trigger workflows or Lambda updates to dashboards. |
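
The EventBridge row can be sketched as an event pattern plus a small Lambda target. The pattern fields (`aws.glue`, `Glue Job State Change`, and the `state` values) follow the documented Glue event format as I understand it; the handler and the record layout it returns are assumptions made for illustration.

```python
import json

# Event pattern for an EventBridge rule that matches finished Glue job runs.
GLUE_STATE_CHANGE_PATTERN = {
    "source": ["aws.glue"],
    "detail-type": ["Glue Job State Change"],
    "detail": {"state": ["SUCCEEDED", "FAILED", "TIMEOUT", "STOPPED"]},
}


def lambda_handler(event, context):
    """Hypothetical Lambda target: reduce the event to a small status record."""
    detail = event.get("detail", {})
    record = {
        "job_name": detail.get("jobName"),
        "run_id": detail.get("jobRunId"),
        "state": detail.get("state"),
        "time": event.get("time"),
    }
    # Lambda forwards stdout to CloudWatch Logs, where it can be queried later.
    print(json.dumps(record))
    return record
```

The pattern would be supplied as JSON (`json.dumps(GLUE_STATE_CHANGE_PATTERN)`) when creating the rule; where the record goes next (a database, a custom metric, an SNS topic) depends on the dashboard you choose.
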
- Simple, native AWS dashboard: CloudWatch Logs/Metrics + CloudWatch Dashboard.
- More customizable with historical analysis: Glue logs to S3 → Athena → QuickSight.
- Real-time and open source: Glue logs to OpenSearch → OpenSearch Dashboards (Kibana).
- Automated API-based updates: Lambda fetching Glue job runs → CloudWatch custom metrics → Dashboard.
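
The last option (Lambda → custom metrics → dashboard) might look roughly like the sketch below. `get_job_runs` and `put_metric_data` are real Boto3 calls; the metric name `JobRunCount`, the namespace `Custom/GlueJobs`, and both function names are assumptions chosen for this example.

```python
def runs_to_metric_data(job_name, job_runs):
    """Turn Glue job runs into CloudWatch MetricData entries, one per state."""
    counts = {}
    for run in job_runs:
        state = run["JobRunState"]
        counts[state] = counts.get(state, 0) + 1
    return [
        {
            "MetricName": "JobRunCount",  # hypothetical metric name
            "Dimensions": [
                {"Name": "JobName", "Value": job_name},
                {"Name": "State", "Value": state},
            ],
            "Value": count,
            "Unit": "Count",
        }
        for state, count in sorted(counts.items())
    ]


def publish_job_metrics(job_name, namespace="Custom/GlueJobs"):
    """Poll Glue and push per-state counts to CloudWatch (needs AWS credentials)."""
    # Imported lazily so runs_to_metric_data can be used without boto3 installed.
    import boto3

    glue = boto3.client("glue")
    cloudwatch = boto3.client("cloudwatch")
    runs = glue.get_job_runs(JobName=job_name, MaxResults=50)["JobRuns"]
    metric_data = runs_to_metric_data(job_name, runs)
    if metric_data:
        cloudwatch.put_metric_data(Namespace=namespace, MetricData=metric_data)
    return metric_data
```

Note that each poll republishes counts over the most recent 50 runs, so the resulting graph shows a rolling window rather than newly finished runs; a production version would likely track which run IDs it has already reported.
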
If you tell me your exact goals (simple status or complex trends), I can outline a more tailored setup.
Good catch. Amazon Managed Workflows for Apache Airflow (MWAA) is absolutely relevant in some scenarios, but here's why it didn’t make the initial list:
- Airflow itself doesn’t inherently “dashboard” Glue job statuses; it orchestrates Glue jobs.
- You can use Airflow to track and visualize Glue job status within DAG runs, but:
  - Airflow UI focuses on workflow orchestration, not metric dashboards (like CloudWatch or QuickSight).
  - Airflow’s Tree View/Graph View shows task status (success/failed) within DAG runs but doesn’t provide trend charts, historical aggregates, or multi-job dashboards natively.
- You orchestrate multiple Glue jobs with dependencies → Airflow DAG UI helps see which steps succeeded/failed.
- You use custom sensors/operators to query Glue APIs and create task-level views.
- You embed custom Airflow plugins (e.g., using Grafana + Airflow metrics exporter).
- You want cross-job trends, success rates, runtime metrics → CloudWatch / QuickSight is more suitable.
- You want long-term historical metrics and dashboards → Athena + QuickSight or OpenSearch is more flexible.
Summary: Airflow = orchestration monitoring (good for process view). CloudWatch/QuickSight = operational dashboards (good for success/failure rates, runtimes, trends).
If you’re using Airflow to run Glue jobs, I can suggest how to extend Airflow with better dashboarding—just say the word.