@jhannah
Created July 19, 2025 20:06
ChatGPT 4o responses regarding Airflow

I had my work laptop closed, and forgot the name of "Airflow." I asked ChatGPT, and found this interaction interesting.

List the AWS products that can be used to create a dashboard of run statuses of AWS Glue jobs.

To create a dashboard showing the run statuses of AWS Glue jobs, here are the key AWS products you can use:

Core AWS Services

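| Service | Purpose |
| --- | --- |
| CloudWatch Logs | Glue sends job logs here; you can parse log events for custom insights. |
| CloudWatch Metrics | With job metrics enabled, Glue publishes metrics to the Glue namespace (e.g., glue.driver.aggregate.numCompletedTasks, glue.driver.aggregate.numFailedTasks) that can be graphed. These are task-level metrics, so per-run success/failure counts are better derived from EventBridge events or the Glue API. |
| CloudWatch Dashboards | Native AWS dashboards to visualize metrics and logs in a single pane. |
| Glue Job API / Boto3 SDK | Direct API access (e.g., get_job_runs) lets you fetch run statuses and use them in custom dashboards. |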
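
The `get_job_runs` call mentioned in the table can feed a simple status view. Below is a minimal sketch, assuming boto3 and a Glue client supplied by the caller. The function name `latest_run_states`, the `page_size` parameter, and the `NO_RUNS` label are illustrative and not part of any AWS API.

```python
def latest_run_states(glue_client, job_names, page_size=10):
    """Map each Glue job name to the state of its most recently started run.

    Only the first page of results is inspected, so this assumes the newest
    run is among the first `page_size` runs returned.
    """
    states = {}
    for name in job_names:
        response = glue_client.get_job_runs(JobName=name, MaxResults=page_size)
        runs = response["JobRuns"]
        if runs:
            # Pick the newest run explicitly rather than relying on ordering.
            newest = max(runs, key=lambda run: run["StartedOn"])
            states[name] = newest["JobRunState"]
        else:
            states[name] = "NO_RUNS"  # illustrative label, not a Glue state
    return states
```

With real credentials this might be called as `latest_run_states(boto3.client("glue"), ["nightly-etl"])`, where `nightly-etl` is a hypothetical job name.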

🖥️ Optional Visualization Tools

| Service | Purpose |
| --- | --- |
| Amazon QuickSight | BI dashboard service that can visualize Glue job metadata via Athena or direct query. |
| Amazon Athena | Query Glue job logs (if exported to S3) and visualize in QuickSight or other BI tools. |
| OpenSearch + OpenSearch Dashboards | For advanced log indexing and real-time dashboards (Glue logs via CloudWatch Logs → OpenSearch). |

🛠️ Automation & Enhancement

| Service | Purpose |
| --- | --- |
| Lambda Functions | Periodically poll Glue APIs and push status data to CloudWatch or a database. |
| EventBridge (CloudWatch Events) | React to Glue job state changes (success/failure) for real-time notifications or updates. |
| SNS | Primarily for alerting, but can also trigger workflows or Lambda functions that update dashboards. |
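
The EventBridge and Lambda rows can be combined: a rule matches Glue job state-change events and a Lambda function records each one. The sketch below assumes the event shape Glue publishes for job state changes (source `aws.glue`, detail-type `Glue Job State Change`, with `jobName`, `jobRunId`, and `state` in `detail`). The constant name and the shape of the returned record are illustrative choices.

```python
import json

# Event pattern for an EventBridge rule that matches finished Glue job runs.
GLUE_STATE_CHANGE_PATTERN = {
    "source": ["aws.glue"],
    "detail-type": ["Glue Job State Change"],
    "detail": {"state": ["SUCCEEDED", "FAILED", "TIMEOUT", "STOPPED"]},
}


def lambda_handler(event, context):
    """Flatten a Glue state-change event into one dashboard-friendly record."""
    detail = event.get("detail", {})
    record = {
        "job_name": detail.get("jobName"),
        "run_id": detail.get("jobRunId"),
        "state": detail.get("state"),
        "time": event.get("time"),
    }
    # Printing sends the record to CloudWatch Logs when run inside Lambda;
    # a real setup might write it to a database or a custom metric instead.
    print(json.dumps(record))
    return record
```

The pattern would be supplied as JSON (for example via `json.dumps(GLUE_STATE_CHANGE_PATTERN)`) when creating the rule.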

Most Common Combinations

  • Simple, native AWS dashboard: CloudWatch Logs/Metrics + CloudWatch Dashboard.
  • More customizable with historical analysis: Glue logs to S3 → Athena → QuickSight.
  • Real-time and open source: Glue logs to OpenSearch → OpenSearch Dashboards (a fork of Kibana).
  • Automated API-based updates: Lambda fetching Glue job runs → CloudWatch custom metrics → Dashboard.
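
The last combination (Lambda → custom metrics → dashboard) could look roughly like the sketch below. It assumes boto3 clients for Glue and CloudWatch are passed in. The metric name `JobRunCount`, the namespace `Custom/GlueJobs`, and both function names are hypothetical choices, not AWS defaults.

```python
def build_metric_data(job_name, runs):
    """Turn Glue JobRuns into CloudWatch MetricData entries, one per state."""
    counts = {}
    for run in runs:
        state = run["JobRunState"]
        counts[state] = counts.get(state, 0) + 1
    return [
        {
            "MetricName": "JobRunCount",  # hypothetical metric name
            "Dimensions": [
                {"Name": "JobName", "Value": job_name},
                {"Name": "State", "Value": state},
            ],
            "Value": float(count),
            "Unit": "Count",
        }
        for state, count in sorted(counts.items())
    ]


def publish_job_status(glue_client, cloudwatch_client, job_name,
                       namespace="Custom/GlueJobs", max_runs=25):
    """Fetch recent runs for one job and publish per-state counts."""
    runs = glue_client.get_job_runs(JobName=job_name, MaxResults=max_runs)["JobRuns"]
    metric_data = build_metric_data(job_name, runs)
    if metric_data:  # skip the call when the job has no runs yet
        cloudwatch_client.put_metric_data(Namespace=namespace, MetricData=metric_data)
    return metric_data
```

Each invocation publishes a snapshot of the most recent runs, so a dashboard widget would more sensibly chart it with a Maximum or Average statistic than with Sum, which would count the same runs repeatedly.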

If you tell me your exact goals (simple status or complex trends), I can outline a more tailored setup.

Why didn't you mention AWS Airflow?

Good catch. Amazon Managed Workflows for Apache Airflow (MWAA) is absolutely relevant in some scenarios, but here's why it didn’t make the initial list:

  • Airflow itself doesn’t inherently “dashboard” Glue job statuses—it orchestrates Glue jobs.

  • You can use Airflow to track and visualize Glue job status within DAG runs, but:

    • Airflow UI focuses on workflow orchestration, not metric dashboards (like CloudWatch or QuickSight).
    • Airflow’s Grid View (called Tree View before Airflow 2.3) and Graph View show task status (success/failed) within DAG runs but don’t provide trend charts, historical aggregates, or multi-job dashboards natively.

When Airflow is relevant:

  • You orchestrate multiple Glue jobs with dependencies → Airflow DAG UI helps see which steps succeeded/failed.
  • You use custom sensors/operators to query Glue APIs and create task-level views.
  • You extend Airflow with custom plugins or export its metrics to an external tool (e.g., Airflow's StatsD metrics visualized in Grafana).
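
As an illustration of the custom-sensor idea above, here is a minimal sketch of a polling function that an Airflow `PythonSensor` could use as its `python_callable`. It assumes a boto3 Glue client; the function name and the set of failure states are choices made for this example. The Amazon provider package for Airflow already ships a Glue job operator and sensor, which are likely the better starting point in practice.

```python
# States treated as terminal failures in this sketch.
TERMINAL_FAILURES = {"FAILED", "TIMEOUT", "STOPPED", "ERROR", "EXPIRED"}


def glue_run_succeeded(glue_client, job_name, run_id):
    """Return True once the run has succeeded, False while it is in progress.

    Raises RuntimeError on a terminal failure so the calling task fails too.
    """
    run = glue_client.get_job_run(JobName=job_name, RunId=run_id)["JobRun"]
    state = run["JobRunState"]
    if state == "SUCCEEDED":
        return True
    if state in TERMINAL_FAILURES:
        raise RuntimeError(f"Glue job {job_name} run {run_id} ended in {state}")
    return False  # still starting, running, stopping, or waiting: keep polling
```

Returning `False` tells the sensor to poke again later, while the exception marks the Airflow task as failed, so the DAG view reflects the Glue outcome.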

When Airflow isn't enough alone:

  • You want cross-job trends, success rates, runtime metrics → CloudWatch / QuickSight is more suitable.
  • You want long-term historical metrics and dashboards → Athena + QuickSight or OpenSearch is more flexible.
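
To make "success rates" and "runtime metrics" concrete, the sketch below computes both from a list of run records shaped like the `JobRuns` entries returned by the Glue API (`JobRunState`, plus `ExecutionTime` in seconds). The function name and the returned keys are illustrative. A real trend dashboard would run the same arithmetic per day or per job in Athena or QuickSight rather than in Python.

```python
FINISHED_STATES = {"SUCCEEDED", "FAILED", "TIMEOUT", "ERROR"}


def run_statistics(runs):
    """Summarize finished Glue runs: success rate and mean execution time."""
    finished = [run for run in runs if run["JobRunState"] in FINISHED_STATES]
    if not finished:
        return {"finished": 0, "success_rate": None, "mean_seconds": None}
    succeeded = [run for run in finished if run["JobRunState"] == "SUCCEEDED"]
    # Runs missing ExecutionTime are counted as zero seconds in this sketch.
    total_seconds = sum(run.get("ExecutionTime", 0) for run in finished)
    return {
        "finished": len(finished),
        "success_rate": len(succeeded) / len(finished),
        "mean_seconds": total_seconds / len(finished),
    }
```

Runs that are still in progress are ignored so they do not distort the rate.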

Summary: Airflow = orchestration monitoring (good for process view). CloudWatch/QuickSight = operational dashboards (good for success/failure rates, runtimes, trends).

If you’re using Airflow to run Glue jobs, I can suggest how to extend Airflow with better dashboarding—just say the word.
