For agentic workers: REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (
- [ ]) syntax for tracking.
Goal: Wire the PR-448 multipart upload endpoints into the full dataset pipeline (S3 combine → Parquet schema inference → Glue + collection dataset registration) as a 202-Accepted background pipeline, backed by a completion_status state machine with TTL-based DDB row retention.
Architecture: POST /complete returns 202 immediately. Work runs post-response via BackgroundTasks as a three-step chain (combine → infer → register) with per-step try/except rollback. Status polled via GET /multipart/{upload_id}. Terminal multipart_uploads_v1 rows expire via DynamoDB TTL (30d success/abort, 90d failure, 7d orphan). Terraform PR adds S3 AbortIncompleteMultipartUpload lifecycle rule + enables DDB TTL.
Tech Stack: