
@judell
Created March 25, 2026 04:59
Plan: Stop Committing Generated ICS/JSON Files to the Repo (issue #41)

Current State

  • CI workflow commits cities/*/events.json, cities.json, and version.txt to git after every build
  • Individual .ics files from scrapers/feeds are also committed
  • The app does not serve these files to users — it queries Supabase directly
  • The load-events edge function fetches events.json from raw.githubusercontent.com to upsert into Supabase
  • cities.json and version.txt ARE used by the frontend at load time (via GitHub Pages)

The Core Tension

The issue proposes GitHub Releases as a data archive, but there's a critical dependency: load-events fetches events.json from the raw git content. If we stop committing those files to main, we need an alternative path for getting data into Supabase.


Step-by-Step Plan

Phase 1: Change the data ingestion path (must happen FIRST)

Option A — Direct upload from CI to Supabase (recommended, simplest)

  • Instead of: CI commits → git push → edge function fetches from GitHub → upserts to Supabase
  • Do: CI generates events.json → CI calls load-events with the data inline (POST body) or CI directly calls the Supabase REST API
  • This eliminates the round-trip through git entirely
  • The load-events edge function already accepts a city list in the POST body; we'd extend it to accept event data too, OR we bypass it and use the Supabase REST API directly from CI
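A minimal sketch of what the Option A step could look like. The {city, events} body shape and the SUPABASE_URL / SUPABASE_SERVICE_ROLE_KEY secret names are assumptions, not the current load-events contract:

```shell
# Hypothetical Option A payload: build a POST body containing one city's
# events, ready to send to load-events (sample data, throwaway directory)
cd "$(mktemp -d)"
mkdir -p cities/santarosa
cat > cities/santarosa/events.json <<'EOF'
[{"title": "Farmers Market", "start": "2026-04-01T09:00:00-07:00"}]
EOF

# Build the request body: one city plus its events array
payload=$(jq -n --arg city santarosa \
  --slurpfile events cities/santarosa/events.json \
  '{city: $city, events: $events[0]}')
echo "$payload"

# In CI this payload would be POSTed directly, e.g.:
#   curl -sS -X POST "$SUPABASE_URL/functions/v1/load-events" \
#     -H "Authorization: Bearer $SUPABASE_SERVICE_ROLE_KEY" \
#     -H "Content-Type: application/json" \
#     -d "$payload"
```

This keeps Supabase as the single sink for event data; git never sees events.json in transit.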

Option B — Upload to GitHub Release, fetch from there

  • CI creates a release tarball; load-events fetches from the latest release instead of from raw git
  • More complex (release API, finding latest asset URL) with no real benefit over Option A

Recommendation: Option A. Modify the workflow to POST event data directly to the edge function or Supabase API, removing the git-as-transport pattern entirely.

Phase 2: Stop tracking generated data files on main

  1. Add to .gitignore:

    cities/*/*.ics
    cities/*/events.json
    cities/*/combined.ics
    
  2. git rm --cached to untrack existing files (keeps them locally):

    git rm --cached cities/*/*.ics cities/*/events.json
  3. Keep committing cities.json and version.txt — these are small, stable, and consumed by the frontend from GitHub Pages
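A quick self-contained sanity check for Phase 2, run in a throwaway repo (paths follow the plan; the repo itself is a toy):

```shell
# After the .gitignore rules are in place, generated files should be
# both ignored and absent from git status
cd "$(mktemp -d)"
git init -q .
mkdir -p cities/santarosa
echo '[]' > cities/santarosa/events.json
printf '%s\n' 'cities/*/*.ics' 'cities/*/events.json' 'cities/*/combined.ics' > .gitignore
git add .gitignore

git check-ignore cities/santarosa/events.json   # exits 0 and prints the path: ignored
git status --porcelain -- cities                # prints nothing: untracked and ignored
```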

Phase 3: Archive data to an orphan data branch

Create an orphan data branch (no shared history with main) for analysis, debugging, and auditing:

- name: Archive data to data branch
  run: |
    # Save generated files, preserving the cities/*/ directory layout
    # (a flat copy would collide on the per-city events.json filenames)
    tar cf /tmp/build-data.tar cities/*/events.json cities/*/*.ics report.json 2>/dev/null || true

    # Check out the data branch in a separate worktree so the main
    # checkout is never disturbed; create it as an orphan on first run
    if git fetch origin data; then
      git worktree add -B data /tmp/data-branch origin/data
    else
      git worktree add --orphan -b data /tmp/data-branch
    fi

    # Unpack the new data into the worktree and commit
    tar xf /tmp/build-data.tar -C /tmp/data-branch
    cd /tmp/data-branch
    git add cities/*/events.json cities/*/*.ics report.json
    git commit -m "Build $(date +%Y%m%d-%H%M%S)"
    git push origin data

    # Clean up the worktree
    cd - && git worktree remove /tmp/data-branch

Why a data branch:

  • Diffing is trivial: git diff data~1..data -- cities/santarosa/events.json
  • No noise on main: the data branch never merges into main
  • Full git history of generated data, searchable with git log
  • Timezone tests / report.json auditing can check out data branch or fetch individual files via raw.githubusercontent.com/.../data/...
  • Per-source event count regressions are visible in git diff
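The auditing workflow above can be exercised end to end in a throwaway repo. Branch and file names follow the plan; the event payloads are made up:

```shell
# Toy data branch with two builds, then a per-city event-count comparison
# across them using git show + jq
cd "$(mktemp -d)"
git init -q .
git config user.email ci@example.com
git config user.name "CI"
mkdir -p cities/santarosa
echo '[{"title":"a"},{"title":"b"}]' > cities/santarosa/events.json
git add cities
git commit -qm "Build 1"
git branch -m data          # stand-in for the orphan data branch

echo '[{"title":"a"}]' > cities/santarosa/events.json
git add cities
git commit -qm "Build 2"

# Per-source event-count regression check
old=$(git show data~1:cities/santarosa/events.json | jq 'length')
new=$(git show data:cities/santarosa/events.json | jq 'length')
echo "santarosa: $old -> $new"   # → santarosa: 2 -> 1
```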

Phase 4: Optional — GitHub Releases for long-term snapshots

For periodic archival with pruning (complements the data branch):

- name: Create data release
  run: |
    TAG="build-$(date +%Y%m%d-%H%M%S)"
    tar czf data.tar.gz cities/*/events.json cities/*/*.ics
    gh release create "$TAG" data.tar.gz \
      --title "Build $TAG" \
      --notes "Auto-generated calendar data"

- name: Prune old releases (keep last 30)
  run: |
    gh release list --limit 100 --json tagName,createdAt \
      | jq -r '.[30:][].tagName' \
      | xargs -I{} gh release delete {} --yes --cleanup-tag

Phase 5: Update load-events edge function

Modify supabase/functions/load-events/index.ts to accept event data directly in the POST body instead of fetching from GitHub:

// Current: fetches from raw.githubusercontent.com
// New: accepts events array in request body per city
// Fallback: can still fetch from data branch URL if needed

The workflow step changes from triggering a fetch to posting data directly.

Phase 6: Update tests

  • tests/test_timezone_pipeline.py already uses synthetic ICS data — no changes needed
  • Browser tests (test.html) query Supabase — no changes needed
  • Regression tests use the live app — no changes needed
  • If any future test needs real build data, it can check out the data branch

Phase 7: Clean up workflow

Remove from generate-calendar.yml:

  • The git add/commit/push block (lines ~441-478) that commits generated files to main
  • The rebase-conflict retry logic (no longer needed)
  • Keep the git push for cities.json and version.txt only

What Gets Simpler

  • git pull stops conflicting — no more "always expect push to fail" pattern
  • Git history becomes meaningful — only real code changes on main
  • Repo size stops growing on main — no more ~800 ICS files per build
  • CI is faster — no rebase/retry loop

What Needs Care

  • load-events edge function must be updated before we stop committing (Phase 1 before Phase 2)
  • Edge function JWT gotcha still applies if we redeploy load-events
  • Rollback path: if Supabase is down, we can't recover from main anymore — but we CAN recover from the data branch or latest GitHub Release
  • cities.json and version.txt should still be committed to main (small, used by frontend)
  • data branch size will grow over time — consider periodic squashing or starting a fresh orphan if it gets unwieldy
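The periodic squashing mentioned above can be done by replacing the data branch with a fresh one-commit orphan. A self-contained sketch in a toy repo (run manually; it rewrites the branch, so anything pinned to old SHAs must refetch):

```shell
# Set up a toy data branch with two builds
cd "$(mktemp -d)"
git init -q .
git config user.email ci@example.com
git config user.name "CI"
echo one > events.json && git add . && git commit -qm "Build 1"
git branch -m data
echo two > events.json && git add . && git commit -qm "Build 2"

# Squash: a new orphan branch inherits the current index, so content is
# preserved while history collapses to a single commit
git checkout -q --orphan data-squashed
git commit -qm "Squashed data history"
git branch -M data-squashed data
count=$(git rev-list --count data)
echo "commits on data: $count"   # → commits on data: 1

# On the real repo this would be followed by: git push --force origin data
```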

Execution Order

  1. Modify load-events to accept inline data (or switch to direct Supabase REST calls from CI)
  2. Test the new ingestion path on one city
  3. Create the orphan data branch
  4. Add the data-branch archive step to the workflow
  5. Update .gitignore and git rm --cached
  6. Remove the old commit/push logic from the workflow
  7. Optionally add GitHub Releases for long-term snapshots
  8. Test a full CI run end-to-end