F1ReplayTiming pulls its data from four distinct external sources and uses two storage backends to persist processed data. The primary data source is the FastF1 Python library, which itself wraps the official F1 timing API (livetiming.formula1.com) to supply historical session data — laps, telemetry, weather, race control messages, driver/team metadata, circuit geometry, and event schedules. For live sessions, the app connects directly to the F1 SignalR real-time stream (wss://livetiming.formula1.com/signalrcore) via WebSocket. A photo-based broadcast sync feature uses the OpenRouter AI API (specifically Gemini Flash vision model) to extract leaderboard data from screenshots. Pre-computed session data is stored either on the local filesystem or in Cloudflare R2 (S3-compatible object storage).
┌──────────────────────────────────────────┐
│ External Data Sources │
├──────────────┬───────────────┬───────────┤
│ FastF1 Lib │ F1 SignalR WS │ OpenRouter│
│ (Ergast + │ (Live Stream) │ (Vision │
│ F1 API) │ │ AI API) │
└──────┬───────┴───────┬───────┴─────┬─────┘
│ │ │
┌─────────────▼──────┐ ┌─────▼─────┐ ┌───▼────────────┐
│ f1_data.py │ │ live_ │ │ sync.py │
│ (Data Processing) │ │ signalr.py│ │ (Photo Sync) │
└────────┬───────────┘ └─────┬─────┘ └───┬────────────┘
│ │ │
▼ ▼ │
┌────────────────┐ ┌────────────────┐ │
│ process.py │ │ live_state.py │ │
│ (ETL Pipeline) │ │ (State Mgr) │ │
└────────┬───────┘ └────────┬───────┘ │
│ │ │
▼ ▼ │
┌────────────────┐ ┌────────────────┐ │
│ storage.py │ │ WebSocket to │ │
│ (Local / R2) │ │ Frontend │ │
└────────┬───────┘ └────────────────┘ │
│ │
▼ ▼
┌────────────────────────────────────────────┐
│ Frontend (Next.js) │
│ REST API + WebSocket consumers │
└────────────────────────────────────────────┘
The primary and most substantial data source is the FastF1 open-source Python library (version ≥3.8.1)1. FastF1 is an unofficial Python library that retrieves Formula 1 timing, telemetry, and session data from the official F1 live timing API and the Ergast API. The project explicitly calls it out as the foundation: "FastF1 is the original inspiration and data source for this project"2.
All data extraction happens in backend/services/f1_data.py3, which imports and uses FastF1 to load session data. The key data categories are:
| Data Type | FastF1 API Call | Data Extracted | Output File |
|---|---|---|---|
| Event schedule | fastf1.get_event_schedule(year) |
Round numbers, country, event name, location, session dates (UTC) | seasons/{year}/schedule.json |
| Session info | session.results |
Driver abbreviations, numbers, full names, team names, team colors | sessions/{year}/{round}/{type}/info.json |
| Track geometry | session.get_circuit_info(), fastest_lap.get_telemetry() |
X/Y track outline, corner positions, marshal sectors, rotation, sector boundaries | sessions/{year}/{round}/{type}/track.json |
| Lap data | session.laps |
Driver, lap number, position, lap time, sector times (S1/S2/S3), tyre compound, tyre life, pit in/out flags | sessions/{year}/{round}/{type}/laps.json |
| Race results | session.results |
Final positions, grid positions, status (finished/retired), points, team info | sessions/{year}/{round}/{type}/results.json |
| Driver positions (replay frames) | laps.get_telemetry() per driver |
X/Y GPS coordinates sampled every 0.5s, positions, gaps, intervals, tyre info, pit status, flags, race control messages, weather | sessions/{year}/{round}/{type}/replay.json |
| Telemetry per driver per lap | lap.get_telemetry() |
Speed, throttle, brake, gear, RPM, DRS, distance | sessions/{year}/{round}/{type}/telemetry/{ABBR}.json |
| Race control messages | session.race_control_messages |
Steward messages, penalties, investigations, flags, sector-level yellow flags | Embedded in replay frames |
| Weather data | session.load(weather=True) |
Air/track temperature, humidity, wind, rainfall | Embedded in replay frames |
The session loading call in _load_session() requests all four data categories at once4:
session = fastf1.get_session(year, round_num, session_type)
session.load(
telemetry=True,
laps=True,
weather=True,
messages=True,
)FastF1 uses its own internal caching layer; the app configures a persistent cache directory (FASTF1_CACHE_DIR or .fastf1-cache)5 so repeat fetches are fast. An in-memory session cache (_session_cache) also prevents redundant loads within a single process lifetime6.
Data from FastF1 is fetched in three ways:
- On-demand: When a user selects a session not yet processed,
ensure_session_data()inprocess.pytriggers the full ETL pipeline7. - Bulk pre-compute: The
precompute.pyCLI script processes sessions ahead of time8. - Auto-precompute: A background task (
auto_precompute.py) runs every 30 minutes on Fri–Mon, checking the schedule for new sessions and automatically processing them9.
For live session timing during race weekends, the app connects directly to the official Formula 1 SignalR Core endpoint10:
- HTTP negotiate endpoint:
https://livetiming.formula1.com/signalrcore/negotiate?negotiateVersion=1 - WebSocket endpoint:
wss://livetiming.formula1.com/signalrcore
This is the same real-time data feed that powers the official F1 TV and F1 app timing screens.
The client subscribes to 13 topics covering all aspects of live timing11:
| Topic | Data |
|---|---|
TimingData |
Per-driver timing (gaps, intervals, sector times) |
TimingAppData |
Extended timing (stint info, tyre data) |
TimingStats |
Session statistics (personal best laps) |
DriverList |
Driver metadata (number, abbreviation, team color) |
RaceControlMessages |
Steward decisions, penalties, flags |
TrackStatus |
Green/yellow/SC/VSC/red flag status |
WeatherData |
Temperature, humidity, wind, rainfall |
LapCount |
Current lap / total laps |
ExtrapolatedClock |
Session clock (remaining time) |
SessionInfo |
Session metadata |
SessionStatus |
Session lifecycle (started, finished, etc.) |
SessionData |
Additional session data |
Position.z |
GPS car positions (compressed with zlib) |
The LiveSignalRClient class in live_signalr.py12 handles:
- HTTP negotiation to obtain a
connectionTokenand AWS load-balancer cookie (AWSALBCORS)13 - WebSocket connection with SignalR JSON protocol handshake
- Topic subscription via a single
Subscribeinvocation - Decompression of
.ztopics (base64 + zlib deflate)14 - Handling of multiplexed
feedmessages containing multiple topic updates15 - Automatic reconnection with exponential backoff (1s → 30s max)16
- Server ping/pong keep-alive handling
Incoming SignalR messages are incremental deltas. The LiveStateManager in live_state.py17 accumulates these into a complete session state, maintaining per-driver state objects that track position, gaps, tyres, pit stops, flags, GPS coordinates, and more — producing frames in the same shape as the replay system.
For development/testing, the LiveTestReplayer (live_test_replayer.py)18 can replay .jsonStream files downloaded from the F1 static API (livetiming.formula1.com) with original timing, simulating a live session from recorded data files.
The broadcast sync feature uses AI vision to extract leaderboard data from screenshots of F1 TV broadcasts. This is powered by the OpenRouter API (https://openrouter.ai/api/v1/chat/completions) using the google/gemini-2.0-flash-001 model19.
- User uploads a photo/screenshot of the F1 timing tower
- The image is converted to JPEG (handles HEIC, PNG, etc.) and resized to max 1200px20
- The image is sent to Gemini Flash via OpenRouter with a detailed extraction prompt21
- The AI extracts: lap number, gap mode (leader/interval), and per-driver position, abbreviation, gap to leader, and tyre compound
- The extracted data is matched against pre-computed replay frames to find the closest timestamp22
Requires an OPENROUTER_API_KEY environment variable. This feature is optional — manual entry of gap times works without it23.
The compute_pit_loss.py and compute_pit_loss_v2.py scripts24 compute average pit time loss per circuit from previously processed session data. This computed data is itself derived from FastF1 data but becomes a standalone data source once computed:
- Average pit loss under green flag conditions
- Average pit loss under Safety Car
- Average pit loss under Virtual Safety Car
This data feeds the pit position prediction feature, which estimates where a driver would rejoin if they pitted now25.
Once raw data is fetched from external sources and processed, it is stored in one of two backends. These become the primary data sources for the frontend at runtime — the frontend never talks to FastF1 or the F1 API directly.
Default storage backend. JSON files are written to DATA_DIR (default: ./data)26. Data is stored as uncompressed JSON.
Optional remote storage backend, activated by setting STORAGE_MODE=r227. Uses boto3 with a custom Cloudflare endpoint. Data is stored as gzipped JSON. Requires R2_ACCOUNT_ID, R2_ACCESS_KEY_ID, and R2_SECRET_ACCESS_KEY environment variables28.
The storage.py abstraction layer29 provides a unified API (put_json, get_json, exists, list_keys) that delegates to the configured backend.
seasons/
{year}/
schedule.json ← event schedule
sessions/
{year}/
{round}/
{session_type}/
info.json ← session/driver metadata
track.json ← circuit geometry
laps.json ← lap-by-lap data
results.json ← final results
replay.json ← frame-by-frame replay data
telemetry/
{DRIVER_ABBR}.json ← per-driver telemetry
pit_loss.json ← precomputed pit loss times
| # | Source | URL / Endpoint | Type | Purpose | Required? |
|---|---|---|---|---|---|
| 1 | FastF1 (wraps F1 Timing API + Ergast) | api.formula1.com, livetiming.formula1.com |
REST / HTTP | Historical session data, telemetry, schedules | Yes (core) |
| 2 | F1 SignalR Stream | wss://livetiming.formula1.com/signalrcore |
WebSocket (SignalR) | Real-time live timing during sessions | For live feature |
| 3 | OpenRouter API (Gemini Flash) | https://openrouter.ai/api/v1/chat/completions |
REST / HTTP | AI vision for photo-based broadcast sync | Optional |
| 4 | Local filesystem / Cloudflare R2 | Local disk or {account}.r2.cloudflarestorage.com |
File I/O / S3 | Persistent storage for processed data | Yes (one of) |
- High confidence: All four data sources are clearly documented in the code with explicit URLs, import statements, and API calls. The FastF1 dependency is declared in
requirements.txtand used extensively throughoutf1_data.py. The SignalR endpoint is hardcoded. The OpenRouter integration is fully visible insync.py. The storage backends are well-abstracted instorage.pyandr2_storage.py. - No ambiguity: There are no hidden or undocumented data sources. The frontend consumes only the backend's REST API and WebSocket endpoints — it has no independent external data fetches.
Footnotes
-
backend/requirements.txt:3—fastf1>=3.8.1↩ -
README.md:255— "FastF1 is the original inspiration and data source for this project" ↩ -
backend/services/f1_data.py:1-14— imports and FastF1 cache setup ↩ -
backend/services/f1_data.py:200-206—session.load(telemetry=True, laps=True, weather=True, messages=True)↩ -
backend/services/f1_data.py:17-28— FastF1 cache directory configuration ↩ -
backend/services/f1_data.py:31-32—_session_cache: dict[str, fastf1.core.Session] = {}↩ -
backend/services/process.py:131-178—ensure_session_data()on-demand processing ↩ -
backend/precompute.py— CLI bulk pre-compute script ↩ -
backend/services/auto_precompute.py:1-7— auto-precompute background task documentation ↩ -
backend/services/live_signalr.py:36-38— SignalR URL constants ↩ -
backend/services/live_signalr.py:42-56—_TOPICSlist ↩ -
backend/services/live_signalr.py:78-98—LiveSignalRClientclass ↩ -
backend/services/live_signalr.py:188-242—_negotiate()method ↩ -
backend/services/live_signalr.py:400-409—.ztopic decompression ↩ -
backend/services/live_signalr.py:413-450—feedmessage handling ↩ -
backend/services/live_signalr.py:58-60— reconnect backoff constants ↩ -
backend/services/live_state.py:1-7— LiveStateManager documentation ↩ -
backend/services/live_test_replayer.py:1-12— replayer documentation ↩ -
backend/routers/sync.py:23-24—OPENROUTER_URLandVISION_MODELconstants ↩ -
backend/routers/sync.py:60-72—_convert_to_jpeg()image processing ↩ -
backend/routers/sync.py:26-57—EXTRACT_PROMPTfor Gemini vision ↩ -
backend/routers/sync.py:146-235—_match_frame()matching algorithm ↩ -
README.md:249-251— photo sync feature description ↩ -
backend/compute_pit_loss.py:1-12— pit loss computation documentation ↩ -
backend/routers/live.py:58-67— pit loss data loaded for live sessions ↩ -
backend/services/storage.py:31-32—_data_dir()local storage path ↩ -
backend/services/storage.py:23-24—_mode()checksSTORAGE_MODEenv var ↩ -
backend/services/storage.py:70-76— R2 credential requirements ↩ -
backend/services/storage.py:1-9— storage abstraction layer documentation ↩