OMI All Hands – DWG Update – 03/27/2025

OMI Data Pipeline

  • Discussion around central repo UX with Kent. Cheezy has begun a refactor of the front page.
  • Jimmy returns! He was without internet for a few weeks.
  • Several PRs are out there that need review.
  • Could still use more help on the central repo.
  • Dr. Head is providing some cool resizing utils for the pipelines; we're currently working on integrating them.

Merged

  • #184: Auth page and header design update
  • #178: Update JSONL test image urls
  • #177: Jsonl upload prototype

Peer Review

  • #186: Keep additional metadata information
  • #185: Fixes to aws infrastructure

Graphcap

  • We are in alpha testing! Currently working to onboard folks. Sign up via the [Alpha Test Form](https://docs.google.com/forms/d/e/1FAIpQLSezl0Z2hPnW8zeNeCdzINiQWhwhR52yEIoCpwUUz2J7GlAiBw/viewform?usp=sharing).
  • The current primary focus is setting up the system to handle distributed compute. We are currently looking at an event-based system with Kafka.
  • Reworking the inference server into an inference bridge that is completely devoid of state/configuration knowledge, so it works well with distributed loads and setups.
  • Currently using browser storage for captions while we sort out the distributed storage flow.
  • Ollama support is now working. Their function calling and structured output are pretty rough compared to others, but it works. I get failures in places where I don't with vLLM, so more retry logic is needed (a minimal retry sketch follows this list).
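As a hedge against those failures, a minimal retry sketch could look like the following. This is a hypothetical helper, not actual Graphcap code, and it assumes the provider call returns structured output as a JSON string:

```python
# Hypothetical retry helper, not actual Graphcap code: wraps any provider call
# (Ollama, vLLM, ...) that should return structured output as a JSON string,
# and retries with a simple linear backoff when the output is malformed.
import json
import time


def caption_with_retries(call_model, max_attempts=3, delay_s=1.0):
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            raw = call_model()
            return json.loads(raw)  # reject malformed structured output
        except (json.JSONDecodeError, RuntimeError) as err:
            last_error = err
            time.sleep(delay_s * attempt)  # back off a little more each attempt
    raise RuntimeError(f"caption failed after {max_attempts} attempts: {last_error}")
```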

I'll include some more info on the current architecture at the end of this update.

New Stuff

Initial UI Merged to main

Contains basic perspective captioning


Perspective Management


Repository & Project Status

Merged

  • #27 : Graphcap Alpha Client
  • #28 : Perspective Library Wizard

Active

  • #32 : Provider Config & Inference Bridge

On Deck

  • Fixes for dataset upload
  • Content index pipeline & save to Postgres
  • Save annotations in Postgres
  • UX/Onboarding tweaks
  • Batch captioning / Synthesizer Workflow

Architecture & Technical Direction

Current Architecture

Graphcap follows a modular, service-based architecture designed for small to medium deployments with a local-first approach:

  • React Client: Serves as both the UI and system orchestrator
  • Data Service: Manages database operations via PostgreSQL
  • Inference Bridge: Stateless service that performs AI captioning (a minimal sketch follows the diagram below)
  • Media Server: Handles image storage and processing
┌─────────────────────────────────────┐
│           React Client              │
│           (Orchestrator)            │
└───────┬─────────┬─────────┬─────────┘
        │         │         │
        ▼         ▼         ▼
┌───────────┐ ┌───────────┐ ┌─────────────┐
│  Data     │ │ Inference │ │ Media Server│
│  Service  │ │ Bridge    │ │             │
└─────┬─────┘ └─────┬─────┘ └──────┬──────┘
      │             │              │
      ▼             │              │
┌──────────┐        │              │
│PostgreSQL│        │              │
└──────────┘        │              │
                    │              │
                    ▼              ▼
             ┌─────────────────────────┐
             │  Workspace Volume       │
             │  (Shared Storage)       │
             └─────────────────────────┘
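To make the "stateless" design concrete, here is a minimal sketch of what an inference-bridge endpoint could look like. The `/caption` path, field names, and `run_inference` stub are illustrative assumptions, not the actual Graphcap API:

```python
# Minimal sketch of a stateless inference-bridge endpoint (FastAPI).
# The /caption path, field names, and run_inference stub are assumptions,
# not the real Graphcap API.
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()


class CaptionRequest(BaseModel):
    image_path: str   # path inside the shared workspace volume
    perspective: str  # perspective schema to apply
    provider: dict    # full provider config travels with the request,
                      # so the bridge holds no configuration state of its own


class CaptionResult(BaseModel):
    image_path: str
    perspective: str
    caption: dict     # structured output matching the perspective schema


def run_inference(image_path: str, perspective: str, provider: dict) -> dict:
    # placeholder for the actual provider call (vLLM, Ollama, ...)
    return {"perspective": perspective, "summary": "placeholder"}


@app.post("/caption", response_model=CaptionResult)
def caption(req: CaptionRequest) -> CaptionResult:
    result = run_inference(req.image_path, req.perspective, req.provider)
    return CaptionResult(image_path=req.image_path, perspective=req.perspective, caption=result)
```

Because all configuration arrives with each request, any number of identical bridges can sit behind a load balancer or, later, behind Kafka.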

Evolution with Kafka

As we scale beyond single-machine deployments, we're integrating Kafka to connect system components and enable distributed processing:

┌─────────────────┐                         ┌─────────────────┐
│                 │                         │                 │
│  React Client   │─────────────────────────│ Inference Bridge│
│  (Orchestrator) │                         │     Pool        │
│                 │◀────────────┐           └────────┬────────┘
└─────┬───────────┘             │                    │
      │                         │                    │
      │                         │                    │
      │                         │                    │ Process
      │                         │                    │ Requests
      │                         │                    │
      ▼                         │                    ▼
┌─────────────────┐    ┌────────┴────────┐    ┌─────────────────┐
│                 │    │                 │    │                 │
│  Data Service   │───▶│    Kafka        │◀──│  Inference      │
│                 │    │  Event Bus      │    │  Results        │
└─────┬───────────┘    └────────┬────────┘    └─────────────────┘
      │                         │                                
      │                         │                                
      ▼                         │                               
┌─────────────────┐             │                               
│   PostgreSQL    │◀────────────┘                               
│                 │                                            
└─────────────────┘                                             

Why Kafka for Our Perspective System

Using Kafka as our event backbone provides several advantages:

  1. Scalable Perspective Processing: Multiple Inference Bridges can process perspectives in parallel

  2. Decoupled Components: Services can scale independently based on demand

  3. Support for Perspective Pipelines: Process complex perspective chains (such as generating several perspective results and then synthesizing captions from them across all images in a dataset)

  4. Batch Job Processing: Our batch system can queue large numbers of images and perspectives for processing across distributed workers

  5. Fault Tolerance: Work can be retried and resumed if any component fails

The batch job system we're building will track progress in PostgreSQL while allowing distributed processing through Kafka, making it possible to process thousands of images with multiple perspectives efficiently.
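As a rough illustration of that split (progress tracked relationally, work fanned out through Kafka), creating a batch job could look like the sketch below. The table, columns, and message fields are assumptions; the topic name matches the structure described in the next section:

```python
# Sketch only: record a batch job in PostgreSQL, then publish one
# job_items.pending message per (image, perspective) pair for distributed workers.
# Table name, columns, and message fields are illustrative assumptions.
import json
import uuid

import psycopg2
from confluent_kafka import Producer

conn = psycopg2.connect("dbname=graphcap")
producer = Producer({"bootstrap.servers": "localhost:9092"})


def create_batch_job(images: list[str], perspectives: list[str]) -> str:
    job_id = str(uuid.uuid4())
    with conn, conn.cursor() as cur:
        # progress lives in PostgreSQL so the client can follow it via jobs.status
        cur.execute(
            "INSERT INTO batch_jobs (id, total_items, status) VALUES (%s, %s, 'pending')",
            (job_id, len(images) * len(perspectives)),
        )
    for image in images:
        for perspective in perspectives:
            item = {"job_id": job_id, "image_path": image, "perspective": perspective}
            producer.produce("job_items.pending", json.dumps(item).encode("utf-8"))
    producer.flush()
    return job_id
```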

As we continue alpha testing, this architecture will help us move from browser-based storage to a fully distributed system that maintains performance as we scale.

Topic Structure

Kafka Topics for Graphcap

Core Event Topics

┌─────────────────────────────────────────────────────────────┐
│                     CAPTION PROCESSING                      │
├─────────────────────────────────────────────────────────────┤
│ captions.request                                            │
│ - Single image caption requests                             │
│ - Includes image path, perspective, provider info           │
├─────────────────────────────────────────────────────────────┤
│ captions.result                                             │
│ - Completed caption results                                 │
│ - Contains structured output based on perspective schema    │
├─────────────────────────────────────────────────────────────┤
│ captions.failed                                             │
│ - Failed caption attempts                                   │
│ - Includes error details for retry or reporting             │
└─────────────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────────┐
│                       BATCH PROCESSING                      │
├─────────────────────────────────────────────────────────────┤
│ jobs.created                                                │
│ - New batch job metadata                                    │
│ - Includes job type, configuration, priority                │
├─────────────────────────────────────────────────────────────┤
│ jobs.status                                                 │
│ - Job status changes (running, completed, failed)           │
│ - Progress updates and statistics                           │
├─────────────────────────────────────────────────────────────┤
│ job_items.pending                                           │
│ - Individual work items ready for processing                │
│ - High volume, partitioned by image collections             │
├─────────────────────────────────────────────────────────────┤
│ job_items.completed                                         │
│ - Processed work items with results                         │
├─────────────────────────────────────────────────────────────┤
│ job_items.failed                                            │
│ - Failed work items with error information                  │
└─────────────────────────────────────────────────────────────┘

Supporting Topics

┌─────────────────────────────────────────────────────────────┐
│                      MEDIA MANAGEMENT                       │
├─────────────────────────────────────────────────────────────┤
│ media.uploaded                                              │
│ - New images added to the system                            │
│ - Triggers preprocessing and thumbnail generation           │
├─────────────────────────────────────────────────────────────┤
│ media.processed                                             │
│ - Images that have completed preprocessing                  │
│ - Ready for captioning                                      │
└─────────────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────────┐
│                       SYSTEM EVENTS                         │
├─────────────────────────────────────────────────────────────┤
│ system.logs                                                 │
│ - Centralized logging from all components                   │
│ - Structured with service name, level, message              │
├─────────────────────────────────────────────────────────────┤
│ system.metrics                                              │
│ - Performance and health metrics                            │
│ - Aggregated for monitoring and dashboards                  │
├─────────────────────────────────────────────────────────────┤
│ perspectives.updated                                        │
│ - Changes to perspective library                            │
│ - Triggers inference bridge reloads                         │
└─────────────────────────────────────────────────────────────┘
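For reference, the topics above could be provisioned along these lines; the partition counts and replication factor are placeholder assumptions, not settled values:

```python
# Sketch: provisioning the Graphcap topics with confluent-kafka's AdminClient.
# Partition counts and replication factor are placeholder assumptions.
from confluent_kafka.admin import AdminClient, NewTopic

admin = AdminClient({"bootstrap.servers": "localhost:9092"})

topics = [
    # high-volume topics get more partitions so bridges can consume in parallel
    NewTopic("captions.request", num_partitions=12, replication_factor=1),
    NewTopic("captions.result", num_partitions=12, replication_factor=1),
    NewTopic("captions.failed", num_partitions=3, replication_factor=1),
    NewTopic("jobs.created", num_partitions=1, replication_factor=1),
    NewTopic("jobs.status", num_partitions=3, replication_factor=1),
    NewTopic("job_items.pending", num_partitions=12, replication_factor=1),
    NewTopic("job_items.completed", num_partitions=12, replication_factor=1),
    NewTopic("job_items.failed", num_partitions=3, replication_factor=1),
    NewTopic("media.uploaded", num_partitions=3, replication_factor=1),
    NewTopic("media.processed", num_partitions=3, replication_factor=1),
    NewTopic("system.logs", num_partitions=3, replication_factor=1),
    NewTopic("system.metrics", num_partitions=3, replication_factor=1),
    NewTopic("perspectives.updated", num_partitions=1, replication_factor=1),
]

# create_topics is asynchronous; wait on each future to surface errors
for topic, future in admin.create_topics(topics).items():
    future.result()
```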

Topic Usage Patterns

For the distributed captioning workflow (a worker-side sketch follows the steps):

  1. React Client or batch job system publishes to captions.request
  2. Inference Bridges consume from captions.request (partitioned for parallel processing)
  3. After processing, bridges publish to captions.result or captions.failed
  4. Data Service consumes results and updates database
  5. React Client (or notification system) receives updates via jobs.status
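A worker-side sketch of steps 2 and 3 might look like the following; the message field names and the `run_inference` stub are assumptions:

```python
# Sketch of an inference-bridge worker for steps 2-3: consume captions.request,
# caption the image, then publish to captions.result or captions.failed.
# Message field names and the run_inference stub are illustrative assumptions.
import json

from confluent_kafka import Consumer, Producer


def run_inference(image_path, perspective, provider):
    # placeholder for the actual provider call (vLLM, Ollama, ...)
    return {"perspective": perspective, "summary": "placeholder"}


consumer = Consumer({
    "bootstrap.servers": "localhost:9092",
    "group.id": "inference-bridge",  # all bridges share one group, so partitions spread across them
    "auto.offset.reset": "earliest",
})
consumer.subscribe(["captions.request"])
producer = Producer({"bootstrap.servers": "localhost:9092"})

while True:
    msg = consumer.poll(1.0)
    if msg is None or msg.error():
        continue
    request = json.loads(msg.value())
    try:
        caption = run_inference(request["image_path"], request["perspective"], request["provider"])
        producer.produce("captions.result", json.dumps({**request, "caption": caption}).encode("utf-8"))
    except Exception as err:
        producer.produce("captions.failed", json.dumps({**request, "error": str(err)}).encode("utf-8"))
    producer.flush()
```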

For perspective dependency chains (a chain-processor sketch follows the steps):

  1. Use job_items.completed as trigger for dependent perspectives
  2. Chain processors subscribe to job_items.completed and look for prerequisite perspectives
  3. When prerequisites are found, new requests are published to captions.request
  4. Chain recorded in batch system via batchJobDependencies table
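A chain processor for those steps could be sketched like this; the CHAINS mapping and message fields are hypothetical:

```python
# Sketch of a chain processor: when a prerequisite perspective completes,
# request the dependent perspective for the same image.
# The CHAINS mapping and message fields are illustrative assumptions.
import json

from confluent_kafka import Consumer, Producer

CHAINS = {"object_detection": "synthesized_caption"}  # prerequisite -> dependent

consumer = Consumer({"bootstrap.servers": "localhost:9092", "group.id": "chain-processor"})
consumer.subscribe(["job_items.completed"])
producer = Producer({"bootstrap.servers": "localhost:9092"})

while True:
    msg = consumer.poll(1.0)
    if msg is None or msg.error():
        continue
    item = json.loads(msg.value())
    dependent = CHAINS.get(item.get("perspective"))
    if dependent:
        # publish a follow-up request; the batch system records the link
        # in its batchJobDependencies table
        follow_up = {"image_path": item["image_path"], "perspective": dependent,
                     "provider": item.get("provider")}
        producer.produce("captions.request", json.dumps(follow_up).encode("utf-8"))
        producer.flush()
```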

This topic structure balances:

  • Throughput (high-volume topics partitioned for scale)
  • Organization (logical grouping by domain)
  • Observability (dedicated topics for failures and monitoring)
  • Flexibility (supporting different usage patterns)