@austinthecoder
Created January 31, 2026 13:58
Architectural Proposal: Hybrid Messaging Strategy for AppX

Executive Summary

As we build AppX—our new service for managing reference data such as movies, TV shows, and people (actors, directors, writers)—we have an opportunity to evaluate our inter-service communication patterns and consider an approach that better serves this use case.

This proposal recommends that AppX expose a query API alongside event publication, adopting a hybrid messaging strategy that combines the strengths of push-based events with on-demand data access. This approach aligns with industry-standard patterns for reference data services and addresses several operational challenges we've observed in our current architecture.

The recommendation is not to abandon our event-driven architecture, but to evolve it. Events remain valuable for notifications and real-time updates. However, for reference data that many services need to query across multiple dimensions, a complementary API provides significant operational and reliability benefits.

A key principle motivates this proposal: data alone isn't valuable—applications are. The value of any service isn't in the rows and columns it stores, but in how it interprets, validates, derives, and exposes that data. Business logic, temporal reasoning, calculated fields, domain expertise—these are what make an application worth building. When we publish raw data events, we're essentially saying "here's the data, you figure out what it means." Every consumer must reverse-engineer the application's intelligence or operate without it.

Consider a simple test: if raw data were truly the valuable part, we could just spin up read replicas and let other apps query the database directly. But nobody advocates for that—because we instinctively understand that direct database access is insufficient. An API isn't just a "nicer interface" to the same data; it's the application's intelligence made accessible. Events-only architecture lands somewhere in between: better than raw database access, but still distributing data stripped of application context.


Current Architecture Overview

Our current inter-service communication follows a push-only pattern:

  • Services publish events to AMQP message buses when data changes
  • Consumer applications subscribe to relevant event streams
  • Each consumer maintains a local copy of the data it needs
  • There are no query APIs between services; all data access is local

This architecture was designed to achieve loose coupling, high availability, and independent deployability. These remain valuable goals that any proposed changes should preserve.


Observed Challenges

Operating within this architecture has surfaced several challenges that become particularly acute for reference data of the kind AppX will manage.

Data Duplication at Scale

Each consuming application maintains its own copy of shared data. For widely-referenced data like titles and people, this means:

  • N copies of the same data across N consumers
  • N implementations of storage, indexing, and query logic
  • N potential sources of drift or inconsistency
  • Multiplied storage costs across the organization

Bootstrap Complexity

When a new application comes online—or an existing application needs to rebuild its local state—there is no straightforward mechanism to obtain historical data. Applications can only begin accumulating data from the point they start subscribing. This creates:

  • Extended onboarding timelines for new consumers
  • Complex coordination when rebuilding or recovering state
  • Dependency on ad-hoc data export/import processes
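
To make the bootstrap problem concrete, here is a minimal sketch of how a consumer could rebuild local state if a query API existed. Everything named here is an assumption for illustration: the paginated `fetch_page` callable, the cursor scheme, and the `id` and `version` fields are hypothetical and not part of any existing AppX interface.

```python
from typing import Any, Callable, Dict, List, Optional, Tuple

Record = Dict[str, Any]
# Hypothetical page fetcher: takes a cursor (None for the first page) and
# returns (records, next_cursor); next_cursor is None on the last page.
FetchPage = Callable[[Optional[str]], Tuple[List[Record], Optional[str]]]


def bootstrap(fetch_page: FetchPage) -> Dict[str, Record]:
    """Build a local store by paging through a (hypothetical) listing endpoint."""
    store: Dict[str, Record] = {}
    cursor: Optional[str] = None
    while True:
        records, cursor = fetch_page(cursor)
        for record in records:
            store[str(record["id"])] = record
        if cursor is None:
            return store


def apply_event(store: Dict[str, Record], event: Record) -> None:
    """Apply a change event unless the local copy is already at least as new.

    Assumes every record and event carries a monotonically increasing
    'version' field, which makes replayed or late events harmless.
    """
    key = str(event["id"])
    current = store.get(key)
    if current is None or event["version"] > current["version"]:
        store[key] = event
```

A consumer would subscribe to events first, run `bootstrap`, and then drain buffered events through `apply_event`; the version check is what makes that overlap safe.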

The Historical Data Gap

Applications can only store events they were designed to capture. When requirements evolve and an application needs data it didn't previously store, that historical data is irrecoverable through the event stream. This creates an irreversible gap that compounds over time.

Query Flexibility Constraints

Different consumers often need to access the same data through different query patterns—by ID, by relationship, by date range, by search criteria. In a push-only model, each consumer must independently implement every query pattern it might need, often duplicating significant engineering effort.


Industry Context

Our current approach draws inspiration from event sourcing, but it's worth distinguishing event sourcing within a domain boundary from event sourcing between domains.

Event Sourcing Within a Domain

Event sourcing is highly effective when a single service or bounded context uses events as its internal source of truth. The service owns its event store, can replay events at will, and maintains complete control over its data lifecycle.

Event Sourcing Between Domains

Using events as the sole mechanism for data sharing between independent services is a different pattern with different tradeoffs. In this model, each consumer becomes responsible for maintaining derived state from events they don't control—essentially operating distributed data lakes.

Industry Patterns

Research into how organizations handle inter-service data sharing reveals that pure push-only architectures are rare. Most successful distributed systems use a hybrid approach:

  • Events for notifications: "Something changed, you may want to know"
  • APIs for data access: "Here's the current/historical state when you need it"

This pattern appears consistently across organizations operating large-scale distributed systems. Events and APIs serve complementary purposes rather than competing ones.


Proposal: A Hybrid Approach for AppX

Given that AppX will manage reference data consumed by many services across the organization, we propose that AppX implement both event publication and a query API.

Why AppX Is an Ideal Candidate

AppX's characteristics make it particularly well-suited for a query API:

Characteristic            Implication
Reference data            Stable, authoritative data that many services need
Multiple consumers        High value from centralizing query logic
Multi-dimensional access  Consumers need to query by ID, by title, by person, by date, by search terms
Historical relevance      "What was the cast when this film was in production?" is a legitimate query
Moderate velocity         Not a high-frequency event stream; changes are manageable

Why Events Alone Are Insufficient for AppX

Beyond the general challenges of push-only architectures, AppX's domain model has specific characteristics that make events alone inadequate for serving consumers.

Business Logic Duplication

When AppX publishes raw data events, consuming applications must replicate AppX's business logic to derive meaningful information. If AppX determines that a title is "currently streaming" based on licensing windows, regional availability, and platform agreements, every consumer must implement that same logic—or accept raw data they can't fully interpret. An API allows AppX to expose derived conclusions, not just raw inputs.
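
As an illustration of the kind of logic that would otherwise be duplicated, here is a deliberately simplified sketch of a "currently streaming" rule. The `LicenseWindow` shape and the rule itself are assumptions; AppX's real rule would presumably also weigh platform agreements and other inputs.

```python
from dataclasses import dataclass
from datetime import date
from typing import Iterable


@dataclass(frozen=True)
class LicenseWindow:
    """Hypothetical licensing record; field names are illustrative only."""
    region: str
    platform: str
    starts: date
    ends: date  # inclusive last day of the window


def is_currently_streaming(
    windows: Iterable[LicenseWindow], region: str, today: date
) -> bool:
    """True if any license window covers `today` in the given region."""
    return any(
        w.region == region and w.starts <= today <= w.ends for w in windows
    )
```

With events alone, every consumer needs its own copy of `is_currently_streaming` and must keep it in sync as the rule changes. With an API, AppX evaluates the rule once and returns the answer.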

Temporal Data Limitations

Events are point-in-time snapshots. They capture what changed but not the full temporal richness of the domain model. AppX's consumers may need queries like "which actors were attached to this project in October?" or "show me this actor's complete filmography." Events can technically support this if every consumer builds temporal query infrastructure—but that's exactly the duplication we should avoid for reference data.
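
The October question above is an interval-overlap query. Here is a minimal sketch of what each consumer would have to build and maintain locally, assuming a hypothetical `Attachment` record with a half-open validity interval (these names are not taken from AppX's actual model):

```python
from dataclasses import dataclass
from datetime import date
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class Attachment:
    """Hypothetical record of a person being attached to a project."""
    person: str
    project: str
    attached_from: date
    attached_until: Optional[date]  # exclusive end; None means still attached


def attached_during(
    attachments: Iterable[Attachment], project: str, start: date, end: date
) -> List[str]:
    """People attached to `project` at any point in the window [start, end)."""
    return sorted(
        {
            a.person
            for a in attachments
            if a.project == project
            and a.attached_from < end
            and (a.attached_until is None or a.attached_until > start)
        }
    )
```

The query itself is short. The cost lies in every consumer having to persist full validity intervals, rather than just the latest state, in order to answer it.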

Derived and Calculated Data

AppX's domain model may include data that's calculated, aggregated, or derived from multiple sources—statistics, rankings, composite statuses. Events naturally contain raw data changes, not derived results. An API can expose calculated fields that would be impractical to distribute via events.

Silent Data Corrections

AppX may need to correct data directly in its database: bug fixes, data quality improvements, manual corrections for edge cases. These corrections often wouldn't generate events—and shouldn't, since they're not domain events but data hygiene. Without an API, consumers would have no way to receive corrected data; they'd be permanently stuck with the pre-correction values they captured from events.
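
One way a consumer could pick up silent corrections, once an authoritative source is queryable, is periodic reconciliation. The sketch below compares content fingerprints of local records against authoritative ones. It assumes records are JSON-serializable dictionaries keyed by ID; in practice the API might serve fingerprints or versions directly so that full records need not be transferred.

```python
import hashlib
import json
from typing import Any, Dict, List

Record = Dict[str, Any]


def fingerprint(record: Record) -> str:
    """Stable content hash: key order does not affect the result."""
    canonical = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def find_stale_ids(
    local: Dict[str, Record], authoritative: Dict[str, Record]
) -> List[str]:
    """IDs whose local copy is missing or differs from the authoritative one."""
    stale = []
    for record_id, record in authoritative.items():
        mine = local.get(record_id)
        if mine is None or fingerprint(mine) != fingerprint(record):
            stale.append(record_id)
    return sorted(stale)
```

A consumer would refetch the returned IDs. No event needs to have been published for the correction to propagate.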

Consumer Patterns in a Hybrid Model

Consumers can choose the pattern that best fits their needs:

  1. Query-only: Services that need occasional lookups can query AppX directly without maintaining local state.

  2. Event-driven with query fallback: Services can consume events for real-time updates while using the API for bootstrap, recovery, or historical queries.

  3. Cache with event invalidation: Services can cache query results locally and use events as cache invalidation signals.

This flexibility allows each consumer to optimize for their specific requirements rather than forcing a one-size-fits-all approach.
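
The third pattern, cache with event invalidation, can be sketched in a few lines. The `fetch` callable stands in for a call to a hypothetical AppX query API, and `on_change_event` would be wired to whatever event subscription the consumer already has; both are assumptions for illustration.

```python
from typing import Any, Callable, Dict


class InvalidatingCache:
    """Read-through cache whose entries are evicted by change events."""

    def __init__(self, fetch: Callable[[str], Any]) -> None:
        self._fetch = fetch  # e.g. a call to the (hypothetical) query API
        self._entries: Dict[str, Any] = {}

    def get(self, key: str) -> Any:
        # Serve from the cache when possible; otherwise query and remember.
        if key not in self._entries:
            self._entries[key] = self._fetch(key)
        return self._entries[key]

    def on_change_event(self, key: str) -> None:
        # The event only signals that something changed; the next get()
        # re-queries the authoritative source for the current value.
        self._entries.pop(key, None)
```

Note that the event payload is never trusted as data here. It is used purely as a signal, so the consumer does not need to interpret or store it.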


Addressing Concerns

Concern: Performance and Scalability

"If AppX has an API, it becomes a bottleneck that everyone queries."

Response:

Reference data is exceptionally well-suited for caching and horizontal scaling:

  • Cacheability: Titles, people, and studios don't change frequently. Application-level caching can serve the vast majority of requests without hitting the origin.

  • Read replicas: The API can be backed by read replicas that scale horizontally with demand.

  • Query patterns are predictable: "Get person by ID" and "Get title cast" are bounded, optimizable queries—not open-ended analytics.

  • Comparison to current state: Today, we have N services each implementing storage, indexing, and query logic. A centralized API with proper caching is often more efficient than distributed duplication.

Concern: Maintenance Burden

"AppX has a small team. Adding an API increases their maintenance load."

Response:

An API can actually reduce total organizational maintenance burden:

  • Centralized logic: Query logic is implemented once, not N times across N consumers.

  • Clear contracts: A well-documented API with versioning is easier to support than ad-hoc "why is my local copy wrong?" debugging across multiple teams.

  • Reduced support burden: Today, data issues require coordinating with each affected consumer. With an API, consumers can self-serve correct data.

  • Standard tooling: API frameworks, monitoring, and documentation are mature. We're not inventing new infrastructure.

The maintenance cost shifts from distributed debugging to centralized API stewardship—a trade that typically favors the API approach for widely-shared data.

Concern: Client SDKs Could Encapsulate Business Logic

"Instead of an API, we could distribute a client SDK that handles receiving and storing events, and provides query methods with the business logic built in."

Response:

An SDK can encapsulate business logic, but it trades one problem for another: instead of N teams duplicating business logic, you have N teams coordinating SDK versions.

  • Version drift: Consumers will inevitably run different SDK versions. A bug fix in the business logic requires every consumer to upgrade and redeploy—not just AppX. You've created deployment coupling across the organization.

  • It doesn't solve the harder problems: An SDK can encapsulate event handling and local query logic, but it doesn't help with bootstrap/historical data (the SDK can't conjure events from before you subscribed), silent data corrections (local stores still have stale data), or temporal queries (the SDK would need to replicate temporal infrastructure in every consumer's local store).

More importantly, an SDK and an API aren't mutually exclusive. An SDK could be a thin client over AppX's API—or even query a read replica directly. This gives consumers a nice developer experience (handles auth, caching, retries) while the API or read replica provides the authoritative source that solves bootstrap, historical data, and correction problems.

The SDK becomes a consumption convenience layer. The API is what makes that layer possible.
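
A thin client of this kind might look like the sketch below. The `/people/{id}` path, the injected `transport` callable, and the retry policy are all hypothetical; the point is only that the SDK wraps the API rather than replacing it.

```python
import time
from typing import Any, Callable


class AppXClient:
    """Illustrative thin client: retries transient failures, nothing more."""

    def __init__(
        self,
        transport: Callable[[str], Any],
        max_attempts: int = 3,
        backoff_seconds: float = 0.1,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self._transport = transport  # performs the actual request
        self._max_attempts = max_attempts
        self._backoff_seconds = backoff_seconds
        self._sleep = sleep  # injectable so tests do not have to wait

    def get_person(self, person_id: str) -> Any:
        path = f"/people/{person_id}"  # hypothetical endpoint
        for attempt in range(1, self._max_attempts + 1):
            try:
                return self._transport(path)
            except ConnectionError:
                if attempt == self._max_attempts:
                    raise
                # Exponential backoff: 1x, 2x, 4x the base delay.
                self._sleep(self._backoff_seconds * 2 ** (attempt - 1))
```

Because business logic stays behind the API, a fix on the AppX side reaches every consumer without an SDK upgrade.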


What This Is Not

To be clear about scope:

  • This is not a proposal to abandon events. Events remain valuable for real-time notifications and workflows that genuinely need push semantics.

  • This is not a proposal to add APIs to every service. AppX is a specific case where reference data characteristics make an API particularly valuable.

  • This is not a criticism of past decisions. Architectural patterns that made sense at one scale or context may need to evolve. This is normal and healthy.
