Migrating from Synchronous HTTP API to Kafka
Status: Accepted
Date: 2025-03-10
Our microservices currently communicate via synchronous HTTP APIs, causing latency issues and occasional disruptions when one service is unavailable. The cost of handling the full traffic volume is also high. Most of this communication, especially reads, does not require a synchronous flow, and we anticipate significantly higher request volumes in the near future. To improve resilience, scalability, and cost-efficiency, asynchronous communication is preferred.
We will transition from synchronous HTTP API calls to a Kafka-based event-driven architecture for communication between our microservices.
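To make the shift concrete, here is a minimal in-memory stand-in for a Kafka topic illustrating the publish/subscribe model we are moving to. The topic and event names are hypothetical, and a real implementation would use a Kafka client gem rather than this sketch:

```ruby
# In-memory sketch of a Kafka topic: an append-only log with an
# independent read offset per consumer group. Illustrative only.
class Topic
  def initialize
    @log = []               # append-only event log, like a Kafka partition
    @offsets = Hash.new(0)  # independent offset per consumer group
  end

  def publish(event)
    @log << event           # producer returns immediately; no consumer involved
  end

  def poll(group)
    events = @log[@offsets[group]..] || []
    @offsets[group] = @log.size
    events
  end
end

orders = Topic.new
orders.publish({ type: "order.created", id: 1 })
orders.publish({ type: "order.created", id: 2 })

# Two consumer groups read the same events independently - a slow or
# offline consumer never blocks the producer or the other consumers.
billing  = orders.poll("billing")   # => both events
shipping = orders.poll("shipping")  # => both events, again
```

Unlike a synchronous HTTP call, the producer here does not wait for any consumer, which is the decoupling property the rest of this document relies on.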
- Scalability: Kafka’s event-driven model allows simple horizontal scaling of consumers, which is critical for our anticipated traffic growth and is also cost-efficient.
- Resilience: Asynchronous messaging decouples microservices, so one service’s downtime doesn’t cascade through the system, which is especially important for writes/commands. This also lets us adopt the Saga pattern for distributed transactions.
- Cost-efficiency: A proof of concept showed that just 3 Kafka consumers can read the same volume of data as 25 Sidekiq workers polling the HTTP API. It also suggests we can scale down the upstream service’s web workers by roughly 40%, since this data will no longer be read over the HTTP API.
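The Saga pattern mentioned above can be sketched as a sequence of steps, each paired with a compensating action that undoes it if a later step fails. The step names (stock, payment, shipping) are hypothetical, not taken from our services:

```ruby
# Minimal saga sketch: run steps in order; on failure, run the
# compensations of already-completed steps in reverse order.
class Saga
  Step = Struct.new(:action, :compensation)

  def initialize
    @steps = []
  end

  def add_step(action:, compensation:)
    @steps << Step.new(action, compensation)
    self
  end

  def run
    done = []
    @steps.each do |step|
      step.action.call
      done << step
    end
    true
  rescue StandardError
    done.reverse_each { |step| step.compensation.call } # roll back in reverse
    false
  end
end

log = []
saga = Saga.new
  .add_step(action: -> { log << :reserve_stock },  compensation: -> { log << :release_stock })
  .add_step(action: -> { log << :charge_payment }, compensation: -> { log << :refund_payment })
  .add_step(action: -> { raise "shipping unavailable" }, compensation: -> { log << :noop })

result = saga.run  # => false; the two completed steps are compensated
# log == [:reserve_stock, :charge_payment, :refund_payment, :release_stock]
```

In a Kafka-based version each step would be triggered by an event and each compensation by a failure event, but the control flow is the same.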
- Operational Overhead: We need to maintain a Kafka cluster, which introduces new complexity for monitoring, alerting, and administration. Amazon MSK can mitigate much of this by providing a managed cluster.
- Kafka Learning Curve: Engineers will need to gain familiarity with event-driven design patterns and Kafka itself.
- Deployment and Migration Plan: We’ll roll out event streams incrementally to avoid a “big bang” migration. Secondary microservices will be adapted first, followed by the more critical ones.
- Continue with Synchronous HTTP: Would be simpler to maintain, but its scalability, resilience, and cost-efficiency trade-offs are not acceptable in the long run.
- Use a Different Message Broker (e.g., RabbitMQ): While viable, Kafka’s persistence and proven track record with large-scale event processing made it more appealing.