@anushshukla
Created January 16, 2023 07:52

System design questions

Product feed ingestion design for an e-commerce platform

Requirements:

  • Several systems publish product feeds to a Kafka topic.
  • A product feed carries all the inventory data about a product (quantity, price, name, brand, color, description, etc.).
  • The feeds have to be stored in a DB.
  • One API has to be exposed that returns product details by product ID, color, category, etc.
  • During ingestion, the feeds need some processing, transformation, etc.
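
The notes do not say which processing or transformation is required, so the following is only a minimal sketch of what one ingestion-time step could look like. The function name `normalize_feed` and the specific rules (trimming strings, lower-casing `color`, parsing `price` into a `Decimal`) are assumptions made for illustration, not part of the original requirements.

```python
from decimal import Decimal, InvalidOperation


def normalize_feed(raw: dict) -> dict:
    """Return a cleaned copy of one feed record (hypothetical rules)."""
    cleaned = {}
    for key, value in raw.items():
        # Trim stray whitespace from every string value.
        cleaned[key] = value.strip() if isinstance(value, str) else value

    # Lower-case the color so "Red" and "red" match the same filter.
    if isinstance(cleaned.get("color"), str):
        cleaned["color"] = cleaned["color"].lower()

    # Parse the price into a Decimal to avoid float rounding issues.
    if "price" in cleaned:
        try:
            price = Decimal(str(cleaned["price"]))
        except InvalidOperation as exc:
            raise ValueError(f"invalid price: {cleaned['price']!r}") from exc
        if not price.is_finite() or price < 0:
            raise ValueError("price must be a finite, non-negative number")
        cleaned["price"] = price

    return cleaned
```

A record that raises `ValueError` here is a natural candidate for the failure path discussed in the questions, rather than being written to the DB.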

Product feed data examples

JSON: { "product":"..", "name":"..", "color":"..", "description":"..", "price":.., "color":"", "size":".." }

JSON: { "product":"..", "name":"..", "color":"..", "description":"..", "price":.., "approvals":"", "size":".." }

Questions:

Q: How can the system handle data at a huge scale?
A: Implement it as a distributed system:

  • Clustered back-end application processes
  • Clustered database (master-slave replication)
  • Partitioned / sharded database
  • Partitioned queues with concurrent consumption enabled
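
Partitioning and sharding both come down to mapping a key to one of N buckets in a stable way. The sketch below assumes the product ID is used as that key; `partition_for` is a hypothetical helper, and SHA-256 is only a stand-in hash (Kafka's own default partitioner hashes the key with a different function).

```python
import hashlib


def partition_for(product_id: str, num_partitions: int) -> int:
    """Map a product id to a partition index in [0, num_partitions)."""
    if num_partitions <= 0:
        raise ValueError("num_partitions must be positive")
    # Use a stable hash rather than Python's built-in hash(), which is
    # randomized per process for strings, so that every producer instance
    # picks the same partition for the same key.
    digest = hashlib.sha256(product_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions
```

Because all messages for one product land in the same partition, a single consumer sees that product's updates in order, while different products can be processed concurrently.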

Q: How can the system be designed to be fault tolerant and robust? (If processing of a request fails for any reason, it should be reprocessed.)
A: Implement the following:

  • A dead letter queue that receives failed messages, which are retried in batches every 2 hours. Retries are kept idempotent using an ETag checksum and/or a last-modified timestamp.
  • Manual message commits: commit quickly for creates and deletes, and handle updates to the same entity synchronously.
  • Retries set to 0, since retrying is implemented separately.
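
The idempotency check described in this answer can be sketched as follows. Everything here is an in-memory stand-in: `FeedConsumer`, the message shape (`product`, `last_modified`, `doc`) and the returned status strings are assumptions for illustration, with a plain dict in place of the DB and a list in place of the dead letter queue.

```python
import hashlib
import json


class FeedConsumer:
    def __init__(self):
        self.db = {}            # product id -> stored record (stand-in for the DB)
        self.dead_letters = []  # failed messages awaiting the batch retry

    @staticmethod
    def checksum(doc: dict) -> str:
        # Canonical JSON so that key order does not change the checksum.
        payload = json.dumps(doc, sort_keys=True).encode("utf-8")
        return hashlib.sha256(payload).hexdigest()

    def handle(self, message: dict) -> str:
        try:
            product_id = message["product"]
            last_modified = message["last_modified"]
            doc = message["doc"]
            digest = self.checksum(doc)
        except (KeyError, TypeError):
            # Malformed message: park it for the scheduled retry.
            self.dead_letters.append(message)
            return "dead_lettered"

        current = self.db.get(product_id)
        if current is not None:
            if current["checksum"] == digest:
                return "skipped_duplicate"  # same content already stored
            if last_modified < current["last_modified"]:
                return "skipped_stale"      # older than what is stored

        self.db[product_id] = {
            "doc": doc,
            "checksum": digest,
            "last_modified": last_modified,
        }
        return "applied"

    def retry_dead_letters(self) -> list:
        # Called by the scheduled job; messages that fail again are re-queued.
        batch, self.dead_letters = self.dead_letters, []
        return [self.handle(message) for message in batch]
```

With these two checks, replaying the same message from the dead letter queue is harmless: a duplicate is skipped by checksum, and a message that was overtaken by a newer update is skipped by timestamp.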

Q: How should the data be stored, in what kind of DB, and why?
A: Any NoSQL database, since the workload is read heavy and has no relational requirements. The database should also support full-text search, so ES (Elasticsearch) would be ideal.
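
To make the lookup API concrete, here is a minimal sketch that builds an Elasticsearch Query DSL body for the filters named in the requirements. The helper `build_product_query` is hypothetical, the field names follow the feed examples (with `category` taken from the API requirement), and it assumes the exact-match fields are mapped as `keyword` while `description` is an analyzed text field.

```python
from typing import Optional


def build_product_query(product_id: Optional[str] = None,
                        color: Optional[str] = None,
                        category: Optional[str] = None,
                        text: Optional[str] = None) -> dict:
    """Build a search request body from the optional lookup criteria."""
    filters = []
    if product_id is not None:
        filters.append({"term": {"product": product_id}})
    if color is not None:
        filters.append({"term": {"color": color}})
    if category is not None:
        filters.append({"term": {"category": category}})

    bool_query = {}
    if filters:
        # Exact matches go in "filter": no scoring, and results are cacheable.
        bool_query["filter"] = filters
    if text:
        # Full-text search over the description is scored.
        bool_query["must"] = [{"match": {"description": text}}]

    if not bool_query:
        return {"query": {"match_all": {}}}
    return {"query": {"bool": bool_query}}
```

The returned dict would be sent as the body of a search request; the function itself does not talk to a cluster, which keeps it easy to unit test.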

Q: What are the high-level components?
A: Queue -> Consumer -> DB -> Dead letter queue. Dead letter queue -> Cron job -> DB. Redis as the cache.
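
The notes do not say how the cache is used, so the following assumes a cache-aside read path, which is one common choice. `ProductReader` is a hypothetical class, and plain dicts stand in for both Redis and the DB so the sketch runs on its own.

```python
class ProductReader:
    def __init__(self, db: dict):
        self.db = db       # stand-in for the product store
        self.cache = {}    # stand-in for Redis
        self.db_reads = 0  # counts how often the DB was actually hit

    def get(self, product_id: str):
        # 1. Serve from the cache when possible.
        if product_id in self.cache:
            return self.cache[product_id]
        # 2. On a miss, read from the DB and populate the cache.
        self.db_reads += 1
        doc = self.db.get(product_id)
        if doc is not None:
            self.cache[product_id] = doc
        return doc

    def invalidate(self, product_id: str) -> None:
        # Called by the consumer after it writes a new version to the DB.
        self.cache.pop(product_id, None)
```

In a real deployment the cached entries would typically also carry an expiry time, so that a missed invalidation cannot serve stale product data indefinitely.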
