FIRST
- user uploads a file to the Rails (RoR) app -> file is uploaded to S3 -> entity is created in DB
 - push a job to Sidekiq to chunk the entity
 
SECOND
- process the chunk generation job asynchronously in Lambda
- call lambda/chunk/generate?url=
- parse the content at the URL (HTML, MD, PDF, docs, text, XML sitemap), picking the parser by MIME type from the Content-Type header
 - text extraction
 - NLTK-based chunking, limit of 256 tokens per chunk
 - return chunks
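
A minimal sketch of the parser selection and chunking steps above, not the actual Lambda code. Assumptions: the `PARSERS` table, the parser names, and the helpers `pick_parser`, `split_sentences`, `count_tokens`, and `chunk_text` are all hypothetical; "docs" is guessed to mean .docx; the regex splitter is a stdlib stand-in for NLTK's `nltk.sent_tokenize`; and tokens are counted as whitespace-separated words, which may differ from the real tokenizer.

```python
import re

MAX_TOKENS = 256  # chunk limit taken from the notes

# Hypothetical MIME type -> parser name table (names are illustrative only).
PARSERS = {
    "text/html": "html",
    "text/markdown": "md",
    "application/pdf": "pdf",
    # Assumption: "docs" in the notes means .docx files.
    "application/vnd.openxmlformats-officedocument.wordprocessingml.document": "docs",
    "text/plain": "text",
    "application/xml": "xml_sitemap",
    "text/xml": "xml_sitemap",
}


def pick_parser(content_type):
    """Return the parser name for a Content-Type header value, or None."""
    # Drop parameters such as "; charset=utf-8" and normalize case.
    mime = content_type.split(";", 1)[0].strip().lower()
    return PARSERS.get(mime)


def split_sentences(text):
    """Stand-in sentence splitter; nltk.sent_tokenize could be passed instead."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def count_tokens(sentence):
    """Approximate token count as whitespace-separated words (assumption)."""
    return len(sentence.split())


def chunk_text(text, max_tokens=MAX_TOKENS, splitter=split_sentences):
    """Pack whole sentences into chunks of at most max_tokens tokens.

    A single sentence longer than max_tokens is kept as its own oversized chunk.
    """
    chunks, current, used = [], [], 0
    for sentence in splitter(text):
        n = count_tokens(sentence)
        # Flush the current chunk if this sentence would exceed the limit.
        if current and used + n > max_tokens:
            chunks.append(" ".join(current))
            current, used = [], 0
        current.append(sentence)
        used += n
    if current:
        chunks.append(" ".join(current))
    return chunks
```

Packing whole sentences keeps each chunk readable on its own; swapping in NLTK would only mean passing `splitter=nltk.sent_tokenize`.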
 
 - save all chunks to DB
 - push a job to Sidekiq to embed the chunks
 
THIRD
- process the embeddings generation job asynchronously in Lambda
- call lambda/embeddings/generate?content=
- generate embeddings using the Sentence Transformers model MiniLM-L6-v2
 - return embeddings
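
A rough sketch of the embedding step above, assuming the model in the notes is the `all-MiniLM-L6-v2` checkpoint from the `sentence-transformers` package (the notes only say "Mini-LM-6 V2"). `load_model` and `generate_embeddings` are hypothetical helper names, not the actual Lambda handler.

```python
def load_model(name="all-MiniLM-L6-v2"):
    """Load the encoder; the import is deferred so it only runs when needed.

    Assumption: the sentence-transformers package is installed in the Lambda image.
    """
    from sentence_transformers import SentenceTransformer
    return SentenceTransformer(name)


def generate_embeddings(chunks, model):
    """Encode chunk texts and return plain lists of floats (JSON-serializable)."""
    if not chunks:
        return []
    # encode() accepts a list of strings and returns one vector per string.
    vectors = model.encode(chunks)
    return [[float(x) for x in vector] for vector in vectors]
```

Usage would look like `generate_embeddings(chunk_texts, load_model())`, with the model loaded once per container and reused across calls. If the checkpoint is indeed `all-MiniLM-L6-v2`, each vector has 384 dimensions, which matters for the DB column that stores it.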
 
 - save embeddings to the chunk entity in DB and update its status
 