Chat-based AI apps often need to perform long-running operations (e.g. document generation, data analysis) that can’t block the user interaction. To keep the system responsive, these tasks must run in the background, outside the normal request–response cycle. A common pattern is to introduce an asynchronous task queue: the user’s request is immediately enqueued (e.g. returning “task accepted”) ([Background Tasks – FastAPI])([Using FastAPI with SocketIO to Display Real-Time Progress of Celery Tasks | by Fadi Shaar | Medium]), and a separate process (or cluster of workers) executes the job. This way, FastAPI can instantly handle further requests while heavy work proceeds independently, keeping the API responsive for other clients ([Using FastAPI with SocketIO to Display Real-Time Progress of Celery Tasks | by Fadi Shaar | Medium])([Background Tasks – FastAPI]).
For example, a contract-generation chatbot can immediately confirm receipt (“Document generation in progress”) and spin up a separate process for each contract. The user isn’t forced to wait and can ask new questions. At the same time, the system should inform them of progress—via notifications, a progress bar, or chat updates—to build trust and a sense of control. UX research shows that users tolerate long operations better when they run in the background with visible status updates: “When tasks run in the background, users are generally more tolerant of longer wait times” ([UX for Agents, Part 2: Ambient])([Easily Build a UI for Your AI Agent in Minutes (LangGraph + CopilotKit)⚡️ – DEV Community]).
Specialized queue-worker systems and message brokers power background tasking. Here are FastAPI-compatible options:
- Celery (Python). The most popular task-queue library. FastAPI acts as the producer, while one or more Celery workers consume and execute tasks. Celery uses a broker (Redis or RabbitMQ) to store jobs and can also keep results in a backend. It supports task groups, chains, and chords, which are useful for parallel workflows like “generate five contracts at once” ([Asynchronous Tasks with FastAPI and Celery | TestDriven.io])([Background Tasks – FastAPI]).
- Redis Queue (RQ). A simpler Redis-based queue. As with Celery, you define Python functions as tasks that run in the background. It is lighter but less feature-rich, which makes it ideal for smaller workloads.
- Arq (Python). A modern asyncio-based alternative to RQ/Celery, using Redis as both broker and result backend. Because it is built on Python's `asyncio`, tasks are non-blocking and can run concurrently without spawning new processes for short jobs. Arq is notably faster than RQ for quick operations and keeps everything in one process, making it a convenient choice for fully async systems ([arq v0.26.3 documentation])([Background Tasks – FastAPI]).
- Apache Kafka. A distributed event-streaming platform that can serve as your message channel: agents and services publish events (e.g. “generate_contract”) and consumers pick them up. Kafka excels at high-volume, multi-producer/multi-consumer scenarios and offers commit-and-retry semantics, but it requires running its own cluster.
- Temporal. A modern workflow engine for long-running business processes, with built-in queueing. You define workflows and activities in code and assign them to Task Queues. Temporal handles durability, retries, and state persistence between steps. It is heavyweight to deploy but invaluable for complex, auditable orchestration ([Learn Temporal]).
- Other options. Smaller libraries (Huey, Dramatiq), cloud queues (AWS SQS, Google Cloud Tasks), and brokers (RabbitMQ, NATS) also exist, but for Python/FastAPI the maturity of the ecosystem often leads teams to choose Celery, Arq, or RQ.
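All of these systems implement the same producer–consumer contract: the web layer hands work to a broker and an independent worker consumes it. A minimal in-process sketch of that contract, using Python's stdlib `queue` and `threading` as stand-ins for the broker and the worker process (all names here are illustrative, not any library's API):

```python
import queue
import threading

# In-memory stand-ins: queue.Queue plays the broker's role and a daemon
# thread plays the worker's. Real deployments swap these for Redis/RabbitMQ
# and a Celery/Arq/RQ worker process, but the contract is the same.
jobs: "queue.Queue[dict]" = queue.Queue()
results: dict[str, str] = {}

def worker() -> None:
    # Consume jobs until a None sentinel arrives, recording each result.
    while True:
        job = jobs.get()
        if job is None:
            break
        results[job["task_id"]] = f"done: {job['payload']}"
        jobs.task_done()

def enqueue(task_id: str, payload: str) -> dict:
    # The producer (the web layer) only enqueues and returns immediately.
    jobs.put({"task_id": task_id, "payload": payload})
    return {"task_id": task_id, "status": "queued"}

threading.Thread(target=worker, daemon=True).start()
ack = enqueue("t-1", "contract A")   # returns before the work is done
jobs.join()                          # wait here only so the demo can print
print(ack, results["t-1"])
```

The key property to notice is that `enqueue` returns before `worker` has touched the job; everything else in this section is about preserving that property across process boundaries.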
FastAPI offers two main background-task patterns:
- Built-in BackgroundTasks. A simple way to run lightweight jobs after returning a response (e.g. sending an email). Suitable only for short, non-CPU-bound work in the same process. For heavy or long-running jobs, the docs recommend “using a bigger tool like Celery” to avoid blocking the server's worker threads ([Background Tasks – FastAPI]).
- External Workers. A dedicated process or cluster (Celery/Redis or Arq) listens on your queue. FastAPI simply accepts requests and enqueues tasks: e.g. a POST `/generate-docs` endpoint sends a Celery job and returns HTTP 202 Accepted with `{"task_id": 123, "status": "queued"}`. The worker then processes the job independently, letting FastAPI remain fast for new requests ([Asynchronous Tasks with FastAPI and Celery | TestDriven.io])([Background Tasks – FastAPI]).
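The accept-and-enqueue contract of the external-worker pattern can be sketched without any framework. Here an in-memory dict stands in for the status store, a list stands in for the broker, and `handle_generate_docs` / `send_to_queue` are hypothetical stand-ins for a FastAPI route handler and `celery_app.send_task()`:

```python
import uuid

# Hypothetical in-memory stores; in production these would be Redis or the
# Celery result backend, and the broker's queue.
STATUS: dict[str, dict] = {}
QUEUE: list[dict] = []

def send_to_queue(task_id: str, payload: dict) -> None:
    # Stand-in for handing the job to the broker.
    QUEUE.append({"task_id": task_id, **payload})

def handle_generate_docs(payload: dict) -> tuple[int, dict]:
    # Validate, record "queued", enqueue, and answer 202 Accepted at once.
    task_id = str(uuid.uuid4())
    STATUS[task_id] = {"status": "queued"}
    send_to_queue(task_id, payload)
    return 202, {"task_id": task_id, "status": "queued"}

code, body = handle_generate_docs({"template": "contract", "count": 5})
print(code, body["status"])
```

The handler never touches the document-generation code; its only jobs are validation, bookkeeping, and the immediate 202.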
The end-to-end lifecycle of a queued task:
- Enqueueing: The user asks the agent to run a task (e.g. “Generate 5 contracts”). FastAPI validates the input, enqueues one or more tasks (`celery_app.send_task()` or Arq's `await pool.enqueue_job()`), and immediately returns a 202 with `{"task_id": …, "status": "queued"}`. The chatbot confirms acceptance.
- Progress Updates: Before completion, inform the user of progress. Common channels:
  - WebSockets: FastAPI can host a WebSocket endpoint; clients open a socket for their `task_id`, and FastAPI listens for Celery/Arq events to push updates in real time ([GitHub issue #1409 – Websocket broadcasting with Celery events]).
  - SSE or Polling: The frontend polls a GET `/task-status?task_id=123` endpoint or subscribes to SSE for status.
  - Socket.IO: With `fastapi-socketio`, you can broadcast Celery progress events to clients for a smooth UI ([Using FastAPI with SocketIO… | Medium]).
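The polling channel reduces to a shared status store that the worker writes and the endpoint reads. A sketch with an in-memory dict (function names are illustrative; with Celery you would consult `AsyncResult(task_id)` instead):

```python
# Simulated status endpoint: the worker writes progress into a shared store
# and GET /task-status just reads it back. The dict stands in for Redis or
# the Celery result backend.
STATUS: dict[str, dict] = {}

def worker_update(task_id: str, done: int, total: int) -> None:
    # Called by the worker after each unit of work.
    pct = int(100 * done / total)
    state = "finished" if done == total else "in_progress"
    STATUS[task_id] = {"status": state, "progress": pct}

def task_status(task_id: str) -> dict:
    # What GET /task-status?task_id=... would return as JSON.
    return STATUS.get(task_id, {"status": "unknown"})

worker_update("123", 1, 5)
print(task_status("123"))
worker_update("123", 5, 5)
print(task_status("123"))
```

WebSockets and SSE invert the direction (the server pushes each `worker_update` instead of waiting to be asked), but the store and the payload shape stay the same.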
- Chat Updates: In a chat UI, you can update the original bot message (“Contracts: 0%”) or post new messages (“2 of 5 done”). This shared-state pattern keeps the user informed in context, building trust by revealing each step ([Easily Build a UI… – DEV Community])([UX for Agents, Part 2: Ambient]).
- Completion: When all tasks finish, the worker sends results (e.g. `.docx` files or links) back to FastAPI or directly over WebSocket. FastAPI persists the outputs and notifies the frontend (e.g. posting “Contracts ready—download here”). A UI alert can draw the user's attention.
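Completion handling can be sketched as a small aggregator: each worker reports its finished document, and once all expected results are in, a notifier posts the chat message. `notify`, `Batch`, and the file paths below are assumptions for illustration, not an API from any of the cited libraries:

```python
# Completion sketch: collect per-document results and fire one notification
# when the batch is complete. `notify` stands in for a WebSocket push or a
# chat-API call.
messages: list[str] = []

def notify(text: str) -> None:
    messages.append(text)

class Batch:
    def __init__(self, expected: int) -> None:
        self.expected = expected
        self.results: list[str] = []

    def report(self, link: str) -> None:
        # Called by a worker when one document is done; the last report
        # triggers the single user-facing completion message.
        self.results.append(link)
        if len(self.results) == self.expected:
            notify("Contracts ready: " + ", ".join(self.results))

batch = Batch(expected=2)
batch.report("/files/contract-1.docx")
batch.report("/files/contract-2.docx")
print(messages[0])
```

With Celery, a chord (group plus callback) gives you this aggregation out of the box; the sketch just makes the bookkeeping visible.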
Key architectural practices:
- Producer–Consumer Architecture: Decouple the web layer (FastAPI) from the workers. Use a broker (Redis/RabbitMQ/Kafka) so tasks persist until a worker runs them; Celery clusters can scale workers dynamically.
- Idempotency & Retry Safety: Tasks should be safe to run multiple times (especially with Arq's pessimistic execution or Celery retries). Write tasks so that repeated runs do not corrupt data ([arq docs]).
- Monitoring & Management: Use tools like Celery Flower or Arq's CLI to inspect queue lengths, task statuses, and worker health.
- Routing & Priorities: Define multiple queues and priority routes so urgent tasks are not blocked behind long jobs. Temporal similarly uses distinct Task Queue names per workflow.
- Security & Isolation: Run tasks in isolated environments (containers or separate workers) when calling external services, so a crashed job does not destabilize your API. Queued tasks also buffer against external outages.
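A common idempotency guard is claim-before-work: the task claims a stable business id before producing side effects, so a queue retry becomes a no-op. A sketch with a plain set standing in for Redis `SETNX` (in production the check-and-claim must be a single atomic operation, which `SETNX` provides):

```python
# Idempotency sketch: claim the business id first, do side effects second.
# The set mimics Redis SETNX; key by a stable id (the contract id), never
# by anything that changes between retries.
claimed: set[str] = set()
side_effects: list[str] = []

def generate_contract(contract_id: str) -> str:
    # First run claims the id and does the work; retries become no-ops.
    if contract_id in claimed:
        return "skipped"
    claimed.add(contract_id)
    side_effects.append(f"wrote {contract_id}.docx")
    return "done"

print(generate_contract("c-7"))   # does the work
print(generate_contract("c-7"))   # a retry does not write the file twice
```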
For chat agents, it is crucial that users see the system working rather than feeling stalled. UX guidelines:
- Immediate Confirmation: As soon as a task is enqueued, the bot should say, “I’ve started generating 5 contracts. I’ll let you know when they’re ready—feel free to ask anything in the meantime.”
- Progress Feedback: Display a progress bar or periodic messages (“Contracts: 50% complete”). Even without final output, send mid-process updates (“Now doing step X, 30% done”) to build trust and a sense of control ([UX for Agents, Part 2: Ambient])([Easily Build a UI… – DEV Community]).
- Non-blocking Chat: Don’t disable the input field while background work runs. Users should be able to keep asking questions; the bot weaves final results into the ongoing conversation when tasks finish.
- UI Enhancements: Use chat threads or “cards” for each task, as in Slack’s threaded messages. In a React frontend, a progress-bar component or log window keeps users informed.
- Human-in-the-Loop Controls: For advanced “ambient agent” scenarios, allow users to interrupt or modify running tasks (“Pause and update the contract’s payment terms”), giving them real-time oversight ([UX for Agents, Part 2: Ambient]).
A typical flow for “generate 5 contracts”:
- Request: User: “Generate 5 contracts.” Bot/FastAPI replies, “Generating 5 contracts now—will notify you when done.” Enqueue 5 tasks with unique IDs.
- Acknowledgement: API returns HTTP 202; chat shows “Started contract generation.”
- Background Execution: Workers pick up tasks, generate each document, and emit progress events via Redis Pub/Sub or Celery events.
- Progress Notifications: FastAPI listens and pushes WebSocket updates (“Contract 1: 50%”). Chat updates reflect “2 of 5 in progress,” while user can keep chatting.
- Completion: After all 5 finish, system posts “Contracts ready” with download links and a summary (“5 documents in 2 minutes”).
- Continued Dialogue: User can ask for edits to contract 3 in the same thread, triggering a new background job.
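Under stated simplifications (a thread pool standing in for Celery workers, strings standing in for chat pushes), the fan-out and aggregate progress of this flow can be sketched as:

```python
import concurrent.futures

# End-to-end sketch of "generate 5 contracts": five jobs fan out to a
# worker pool, and a counter over completed futures yields the "n of 5
# done" chat updates. Names and messages are illustrative.
updates: list[str] = []
TOTAL = 5

def generate(i: int) -> str:
    # Stand-in for the real document-generation task.
    return f"contract-{i}.docx"

with concurrent.futures.ThreadPoolExecutor(max_workers=2) as pool:
    futures = [pool.submit(generate, i) for i in range(1, TOTAL + 1)]
    done = 0
    for fut in concurrent.futures.as_completed(futures):
        done += 1
        updates.append(f"{done} of {TOTAL} done ({fut.result()})")

updates.append("Contracts ready")
print(updates)
```

In the distributed version, the counter lives in Redis (or comes from Celery's group/chord events) rather than a local variable, but the update stream the user sees is the same.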
To implement background queues in a chat-based AI app, separate long-running work from your HTTP layer by using a mature queue system (Celery, Arq, Kafka, Temporal). FastAPI serves as a lightweight producer, enqueuing tasks and returning immediate responses, while workers run jobs asynchronously. For good UX, provide instant confirmations and regular status updates via WebSocket/SSE/polling so the chat remains responsive and users feel in control ([Background Tasks – FastAPI])([Using FastAPI with SocketIO… | Medium])([Easily Build a UI… – DEV Community]). Mature AI agent systems (e.g. LangChain-based) already use these patterns to manage tool invocations and show “actions in progress” with real-time feedback ([UX for Agents, Part 2: Ambient])([Easily Build a UI… – DEV Community]).
Sources:
- FastAPI Background Tasks – FastAPI docs.
- Fadi Shaar, Using FastAPI with SocketIO to Display Real-Time Progress of Celery Tasks – Medium.
- arq v0.26.3 documentation.
- UX for Agents, Part 2: Ambient – LangChain blog.
- Easily Build a UI for Your AI Agent in Minutes (LangGraph + CopilotKit) – DEV Community.