First, we need to create our project structure. Start by creating a backend and a frontend folder next to a shared .env file:
├── .env
├── backend/
└── frontend/
# backend/main.py
import uvicorn
from os import getenv, path
from dotenv import load_dotenv
from fastapi import BackgroundTasks, FastAPI, Request, Response

app_base = path.dirname(__file__)
app_root = path.join(app_base, '../')

# Load the shared .env file from the project root
load_dotenv(dotenv_path=path.join(app_root, '.env'))

app_host = getenv("APP_HTTP_HOST")
app_port = int(getenv("APP_HTTP_PORT"))

app = FastAPI()


# The frontend calls this endpoint as /api/reply
@app.get("/api/reply")
def reply(value: str):
    print(f"reply: {value}")
    return {"reply": value}


if __name__ == "__main__":
    uvicorn.run("main:app", host=app_host, reload=True, port=app_port)
yarn create vite --template react-ts
After creating the backend and frontend projects, we need to combine them. The easiest way is to serve our frontend app as static files from our backend. We should also set up a SPA proxy for development purposes.
1. Setting up environment variables:
Inside the .env file we can place common environment variables for both apps.
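As a rough sketch of how the backend might read these shared variables defensively, the helper below collects the variable names that backend/main.py uses in this article and reports every missing one at once. The function name `load_settings` and the sample values are illustrative assumptions, not part of the original project.

```python
from os import environ
from typing import Mapping

# Variable names read by backend/main.py in this article
REQUIRED_VARS = (
    "APP_ENVIRONMENT",
    "APP_HTTP_HOST",
    "APP_HTTP_PORT",
    "APP_SPA_FOLDER_ROOT",
    "APP_SPA_PROXY_URL",
    "APP_SPA_PROXY_LAUNCH_CMD",
)


def load_settings(env: Mapping[str, str] = environ) -> dict:
    """Hypothetical helper: read the shared settings or fail with a clear message."""
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    settings = {name: env[name] for name in REQUIRED_VARS}
    # The port is the only value that needs a type conversion
    settings["APP_HTTP_PORT"] = int(settings["APP_HTTP_PORT"])
    return settings


if __name__ == "__main__":
    # Sample values only (assumptions); adjust them to your own setup
    sample = {
        "APP_ENVIRONMENT": "development",
        "APP_HTTP_HOST": "127.0.0.1",
        "APP_HTTP_PORT": "8000",
        "APP_SPA_FOLDER_ROOT": "frontend",
        "APP_SPA_PROXY_URL": "http://localhost:5173",
        "APP_SPA_PROXY_LAUNCH_CMD": "yarn dev",
    }
    print(load_settings(sample))
```

Failing early like this avoids the less obvious error that `int(getenv("APP_HTTP_PORT"))` raises when the variable is not set.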
2. Configuring the Vite request proxy:
We can now tell Vite to proxy all fetch requests to our API, so that we can make API calls without specifying the host server.
Our frontend app will then behave as if it were hosted by the backend app, even in the development environment.
Inside frontend/vite.config.ts add the following:
import { env } from "node:process";
import { defineConfig } from "vite";
import react from "@vitejs/plugin-react";

// Backend URL, read from the APP_HTTP_URL environment variable
const apiProxyTarget = env.APP_HTTP_URL;

export default defineConfig({
  define: {
    "process.env": process.env,
    _WORKLET: false,
    __DEV__: env.DEV,
    global: {},
  },
  plugins: [react()],
  server: {
    strictPort: true,
    proxy: {
      // Forward every request that starts with /api to the backend
      "/api": {
        target: apiProxyTarget,
        changeOrigin: true,
        secure: false,
        rewrite: (path) => path.replace(/^\/api/, "/api"),
      },
    },
  },
});
We can test API calls by adding a fetch request inside our App.tsx:
import "./App.css";
import { useEffect, useState } from "react";

function App() {
  const [apiResponse, setApiResponse] = useState("");

  useEffect(() => {
    // Relative URL: the Vite proxy forwards it to the backend
    fetch("/api/reply?value=Hello from React App!")
      .then((response) => response.json())
      .then((result) => setApiResponse(JSON.stringify(result)));
  }, []);

  // Minimal placeholder markup that displays the raw response
  return (
    <div>{apiResponse}</div>
  );
}

export default App;
3. Serving the SPA app from FastAPI:
In this step we need to set up our backend to serve the React app as static files in production and to redirect to the SPA dev server in development.
Let's update backend/main.py:
import subprocess
import uvicorn
from os import getenv, path
from dotenv import load_dotenv
from fastapi import FastAPI, Request
from fastapi.responses import RedirectResponse
from fastapi.templating import Jinja2Templates
from fastapi.staticfiles import StaticFiles

app_base = path.dirname(__file__)
app_root = path.join(app_base, '../')
app_public = path.join(app_base, "public/")

load_dotenv(dotenv_path=path.join(app_root, '.env'))

app_env = getenv("APP_ENVIRONMENT")
app_host = getenv("APP_HTTP_HOST")
app_port = int(getenv("APP_HTTP_PORT"))
app_spa_folder = path.join(app_root, getenv("APP_SPA_FOLDER_ROOT"))
app_spa_proxy_url = getenv("APP_SPA_PROXY_URL")
app_spa_proxy_launch_cmd = getenv("APP_SPA_PROXY_LAUNCH_CMD")

app = FastAPI()
templates = Jinja2Templates(directory=app_public)
app.mount("/public", StaticFiles(directory=app_public), name="public")


@app.get("/api/reply")
def reply(value: str):
    print(f"reply: {value}")
    return {"reply": value}


# Catch-all route: it must be registered after the API routes
@app.get("/{full_path:path}")
async def serve_spa_app(request: Request, full_path: str):
    """Serve the react app

    `full_path` variable is necessary to serve each possible endpoint with
    `index.html` file in order to be compatible with `react-router-dom`
    """
    if app_env.lower() == "development":
        return RedirectResponse(app_spa_proxy_url)

    return templates.TemplateResponse("index.html", {"request": request})


if __name__ == "__main__":
    # Launching the SPA proxy server
    if app_env.lower() == "development":
        print("Launching the SPA proxy server...", app_spa_folder)
        spa_process = subprocess.Popen(
            args=app_spa_proxy_launch_cmd.split(" "),
            # Assumption: the launch command runs from the SPA folder
            cwd=app_spa_folder,
        )

    uvicorn.run("main:app", host=app_host, reload=True, port=app_port)
That's it. Now, when we run our backend app with python main.py, it will also launch our frontend app in development mode.
It will also fall back to our SPA app for all non-API routes.
Note: In production, we need to publish our frontend's files to the public folder inside our backend. You can set this up as part of your build, for example during some CI/CD step.
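One way to do that publishing step is a small script that copies the Vite build output into the backend. This is only a sketch: it assumes the frontend was built with `yarn build` into Vite's default `dist` folder, and `publish_frontend` is a hypothetical helper name.

```python
import shutil
from pathlib import Path


def publish_frontend(dist_dir: Path, public_dir: Path) -> int:
    """Copy the built frontend files into the backend's public folder.

    Returns the number of files present in `public_dir` afterwards.
    """
    if not dist_dir.is_dir():
        raise FileNotFoundError(f"Build output not found: {dist_dir}")
    # Start from a clean folder so stale hashed assets do not pile up
    if public_dir.exists():
        shutil.rmtree(public_dir)
    shutil.copytree(dist_dir, public_dir)
    return sum(1 for item in public_dir.rglob("*") if item.is_file())


if __name__ == "__main__":
    # Assumed layout: run from the project root that holds frontend/ and backend/
    root = Path.cwd()
    dist = root / "frontend" / "dist"
    if dist.is_dir():
        count = publish_frontend(dist, root / "backend" / "public")
        print(f"Published {count} files")
    else:
        print("No build output found; run `yarn build` inside frontend/ first")
```

Keep in mind that the asset URLs inside the built index.html have to match the /public mount of the backend; with Vite this is usually controlled through the `base` option.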
Now let's add an AI feature to our backend/main.py:
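# Other imports...
from pydantic import BaseModel
from langchain.llms.openai import OpenAI
from fastapi.responses import StreamingResponse

app = FastAPI()

# Constructor arguments are omitted here; by default the API key is
# read from the OPENAI_API_KEY environment variable
llm = OpenAI()


class Question(BaseModel):
    prompt: str


# The frontend posts the prompt to /api/ask
@app.post("/api/ask")
async def ask(question: Question):
    def generator(prompt: str):
        # Yield each chunk as soon as the model produces it
        for item in llm.stream(prompt):
            yield item

    return StreamingResponse(
        generator(question.prompt), media_type='text/event-stream')

# More code...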
That's it, we have an AI asking endpoint with streaming support.
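To try the streaming path without calling OpenAI, a stand-in generator can replace `llm.stream` during local testing. This is a minimal sketch and not part of the original app; `fake_stream` and its word-by-word chunking are assumptions made for illustration.

```python
import time
from typing import Iterator


def fake_stream(prompt: str, delay: float = 0.0) -> Iterator[str]:
    """Hypothetical stand-in for `llm.stream`: yields the answer word by word."""
    answer = f"You asked: {prompt}"
    words = answer.split(" ")
    for index, word in enumerate(words):
        if delay:
            time.sleep(delay)  # simulate model latency between chunks
        # Re-attach the space that split() removed, except after the last word
        yield word if index == len(words) - 1 else word + " "


if __name__ == "__main__":
    # Chunks are consumed one at a time, the same way a streaming
    # response iterates over a generator
    for chunk in fake_stream("What is FastAPI?"):
        print(chunk, end="", flush=True)
    print()
```

Inside the endpoint, `generator(question.prompt)` could be swapped for `fake_stream(question.prompt)` while the frontend is being developed.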
Now let's integrate it with our frontend app.
First, add the ai package:
yarn add ai
Then update App.tsx:
import "./App.css";
import { useCompletion } from "ai/react";

function App() {
  // Some code here ...
  const { input, completion, handleInputChange, handleSubmit } = useCompletion({
    api: "/api/ask",
    headers: {
      "Content-Type": "application/json",
    },
  });

  return (
    <form onSubmit={handleSubmit}>
      <label htmlFor="ask-input">Ask something:</label>
      <input id="ask-input" type="text" value={input} onChange={handleInputChange} />
      <button type="submit">POST</button>
      <textarea value={completion} rows={20} readOnly></textarea>
    </form>
  );
}

export default App;
It should work, but I would probably do something like this:
Perform a GET request that searches the sources, then respond with the results while storing them in a cache data source such as Redis.
At this point I would create a unique identifier that correlates the user with the sources and generate a pre-assigned URL from it.
In the frontend I show the resulting documents and, right after that, perform an API call to the pre-assigned URL. This URL points to my completion endpoint, which uses the pre-assigned values to get the sources back from the cache.
Using this approach I avoid invoking my completion endpoint when no results are found, and the client gets the documents earlier, since the AI completion can take a long time to finish.
If you're not doing any extra processing on your sources, like highlighting the most relevant phrases for the question, you can store just the source ID in your cache. But if you need to keep the whole result, you should probably optimize it before storing.
You could also relate common questions to the same results to avoid duplication.
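The flow described above can be sketched roughly as follows. Everything here is an assumption made for illustration: a plain dictionary stands in for Redis, the two documents are fake data, and the names `search_sources`, `load_completion_context` and the `/api/completion/...` path are hypothetical.

```python
import uuid
from typing import Dict, Optional

# In-memory stand-in for Redis; a real deployment would use a shared cache
_cache: Dict[str, dict] = {}

# Tiny fake corpus so the example runs on its own
_DOCUMENTS = {
    "doc-1": "FastAPI serves the React build as static files",
    "doc-2": "Vite proxies API calls during development",
}


def search_sources(user_id: str, question: str) -> dict:
    """Step 1: search, cache the matching source ids, return them with a completion URL."""
    terms = question.lower().split()
    source_ids = [
        doc_id for doc_id, text in _DOCUMENTS.items()
        if any(term in text.lower() for term in terms)
    ]
    if not source_ids:
        # No results: no token is issued, so the completion endpoint is never called
        return {"sources": [], "completion_url": None}
    token = uuid.uuid4().hex
    # Only the source ids are cached, not the full documents
    _cache[token] = {"user_id": user_id, "question": question, "source_ids": source_ids}
    return {"sources": source_ids, "completion_url": f"/api/completion/{token}"}


def load_completion_context(token: str, user_id: str) -> Optional[dict]:
    """Step 2: the completion endpoint resolves the token back to the cached sources."""
    entry = _cache.get(token)
    if entry is None or entry["user_id"] != user_id:
        return None
    return {
        "question": entry["question"],
        "sources": [_DOCUMENTS[doc_id] for doc_id in entry["source_ids"]],
    }
```

The client would first call the search step, render the returned sources, and only then request the completion URL, so the slow AI call never blocks the document list.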