AI models with FastAPI Streaming and Vercel AI SDK

Getting started

First of all, we need to create our project structure. Start by creating the backend and frontend folders:

├── .env
├── backend/
└── frontend/
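
If the backend dependencies aren't installed yet, a minimal bootstrap could look like this (a sketch assuming pip and a virtual environment; the packages match the imports used below, and langchain/openai are only needed for the streaming section later):

mkdir backend frontend
python -m venv .venv && source .venv/bin/activate
pip install fastapi "uvicorn[standard]" python-dotenv
pip install langchain openai   # only needed for the AI streaming section later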

Setting up our API:

# backend/main.py

import uvicorn

from os import getenv, path
from dotenv import load_dotenv

from fastapi import BackgroundTasks, FastAPI, Request, Response

app_base = path.dirname(__file__)
app_root = path.join(app_base, '../')

load_dotenv(dotenv_path=path.join(app_root, '.env'))

app_host = getenv("APP_HTTP_HOST")
app_port = int(getenv("APP_HTTP_PORT"))

app = FastAPI()

@app.get("/api/reply")
def reply(value: str):
    print(f"reply: {value}")
    return {"reply": value}

if __name__ == "__main__":
    uvicorn.run("main:app", host=app_host, reload=True, port=app_port)
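
With the .env values from the next section in place, a quick sanity check could look like this (assuming the default APP_HTTP_PORT of 5000):

# from inside the backend/ folder (host/port are read from the root .env)
python main.py

# in another terminal
curl "http://127.0.0.1:5000/api/reply?value=hello"
# -> {"reply":"hello"}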

Create a new Vite app inside the frontend folder:

yarn create vite --template react-ts

Setting up SPA Integration:

After creating the backend and frontend projects we need to combine them. The easiest way is to serve our frontend app as static files from our backend. We should also set up a SPA proxy for development purposes.

1. Setting Up environment variables:

Inside the .env file we can place common environment variables for both apps.

APP_ENVIRONMENT='Development'
APP_HTTP_HOST='127.0.0.1'
APP_HTTP_PORT='5000'
APP_HTTP_URL='http://${APP_HTTP_HOST}:${APP_HTTP_PORT}'

APP_SPA_PROXY_PORT='3000'
APP_SPA_PROXY_URL='http://${APP_HTTP_HOST}:${APP_SPA_PROXY_PORT}'
APP_SPA_FOLDER_ROOT='frontend'
APP_SPA_PROXY_LAUNCH_CMD='yarn dev --port ${APP_SPA_PROXY_PORT}'

2. Configuring the Vite request proxy:

We can now tell Vite to proxy all fetch requests to our API endpoint, so that we can make API calls without specifying the host server. This way our frontend app behaves as if it were hosted by the backend app, even in the development environment.

Inside the frontend/vite.config.ts add the following:

import { env } from "node:process";
import { defineConfig } from "vite";
import react from "@vitejs/plugin-react";

const apiProxyTarget = env.APP_HTTP_URL;

export default defineConfig({
  define: {
    "process.env": process.env,
    _WORKLET: false,
    __DEV__: env.DEV,
    global: {},
  },
  plugins: [react()],
  server: {
    strictPort: true,
    proxy: {
      "/api": {
        target: apiProxyTarget,
        changeOrigin: true,
        secure: false,
        rewrite: (path) => path.replace(/^\/api/, "/api"),
      },
    },
  },
});

We can test API calls by adding a fetch request inside our App.tsx:

import "./App.css";
import { useEffect, useState } from "react";

function App() {
  const [apiResponse, setApiResponse] = useState("");

  useEffect(() => {
    fetch("/api/reply?value=Hello from React App!")
      .then((response) => response.json())
      .then((result) => setApiResponse(JSON.stringify(result)));
  }, []);

  return (
    <div>
      <code>{apiResponse}</code>
    </div>
  );
}

export default App;

3. Serving the SPA app from FastAPI:

In this step we need to set up our backend to serve the React app as static files in production and to proxy to the dev server during development.

Let's update the backend/main.py

import subprocess
import uvicorn

from os import getenv, path
from dotenv import load_dotenv

from fastapi import FastAPI, Request
from fastapi.responses import RedirectResponse
from fastapi.templating import Jinja2Templates
from fastapi.staticfiles import StaticFiles

app_base = path.dirname(__file__)
app_root = path.join(app_base, '../')
app_public = path.join(app_base, "public/")

load_dotenv(dotenv_path=path.join(app_root, '.env'))

app_env = getenv("APP_ENVIRONMENT")
app_host = getenv("APP_HTTP_HOST")
app_port = int(getenv("APP_HTTP_PORT"))
app_spa_folder = path.join(app_root, getenv("APP_SPA_FOLDER_ROOT"))
app_spa_proxy_url = getenv("APP_SPA_PROXY_URL")
app_spa_proxy_launch_cmd = getenv("APP_SPA_PROXY_LAUNCH_CMD")


app = FastAPI()
templates = Jinja2Templates(directory=app_public)
app.mount("/public", StaticFiles(directory=app_public), name="public")


@app.get("/api/reply")
def reply(value: str):
    print(f"reply: {value}")
    return {"reply": value}


@app.get("/{full_path:path}")
async def serve_spa_app(request: Request, full_path: str):
    """Serve the react app
    `full_path` variable is necessary to serve each possible endpoint with
    `index.html` file in order to be compatible with `react-router-dom`
    """
    if app_env.lower() == "development":
        return RedirectResponse(app_spa_proxy_url)

    return templates.TemplateResponse("index.html", {"request": request})


if __name__ == "__main__":

    # Launching the SPA proxy server
    if app_env.lower() == "development":
        print("Launching the SPA proxy server...", app_spa_folder)
        spa_process = subprocess.Popen(
            args=app_spa_proxy_launch_cmd.split(" "),
            cwd=app_spa_folder)

    uvicorn.run("main:app", host=app_host, reload=True, port=app_port)

That's it! Now when we run our backend app with python main.py it will also launch our frontend app in development mode.

It will also fall back to our SPA app for all non-API routes.

Note: In production, we need to publish our frontend's dist files to the public folder inside our backend.

You can set this up inside your Dockerfile as part of a CI/CD routine, as sketched below.
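
A minimal sketch of that publish step (assuming the default Vite dist output and the backend/public folder used by main.py; adapt the paths to your own Dockerfile or pipeline):

# build the frontend and copy the bundle into the backend's public folder
cd frontend
yarn install && yarn build
mkdir -p ../backend/public
cp -r dist/* ../backend/public/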

Streaming AI results from FastAPI:

Now let's add an AI feature to our backend/main.py:

# Other imports...

from pydantic import BaseModel
from langchain.llms.openai import OpenAI
from fastapi.responses import StreamingResponse

app = FastAPI()

llm = OpenAI(
    streaming=True,
    verbose=True,
    temperature=0,
    openai_api_key=getenv("OPENAI_API_KEY")
)

class Question(BaseModel):
    prompt: str

@app.post('/api/ask')
async def ask(question: Question):
    print(question)

    def generator(prompt: str):
        for item in llm.stream(prompt):
            yield item

    return StreamingResponse(
        generator(question.prompt), media_type='text/event-stream')

# More code...

That's it, we now have an AI asking endpoint with streaming support.
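
You can try it out before wiring up the frontend, for example with curl (this assumes the backend is running on port 5000 and that OPENAI_API_KEY is set in the root .env; -N disables curl's buffering so the tokens print as they stream):

curl -N -X POST "http://127.0.0.1:5000/api/ask" \
  -H "Content-Type: application/json" \
  -d '{"prompt": "Tell me a short joke"}'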

Now let's integrate it with our frontend app.

First, add the ai package:

yarn add ai

Then update the App.tsx:

import "./App.css";
import { useCompletion } from "ai/react";

function App() {

  // Some code here ...

  const { input, completion, handleInputChange, handleSubmit } = useCompletion({
    api: "/api/ask",
    headers: {
      "Content-Type": "application/json",
    },
  });

  return (
    <div>
      <form onSubmit={handleSubmit}>
        <label htmlFor="ask-input">Ask something:</label>
        <input id="ask-input" type="text" value={input} onChange={handleInputChange} />

        <button type="submit">POST</button>
      </form>

      <textarea value={completion} rows={20}></textarea>
    </div>
  );
}

export default App;
import "./App.css";
import { useEffect, useState } from "react";
import { useCompletion } from "ai/react";
function App() {
const [apiResponse, setApiResponse] = useState("");
useEffect(() => {
fetch("/api/reply?value=Hello from React App!")
.then((response) => response.json())
.then((result) => setApiResponse(JSON.stringify(result)));
}, []);
const { input, completion, handleInputChange, handleSubmit } = useCompletion({
api: "/api/ask",
headers: {
"Content-Type": "application/json",
},
});
return (
<div>
<code>{apiResponse}</code>
<form onSubmit={handleSubmit}>
<label htmlFor="ask-input"></label>
<input
id="ask-input"
type="text"
value={input}
onChange={handleInputChange}
/>
<button type="submit">POST</button>
</form>
<textarea value={completion} rows={20}></textarea>
</div>
);
}
export default App;
APP_ENVIRONMENT='Development'
APP_HTTP_HOST='127.0.0.1'
APP_HTTP_PORT='5000'
APP_HTTP_URL='http://${APP_HTTP_HOST}:${APP_HTTP_PORT}'
APP_SPA_PROXY_PORT='3000'
APP_SPA_PROXY_URL='http://${APP_HTTP_HOST}:${APP_SPA_PROXY_PORT}'
APP_SPA_FOLDER_ROOT='frontend'
APP_SPA_PROXY_LAUNCH_CMD='yarn dev --port ${APP_SPA_PROXY_PORT}'
OPENAI_API_KEY=""
import subprocess
from pydantic import BaseModel
import uvicorn
from os import getenv, path
from dotenv import load_dotenv
from fastapi import FastAPI, Request
from fastapi.responses import RedirectResponse, StreamingResponse
from fastapi.templating import Jinja2Templates
from fastapi.staticfiles import StaticFiles
from langchain.llms.openai import OpenAI
app_base = path.dirname(__file__)
app_root = path.join(app_base, '../')
app_public = path.join(app_base, "public/")
load_dotenv(dotenv_path=path.join(app_root, '.env'))
app_env = getenv("APP_ENVIRONMENT")
app_host = getenv("APP_HTTP_HOST")
app_port = int(getenv("APP_HTTP_PORT"))
app_spa_folder = path.join(app_root, getenv("APP_SPA_FOLDER_ROOT"))
app_spa_proxy_url = getenv("APP_SPA_PROXY_URL")
app_spa_proxy_launch_cmd = getenv("APP_SPA_PROXY_LAUNCH_CMD")
class Question(BaseModel):
prompt: str
app = FastAPI()
templates = Jinja2Templates(directory=app_public)
app.mount("/public", StaticFiles(directory=app_public), name="public")
llm = OpenAI(
streaming=True,
verbose=True,
temperature=0,
openai_api_key=getenv("OPENAI_API_KEY")
)
@app.post('/api/ask')
async def ask(question: Question):
print(question)
def generator(prompt: str):
for item in llm.stream(prompt):
yield item
return StreamingResponse(
generator(question.prompt), media_type='text/event-stream')
@app.get("/api/reply")
def reply(value: str):
print(f"reply: {value}")
return {"reply": value}
@app.get("/{full_path:path}")
async def serve_spa_app(request: Request, full_path: str):
"""Serve the react app
`full_path` variable is necessary to serve each possible endpoint with
`index.html` file in order to be compatible with `react-router-dom
"""
if app_env.lower() == "development":
return RedirectResponse(app_spa_proxy_url)
return templates.TemplateResponse("index.html", {"request": request})
if __name__ == "__main__":
# Launching the SPA proxy server
if app_env.lower() == "development":
print("Launching the SPA proxy server...", app_spa_folder)
spa_process = subprocess.Popen(
args=app_spa_proxy_launch_cmd.split(" "),
cwd=app_spa_folder)
uvicorn.run("main:app", host=app_host, reload=True, port=app_port)