This is an example deployment that integrates aiohttp with Ray Serve Handle.
Steps:
ray start --head
- This starts a Ray cluster in the background.
python deploy_serve.py
- This deploys two endpoints to Ray Serve: model_a and model_b
gunicorn aiohttp_app:app --worker-class aiohttp.GunicornWebWorker --workers 2 --bind localhost:8001
- This uses gunicorn to start two workers of the aiohttp app.
- We bind to port 8001 because the Ray Dashboard uses 8000. You can change the Ray Dashboard port with
ray start --dashboard-port XXXX
curl localhost:8001/single
- This should return
[model_a dummy input]
- This shows that you can call the aiohttp endpoint, which calls model_a.
curl localhost:8001/chain
- This should return
[model_b [model_a dummy input]]
- This shows that you can call another aiohttp endpoint that chains two Serve endpoints together.
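The two responses above can be reproduced with a minimal sketch that uses plain asyncio and no Ray or aiohttp at all. The coroutines model_a, model_b, single and chain below are hypothetical stand-ins, not the contents of deploy_serve.py or aiohttp_app.py; the sketch only assumes that each model wraps its input in a tag, as the outputs above suggest.

```python
import asyncio


async def model_a(data: str) -> str:
    # Stand-in for the model_a endpoint (assumption): tag the input.
    return f"[model_a {data}]"


async def model_b(data: str) -> str:
    # Stand-in for the model_b endpoint (assumption): tag the input.
    return f"[model_b {data}]"


async def single(data: str = "dummy input") -> str:
    # Mirrors /single: one call to model_a.
    return await model_a(data)


async def chain(data: str = "dummy input") -> str:
    # Mirrors /chain: the output of model_a becomes the input of model_b.
    intermediate = await model_a(data)
    return await model_b(intermediate)


if __name__ == "__main__":
    print(asyncio.run(single()))
    print(asyncio.run(chain()))
```

In the real app the awaited calls would go through Ray Serve handles instead of local coroutines, but the order of the calls is the same.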
Cleaning up:
ray stop
- This cleans up the background cluster.