@simon-mo
Last active October 26, 2020 21:03
Gunicorn, Aiohttp, and Ray Serve

This is an example deployment that integrates aiohttp with Ray Serve handles.

Steps:

  1. ray start --head
  • This starts a Ray cluster in the background.
  2. python deploy_serve.py
  • This deploys two endpoints to Ray Serve: model_a and model_b.
  3. gunicorn aiohttp_app:app --worker-class aiohttp.GunicornWebWorker --workers 2 --bind localhost:8001
  • This uses gunicorn to start two workers of the aiohttp app.
  • We bind to port 8001 because the Ray Dashboard uses 8000. You can change the Ray Dashboard port with ray start --dashboard-port XXXX.
  4. curl localhost:8001/single
  • This should return [model_a dummy input].
  • It shows that you can call an aiohttp endpoint, which in turn calls model_a.
  5. curl localhost:8001/chain
  • This should return [model_b [model_a dummy input]].
  • It shows that you can call another aiohttp endpoint that chains two Serve endpoints together.
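The bracketed responses in steps 4 and 5 come from each backend wrapping its input in its own name. As a quick sanity check, the same string composition can be reproduced in plain Python, with stand-in functions that mirror the backends' formatting (no cluster required):

```python
# Plain-Python stand-ins that mirror the formatting the Serve backends apply.
def model_a(data):
    return f"[model_a {data}]"

def model_b(data):
    return f"[model_b {data}]"

print(model_a("dummy input"))           # expected /single response
print(model_b(model_a("dummy input")))  # expected /chain response
```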

Cleaning up:

  • You can call ray stop to clean up the background cluster.
aiohttp_app.py

```python
from aiohttp import web

import ray
from ray import serve

# Connect to the background Ray cluster.
ray.init(address="auto")
client = serve.connect()

model_a = client.get_handle("model_a")  # Returns a ServeHandle object.
model_b = client.get_handle("model_b")


async def single_model(request):
    result = await model_a.remote("dummy input")
    return web.Response(text=result)


async def chain_models(request):
    # ServeHandle.remote immediately returns an ObjectRef,
    # which can be used for composition.
    intermediate_object_ref = model_a.remote("dummy input")
    # We are building a graph:
    #   input -> model_a -> model_b -> output
    final_result = await model_b.remote(intermediate_object_ref)
    return web.Response(text=final_result)


app = web.Application()
app.add_routes([web.get('/single', single_model),
                web.get('/chain', chain_models)])

if __name__ == '__main__':
    web.run_app(app)
```
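The composition in chain_models works because .remote() returns immediately, without resolving the result. As a rough, cluster-free sketch of that behavior, here is a hypothetical FakeHandle class (not part of Ray) whose .remote() returns an asyncio task that resolves any upstream task before running, much like Ray resolves ObjectRef arguments:

```python
import asyncio


class FakeHandle:
    """Hypothetical stand-in for a ServeHandle; illustration only."""

    def __init__(self, name):
        self.name = name

    def remote(self, data):
        async def _invoke():
            # Like Ray resolving an ObjectRef argument before the call.
            if asyncio.isfuture(data) or asyncio.iscoroutine(data):
                resolved = await data
            else:
                resolved = data
            return f"[{self.name} {resolved}]"
        return asyncio.ensure_future(_invoke())


async def main():
    model_a, model_b = FakeHandle("model_a"), FakeHandle("model_b")
    ref = model_a.remote("dummy input")  # returns immediately, not awaited
    return await model_b.remote(ref)     # model_b waits on model_a's result

print(asyncio.run(main()))  # [model_b [model_a dummy input]]
```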
deploy_serve.py

```python
import ray
from ray import serve

ray.init(address="auto")
client = serve.start(detached=True)


# You can define a function, similar to a Ray Task.
def model_a(request):
    return f"[model_a {request.data}]"


# You can also define a class, similar to a Ray Actor.
class ModelB:
    def __init__(self):
        self.identity = "model_b"

    def __call__(self, request):
        return f"[{self.identity} {request.data}]"


client.create_backend("model_a:v0", model_a)
client.create_endpoint("model_a", backend="model_a:v0", route="/model_a")

client.create_backend("model_b:v0", ModelB)
# Notice that you don't need to register an HTTP route!
# Endpoint model_b is now only reachable from Python.
client.create_endpoint("model_b", backend="model_b:v0")
```
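Since the backends only read request.data, they can be smoke-tested locally before deployment by passing any object with a data attribute. The sketch below copies the backend definitions and uses a SimpleNamespace as a minimal stand-in for the real Serve request object (which has more fields):

```python
from types import SimpleNamespace


def model_a(request):
    return f"[model_a {request.data}]"


class ModelB:
    def __init__(self):
        self.identity = "model_b"

    def __call__(self, request):
        return f"[{self.identity} {request.data}]"


# Minimal stand-in for the Serve request object; only .data is read.
fake_request = SimpleNamespace(data="dummy input")

print(model_a(fake_request))   # [model_a dummy input]
print(ModelB()(fake_request))  # [model_b dummy input]
```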