serve-1.0-migration-guide.md

    from ray import serve

    class MyBackend:
        def __call__(self, flask_request):
            ...

    serve.init()
    serve.create_backend("backend:v1", MyBackend)
    serve.create_endpoint("endpoint", backend="backend:v1")
    serve.set_traffic(...)

    - serve.init()
    + client = serve.start()
    - serve.create_backend("backend:v1", MyBackend)
    + client.create_backend("backend:v1", MyBackend)
    - serve.create_endpoint("endpoint", backend="backend:v1")
    + client.create_endpoint("endpoint", backend="backend:v1")
    - serve.set_traffic(...)
    + client.set_traffic(...)

Serve

New APIs

The serve.client API makes it easy to manage the lifetime of multiple Serve clusters appropriately. (#10409)
- This is a breaking change. See our migration guide for the steps to update your existing applications.
- Replace serve.init with serve.start or serve.connect, and call API methods on the client object that serve.start or serve.connect returns.
- You can start a cluster-wide Serve instance with serve.start(detached=True) and connect to it later with serve.connect(). By default, serve.start() starts an ephemeral cluster that is torn down when the Python script exits.
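The lifecycle described above can be sketched as follows. This is a minimal sketch that uses only the calls named in these notes (serve.start, serve.connect, client.create_backend, client.create_endpoint). The helper names deploy_detached and reconnect are hypothetical, and the serve module is passed in as a parameter so that the flow can be exercised without a running Ray cluster.

```python
def deploy_detached(serve, backend_cls):
    """Start a cluster-wide Serve instance and register one backend and endpoint.

    `serve` is expected to be the module from `from ray import serve`;
    this helper itself is illustrative and not part of Serve.
    """
    # detached=True asks for an instance that outlives this script.
    client = serve.start(detached=True)
    client.create_backend("backend:v1", backend_cls)
    client.create_endpoint("endpoint", backend="backend:v1")
    return client


def reconnect(serve):
    """Attach to the already running detached instance from a later script."""
    return serve.connect()
```

In a real script you would run `from ray import serve` and pass that module in. Calling serve.start() with no arguments instead gives the ephemeral default, which goes away when the script exits.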
The ServeHandle API was revamped. (#10527, #10483)
- Your callable only needs to accept a single request argument instead of multiple keyword arguments.
- The request is a flask.Request if it comes from the web and a ServeRequest if it comes from a Python ServeHandle. ServeRequest has an API similar to the Flask request (e.g. request.args, request.data, request.json).
- When you pass arguments to handle.remote, the keyword arguments are injected into request.args and the first positional argument is injected into request.data or request.json.
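To illustrate the argument mapping above, here is a minimal sketch in plain Python. SketchRequest is a hypothetical stand-in, not Serve's ServeRequest class; it only mimics the args, data, and json attributes named above. sketch_remote is likewise a hypothetical helper that mirrors how handle.remote is described as filling them in, and it calls the backend synchronously for simplicity.

```python
class SketchRequest:
    """Hypothetical stand-in exposing only the attributes named in these notes."""

    def __init__(self, data=None, kwargs=None):
        self.data = data                # first positional argument
        self.json = data                # assumption: same payload exposed as json
        self.args = dict(kwargs or {})  # keyword arguments


def sketch_remote(backend, *positional, **kwargs):
    """Mimic the described mapping: kwargs -> request.args, first arg -> request.data."""
    data = positional[0] if positional else None
    return backend(SketchRequest(data, kwargs))


class Greeter:
    # A single `request` argument is enough for both web and handle calls.
    def __call__(self, request):
        name = request.args.get("name", "world")
        return f"hello {name}, payload={request.data!r}"
```

For example, `sketch_remote(Greeter(), {"k": 1}, name="serve")` reaches the callable with `request.args == {"name": "serve"}` and `request.data == {"k": 1}`. Because the callable only reads request.args and request.data, the same class should also work when the request is a Flask request from the web.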
ASGI middleware support: you can enable CORS and any other Starlette middleware by passing it to serve.start(http_middleware=[...]). (#10529, #9940)

API removal

The serve.metric module is removed, along with serve.stat and the serve.init(metric_exporter=...) API. Serve now exports metrics in Prometheus format through Ray's built-in metrics exporter. A backend latency histogram and router queue sizes were added. (#10185, #10535)
The SLO ordering code path is removed. The relative_slo_ms and absolute_slo_ms arguments for HTTP and ServeHandle are removed. (#10075)

Improvements

Serve APIs are fully typed. (#10205, #10288)
Backend configs are now typed and validated via Pydantic. (#10559, #10389)
Progress toward an application-level backend autoscaler. (#9955, #9845, #9828)
New architecture page in documentation. (#10204)

simon-mo/serve-1.0-migration-guide.md

Serve migration guide

Applies to all Serve users. Refactor your Serve API call.

Applies to batching, ServeHandle users.
