Simon Mo (simon-mo) — GitHub gists
#!/bin/bash
# NOTE: Only works with Python 3.7 on macOS.
# NOTE: Please modify the wheel URL.
# DASK_VERSION=("2021.5.0" "2021.4.1" "2021.4.0" "2021.3.1" "2021.2.0" "2021.1.1" "2020.12.0")
DASK_VERSION=(
"2021.7.0" "2021.6.2" "2021.6.1" "2021.6.0"
"2021.5.1" "2021.5.0" "2021.4.1" "2021.4.0"
"2021.3.1" "2021.2.0" "2021.1.1" "2020.12.0"
)
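The version array above is meant to drive an install-and-test loop. A minimal sketch of that loop, with the install and smoke-test commands left as commented placeholders (fill in the wheel URL per the NOTE in the script):

```shell
# Sketch: install and smoke-test each pinned Dask version in turn.
# The pip and python lines are hypothetical placeholders -- adjust
# the wheel URL and test command for your setup.
DASK_VERSIONS=("2021.7.0" "2021.6.2" "2021.6.1" "2021.6.0")
for version in "${DASK_VERSIONS[@]}"; do
  echo "=== dask==${version} ==="
  # pip install "dask==${version}"                     # hypothetical install step
  # python -c "import dask; print(dask.__version__)"   # hypothetical smoke test
done
```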
➜ /tmp python demo.py
/Users/simonmo/miniconda3/envs/anyscale/lib/python3.7/site-packages/ray/autoscaler/_private/cli_logger.py:61: FutureWarning: Not all Ray CLI dependencies were found. In Ray 1.4+, the Ray CLI, autoscaler, and dashboard will only be usable via `pip install 'ray[default]'`. Please update your install command.
"update your install command.", FutureWarning)
2021-07-15 16:15:11,661 INFO services.py:1274 -- View the Ray dashboard at http://127.0.0.1:8265
2021-07-15 16:15:31,716 WARNING worker.py:1123 -- The actor or task with ID ffffffffffffffffa557ac06103e58a24181157401000000 cannot be scheduled right now. It requires {CPU: 1.000000} for placement, but this node only has remaining {16.000000/16.000000 CPU, 23.675368 GiB/23.675368 GiB memory, 11.837684 GiB/11.837684 GiB object_store_memory, 1.000000/1.000000 node:192.168.1.69}. In total there are 0 pending tasks and 1 pending actors on this node. This is likely due to all cluster resources being claimed by actors. To resolve the issue, cons
simon-mo / mod.py
Created March 13, 2021 00:56
Cloudpickle+Pydantic
from fastapi import FastAPI
from pydantic import BaseModel
class User(BaseModel):
    name: str

user = User(name='a')
app = FastAPI()
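The gist's title pairs cloudpickle with pydantic. A minimal sketch of what that combination enables — serializing a dynamically defined pydantic model by value, which plain pickle cannot do (assumes cloudpickle is installed):

```python
import cloudpickle
from pydantic import BaseModel

def make_user_model():
    # A class defined inside a function is invisible to plain pickle
    # (it can't be found by import path), but cloudpickle serializes
    # the class definition itself by value.
    class User(BaseModel):
        name: str
    return User

User = make_user_model()
payload = cloudpickle.dumps(User)        # bytes that can cross process boundaries
RestoredUser = cloudpickle.loads(payload)
user = RestoredUser(name='a')
```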

Dynamic models in Ray Serve

This is an example of using a Ray actor in Ray Serve to dynamically update models. In the screencast below, I demonstrate that you can dynamically register a new model or update an existing one with the Python API. The updated model is immediately reflected in API calls.

[asciicast screencast]
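The gist keeps the registry in a Ray actor; the core idea can be sketched without Ray as a plain in-process registry whose models can be hot-swapped at runtime (class and method names here are hypothetical, not the gist's API):

```python
import asyncio

class ModelRegistry:
    # In the gist this would be a Ray actor (@ray.remote) so the registry
    # outlives any single request; here it is a plain object for clarity.
    def __init__(self):
        self.models = {}

    def register(self, name, model_fn):
        # Registering an existing name overwrites it, which is what makes
        # in-place model updates immediately visible to callers.
        self.models[name] = model_fn

    async def predict(self, name, payload):
        return self.models[name](payload)

async def main():
    registry = ModelRegistry()
    registry.register("doubler", lambda x: x * 2)
    print(await registry.predict("doubler", 3))     # -> 6
    registry.register("doubler", lambda x: x * 10)  # hot-swap the model
    print(await registry.predict("doubler", 3))     # -> 30

asyncio.run(main())
```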

import ray
import numpy as np
import time
import asyncio
@ray.remote
class Upstream:
    def __init__(self):
        self.q = asyncio.Queue()
# Find the custom resource key (e.g. "node:<ip>") that identifies the
# node this worker is running on, by matching the runtime context's node
# ID against the entries returned by ray.nodes().
this_node_key = next(
    filter(
        lambda resource_key: resource_key.startswith("node:"),
        next(
            filter(
                lambda node: (
                    node["NodeID"] == ray.get_runtime_context().node_id.hex()
                ),
                ray.nodes(),
            )
        )["Resources"],
    )
)
FROM rayproject/ray:nightly-cpu
# Setup Nodejs and NPM
RUN sudo apt-get update && sudo apt-get install -y curl
RUN curl -sL https://deb.nodesource.com/setup_14.x | sudo -E bash -
RUN sudo apt-get install -y nodejs
RUN mkdir -p $HOME/workspace
RUN git clone https://github.com/ray-project/ray.git $HOME/workspace/ray
WORKDIR $HOME/workspace
cluster_name: default
min_workers: 0
max_workers: 5
initial_workers: 0
autoscaling_mode: default
target_utilization_fraction: 1.0
idle_timeout_minutes: 99999
provider:
import os
import numpy as np
import ray
import asyncio
from ray.cluster_utils import Cluster
cluster = Cluster()
head_node = cluster.add_node()
child_node = cluster.add_node(resources={"OTHER_NODE": 100})

Gunicorn, Aiohttp, and Ray Serve

This is an example deployment that integrates aiohttp with a Ray Serve handle.

Steps:

  1. ray start --head
  • This starts a Ray cluster in the background.
  2. python deploy_serve.py
  • This deploys two endpoints to Ray Serve: model_a and model_b.
  3. gunicorn aiohttp_app:app --worker-class aiohttp.GunicornWebWorker --workers 2 --bind localhost:8001
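A minimal sketch of what aiohttp_app.py might look like. The module layout and handler name are assumptions, and the Ray Serve handle call is shown only as a comment since it requires a running cluster:

```python
from aiohttp import web

# In the real app, each gunicorn worker would obtain a Serve handle once
# at startup, e.g. (hypothetical; the exact API varies by Ray version):
#   from ray import serve
#   handle = serve.get_handle("model_a")

async def predict(request):
    model = request.match_info["model"]  # e.g. "model_a" or "model_b"
    # result = await handle.remote(await request.text())  # forward to Ray Serve
    return web.json_response({"endpoint": model})

app = web.Application()
app.add_routes([web.get("/{model}", predict)])

# Run under gunicorn as in step 3:
#   gunicorn aiohttp_app:app --worker-class aiohttp.GunicornWebWorker --workers 2
```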