There are generally two ways of serving machine learning applications at scale. The first is to wrap your application in a traditional web server. This approach is easy to set up, but it makes it hard to scale each component independently and can easily lead to high memory usage and concurrency issues. The other approach is to use a cloud-hosted solution
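As a rough illustration of the first approach, the sketch below wraps a stand-in model in a plain HTTP server using only the Python standard library. The names `predict`, `PredictHandler`, `query`, and `serve_once`, the `/predict` path, and the JSON payload shape are all hypothetical and chosen for this example; they do not come from any particular serving framework.

```python
import http.client
import json
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


def predict(features):
    """Hypothetical stand-in for a real model: returns the sum of the inputs."""
    return sum(features)


class PredictHandler(BaseHTTPRequestHandler):
    """Parses a JSON request body and answers with the model output."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length))
        body = json.dumps({"prediction": predict(payload["features"])}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet; the default logs every request


def query(port, features):
    """Sends one POST request to the local server and decodes the reply."""
    conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
    try:
        conn.request(
            "POST",
            "/predict",
            body=json.dumps({"features": features}),
            headers={"Content-Type": "application/json"},
        )
        return json.loads(conn.getresponse().read())
    finally:
        conn.close()


def serve_once(features):
    """Starts the server on a free port, issues one request, then shuts down."""
    server = ThreadingHTTPServer(("127.0.0.1", 0), PredictHandler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        return query(server.server_address[1], features)
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(serve_once([1, 2, 3]))
```

In this layout the model lives in the same process as the HTTP handling, so adding capacity for the model means replicating the whole server, which is one way the scaling and memory concerns above can show up.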
{"config": {"view": {"continuousWidth": 400, "continuousHeight": 300}}, "layer": [{"mark": "line", "encoding": {"x": {"type": "ordinal", "field": "c", "sort": null}, "y": {"type": "quantitative", "aggregate": "mean", "field": "q"}}, "width": 800}, {"mark": "point", "encoding": {"tooltip": [{"type": "nominal", "field": "c"}, {"type": "quantitative", "field": "q"}], "x": {"type": "ordinal", "field": "c", "sort": null}, "y": {"type": "quantitative", "field": "q"}}, "width": 800}], "data": {"name": "data-2e12868c014f84ade6358c48b9cfb4f8"}, "$schema": "https://vega.github.io/schema/vega-lite/v4.8.1.json", "datasets": {"data-2e12868c014f84ade6358c48b9cfb4f8": [{"c": "be647b6", "q": 995.39}, {"c": "be647b6", "q": 961.65}, {"c": "be647b6", "q": 997.86}, {"c": "be647b6", "q": 956.24}, {"c": "be647b6", "q": 989.76}, {"c": "a25472c", "q": 1025.83}, {"c": "a25472c", "q": 1028.85}, {"c": "a25472c", "q": 1026.72}, {"c": "a25472c", "q": 1039.17}, {"c": "a25472c", "q": 1010.98}, {"c": "761b584", "q": 1722.63}, {"c": "761b584
WARNING: autodoc: failed to import function 'rllib.utils.annotations.PublicAPI' from module 'ray'; the following exception was raised:
Traceback (most recent call last):
  File "/Users/simonmo/miniconda3/lib/python3.6/site-packages/sphinx/ext/autodoc/importer.py", line 32, in import_module
    return importlib.import_module(modname)
  File "/Users/simonmo/miniconda3/lib/python3.6/importlib/__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 994, in _gcd_import
  File "<frozen importlib._bootstrap>", line 971, in _find_and_load
  File "<frozen importlib._bootstrap>", line 955, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 665, in _load_unlocked
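import numpy as np

np.random.seed(42)  # fixed seed so the generated samples are reproducible


def gamma(mean, cv, size):
    # cv == 0 means no variability: every sample equals the mean.
    if cv == 0.0:
        return np.ones(size) * mean
    else:
        # Shape 1/cv and scale cv*mean give mean `mean` and variance cv*mean**2,
        # so the coefficient of variation of the samples is sqrt(cv).
        return np.random.gamma(1.0 / cv, cv * mean, size=size)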
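To make the parameterization of the sampler above concrete, here is a minimal sketch that draws from a gamma distribution with shape `1/cv` and scale `cv*mean` and reports the empirical mean and coefficient of variation. It deliberately swaps in the standard library's `random.gammavariate` for `np.random.gamma` so that it runs without NumPy; `gamma_samples` and `summarize` are hypothetical helper names, and the chosen `mean`, `cv`, and sample count are arbitrary.

```python
import math
import random
import statistics


def gamma_samples(mean, cv, size, rng):
    """Stdlib counterpart of the NumPy sampler: shape 1/cv, scale cv*mean."""
    if cv == 0.0:
        # No variability requested: return a constant sequence.
        return [float(mean)] * size
    return [rng.gammavariate(1.0 / cv, cv * mean) for _ in range(size)]


def summarize(samples):
    """Returns the sample mean and the sample coefficient of variation."""
    sample_mean = statistics.fmean(samples)
    sample_cv = statistics.pstdev(samples) / sample_mean
    return sample_mean, sample_cv


rng = random.Random(42)  # local generator so the global random state is untouched
sample_mean, sample_cv = summarize(gamma_samples(10.0, 4.0, 50_000, rng))
print(f"mean={sample_mean:.2f} cv={sample_cv:.2f} sqrt(cv_param)={math.sqrt(4.0):.2f}")
```

With this parameterization the sample mean should land near `mean`, while the empirical coefficient of variation should land near the square root of the `cv` argument rather than `cv` itself.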
Checking ===== BUILD.bazel
Found command {'rule': 'core_worker-jni-darwin-compat', 'cmd': 'cp $< $@'}
Found conditional {'rule': 'redis', 'condition': '@bazel_tools//src/conditions:windows', 'cmd': '\n unzip -q -o -- $(location @com_github_tporadowski_redis_bin//file) redis-server.exe redis-cli.exe &&\n mv -f -- redis-server.exe $(location redis-server) &&\n mv -f -- redis-cli.exe $(location redis-cli)\n '}
Found conditional {'rule': 'redis', 'condition': '//conditions:default', 'cmd': '\n tmpdir="redis.tmp" &&\n path=$(location @com_github_antirez_redis//:file) &&\n cp -p -L -R -- "$${path%/*}" "$${tmpdir}" &&\n chmod +x "$${tmpdir}"/deps/jemalloc/configure &&\n parallel="$$(getconf _NPROCESSORS_ONLN || echo 1)"\n make -s -C "$${tmpdir}" -j"$${parallel}" V=0 CFLAGS="$${CFLAGS-} -DLUA_USE_MKSTEMP -Wno-pragmas -Wno-empty-body" &&\n mv "$${tmpdir}"/src/redis-server $(location redis-server)
from ray import serve
import requests
import time
import ray


class MultiMethodDemo:
    def method_a(self, flask_requests, *, keyword=[]):
        if serve.context.web:
            print("Method A: Got flask requests batch size", serve.context.batch_size)
        else:
import torch
import torch.optim as optim
import torchvision
import torchvision.transforms as transforms
import torch.nn as nn

use_cuda = False
if torch.cuda.is_available():
    use_cuda = True