@simon-mo
Created July 15, 2021 23:53
➜ /tmp python demo.py
/Users/simonmo/miniconda3/envs/anyscale/lib/python3.7/site-packages/ray/autoscaler/_private/cli_logger.py:61: FutureWarning: Not all Ray CLI dependencies were found. In Ray 1.4+, the Ray CLI, autoscaler, and dashboard will only be usable via `pip install 'ray[default]'`. Please update your install command.
"update your install command.", FutureWarning)
2021-07-15 16:15:11,661 INFO services.py:1274 -- View the Ray dashboard at http://127.0.0.1:8265
2021-07-15 16:15:31,716 WARNING worker.py:1123 -- The actor or task with ID ffffffffffffffffa557ac06103e58a24181157401000000 cannot be scheduled right now. It requires {CPU: 1.000000} for placement, but this node only has remaining {16.000000/16.000000 CPU, 23.675368 GiB/23.675368 GiB memory, 11.837684 GiB/11.837684 GiB object_store_memory, 1.000000/1.000000 node:192.168.1.69}
. In total there are 0 pending tasks and 1 pending actors on this node. This is likely due to all cluster resources being claimed by actors. To resolve the issue, consider creating fewer actors or increase the resources available to this Ray cluster. You can ignore this message if this Ray cluster is expected to auto-scale or if you specified a runtime_env for this task or actor because it takes time to install.
(pid=63989) my tf version is 1.15.0
1.15.0
2021-07-15 16:16:21,715 WARNING worker.py:1123 -- The actor or task with ID ffffffffffffffff825c402f93a8f8840670af9e01000000 cannot be scheduled right now. It requires {CPU: 1.000000} for placement, but this node only has remaining {16.000000/16.000000 CPU, 23.675368 GiB/23.675368 GiB memory, 11.837684 GiB/11.837684 GiB object_store_memory, 1.000000/1.000000 node:192.168.1.69}
. In total there are 0 pending tasks and 1 pending actors on this node. This is likely due to all cluster resources being claimed by actors. To resolve the issue, consider creating fewer actors or increase the resources available to this Ray cluster. You can ignore this message if this Ray cluster is expected to auto-scale or if you specified a runtime_env for this task or actor because it takes time to install.
(pid=64245) my tf version is 2.5.0
2.5.0
tf1 -> tf2
(pid=63989) 1626391032.5891168 1.15.0 made a tensor Tensor("ones:0", shape=(1,), dtype=float32)
(pid=64245) 1626391032.604613 2.5.0 adding one to tensor tf.Tensor([1.], shape=(1,), dtype=float32)
(pid=63989) 1626391032.606403 1.15.0 got result tensor [2.]
tf2 -> tf1
(pid=64245) 1626391037.612582 2.5.0 made a tensor tf.Tensor([1.], shape=(1,), dtype=float32)
(pid=63989) 1626391037.6160698 1.15.0 adding one to tensor Tensor("Const:0", shape=(1,), dtype=float32)
(pid=64245) 1626391037.6215959 2.5.0 got result tensor [2.]
import ray
import time

# For Anyscale, use an app config that installs the nightly wheel:
# https://beta.anyscale.com/o/anyscale-internal/projects/prj_6xpDEQ3uKwA2fBKH1bWK7qCW/app-config-details/bld_jjXuwgMFXYQZvkGJ1RpgwjM2
# ray.client("anyscale://runtime-env-demo").connect()
ray.init()


@ray.remote
class TensorflowWorker:
    def __init__(self):
        import tensorflow as tf
        print("my tf version is", tf.__version__)
        self.version = tf.__version__

    def get_version(self):
        return self.version

    def call_other(self, other_worker):
        # Use this worker's version of tf.
        import tensorflow as tf
        # Make a tensor; this is a lazy graph tensor in tf1 and an eager tensor in tf2.
        tensor = tf.ones([1])
        print(time.time(), self.version, "made a tensor", tensor)
        # Convert the tensor to a numpy array so it can cross the process boundary.
        if self.version == "1.15.0":
            arr = tf.Session().run(tensor)
        else:
            arr = tensor.numpy()
        computed = ray.get(other_worker.add_one.remote(arr))
        print(time.time(), self.version, "got result tensor", computed)

    def add_one(self, arr):
        # Use this worker's version of tf.
        import tensorflow as tf
        tensor = tf.convert_to_tensor(arr)
        print(time.time(), self.version, "adding one to tensor", tensor)
        tensor = tensor + 1
        # Convert back to a numpy array before returning.
        if self.version == "1.15.0":
            arr = tf.Session().run(tensor)
        else:
            arr = tensor.numpy()
        return arr


tf1 = TensorflowWorker.options(runtime_env={"pip": ["tensorflow==1.15"]}).remote()
print(ray.get(tf1.get_version.remote()))
tf2 = TensorflowWorker.options(runtime_env={"pip": ["tensorflow==2.5"]}).remote()
print(ray.get(tf2.get_version.remote()))

print()
print("tf1 -> tf2")
ray.get(tf1.call_other.remote(tf2))

print()
time.sleep(5)
print("tf2 -> tf1")
ray.get(tf2.call_other.remote(tf1))
@simon-mo (Author)

Note that the warning message

2021-07-15 16:16:21,715	WARNING worker.py:1123 -- The actor or task with ID ffffffffffffffff825c402f93a8f8840670af9e01000000 cannot be scheduled right now. It requires {CPU: 1.000000} for placement, but this node only has remaining {16.000000/16.000000 CPU, 23.675368 GiB/23.675368 GiB memory, 11.837684 GiB/11.837684 GiB object_store_memory, 1.000000/1.000000 node:192.168.1.69}
. In total there are 0 pending tasks and 1 pending actors on this node. This is likely due to all cluster resources being claimed by actors. To resolve the issue, consider creating fewer actors or increase the resources available to this Ray cluster. You can ignore this message if this Ray cluster is expected to auto-scale or if you specified a runtime_env for this task or actor because it takes time to install.

is a known issue and will go away once runtime environments reach GA.

@simon-mo (Author)

Also note that lines printed by different worker processes may not arrive in chronological order, which is why each print statement includes a timestamp.
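If you capture the interleaved output, the timestamps make it easy to restore chronological order after the fact. A minimal sketch, assuming the `(pid=NNN) <unix_timestamp> ...` line format shown in the demo output above (the helper name `sort_worker_logs` is made up for illustration):

```python
import re

# Matches lines like "(pid=63989) 1626391032.5891168 1.15.0 made a tensor ..."
# and captures the unix timestamp that each print statement emitted.
_TS_PATTERN = re.compile(r"^\(pid=\d+\)\s+(\d+(?:\.\d+)?)")

def sort_worker_logs(lines):
    """Order captured worker log lines by their printed timestamp.

    Lines that do not carry a timestamp sort to the end, preserving
    their relative order (sorted() is stable).
    """
    def key(line):
        m = _TS_PATTERN.match(line)
        return float(m.group(1)) if m else float("inf")
    return sorted(lines, key=key)

logs = [
    "(pid=64245) 1626391032.604613 2.5.0 adding one to tensor",
    "(pid=63989) 1626391032.5891168 1.15.0 made a tensor",
    "(pid=63989) 1626391032.606403 1.15.0 got result tensor [2.]",
]
for line in sort_worker_logs(logs):
    print(line)
```

This only reorders whatever lines you happened to capture; it does not change how Ray itself streams worker output.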
