Max Fitton mfitton

Overview

When multiple Ray clusters run processes on the same node, each cluster's dashboard should show metrics only for the processes in that cluster. Currently, the reporter process that collects these metrics (there is one per unique (cluster, node) pair) fetches metrics for every Ray worker on the node, regardless of which cluster it belongs to.

I assume that we still want a separate reporter process for each (cluster, node) pair, rather than switching to a single reporter process per node. This is easier to implement given our current process handling, and it allows per-cluster configuration of reporting, which we do not use now but I think we should aim to support. The downside is that some metrics we collect do not differ between clusters on a node, such as CPU utilization, and these would be monitored by N processes rather than one.
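
To make the desired behaviour concrete, here is a minimal sketch of per-cluster filtering. It is illustrative only and is not one of the proposed solutions: `WorkerStats`, its `cluster_id` field, and `stats_for_cluster` are hypothetical names, and how a worker would actually be associated with its cluster is the open question.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class WorkerStats:
    """Hypothetical per-process record; not a real Ray type."""
    pid: int
    cluster_id: str      # assumed tag identifying the owning cluster
    cpu_percent: float


def stats_for_cluster(workers: Iterable[WorkerStats], cluster_id: str) -> List[WorkerStats]:
    """Keep only the workers that belong to the given cluster."""
    return [w for w in workers if w.cluster_id == cluster_id]


# Two clusters share one node; each reporter should see only its own workers.
node_workers = [
    WorkerStats(pid=101, cluster_id="cluster-a", cpu_percent=12.5),
    WorkerStats(pid=102, cluster_id="cluster-b", cpu_percent=40.0),
    WorkerStats(pid=103, cluster_id="cluster-a", cpu_percent=3.0),
]
cluster_a_pids = [w.pid for w in stats_for_cluster(node_workers, "cluster-a")]
```

Node-level values such as total CPU utilization would not pass through a filter like this, which is where the duplicated monitoring mentioned above comes from.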

Proposed Solutions

Of the two solutions, I th

Parallel Ray Tracing Made Simple

Overview

Ray tracing is a method of rendering 3D scenes into 2D images using computer simulation. It is an inherently parallel problem. In this post I explore how a few lines of code from the Ray framework let you distribute this task across your computer’s CPUs, or even across a cluster of machines, achieving speedups roughly linear in the number of cores employed.

Key Definitions

Before diving into the sample implementation that we will modify to run in parallel, I want to define a couple of terms for readers who may not be familiar with them.

Ray tracing is a computer graphics technique that generates 2D images from 3D environments by imitating the way a camera captures photographs. However, while a physical camera takes in light to capture an image of its surroundings, a ray tracer runs the process in reverse. It sends rays out from its “camera” through a 2D plane whose coordinates correspond to pixels in an image. These rays then may inter

Dimensions (px)	Time without Ray (s)	Time with Ray (s)
600x450	48.0	10.8
1200x900	176.7	30.16
1600x1200	291.2	44.9
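
As a quick check on the timings above, the speedup for each image size is the time without Ray divided by the time with Ray. The sketch below only restates the table's numbers; the names `timings` and `speedup` are illustrative.

```python
# Timings copied from the table: (dimensions, seconds without Ray, seconds with Ray).
timings = [
    ("600x450", 48.0, 10.8),
    ("1200x900", 176.7, 30.16),
    ("1600x1200", 291.2, 44.9),
]


def speedup(serial_s: float, parallel_s: float) -> float:
    """Ratio of serial to parallel wall-clock time."""
    return serial_s / parallel_s


speedups = {dim: speedup(t_serial, t_parallel) for dim, t_serial, t_parallel in timings}

for dim, s in speedups.items():
    print(f"{dim}: {s:.2f}x")
```

This works out to roughly 4.4x, 5.9x and 6.5x. The number of cores used for these runs is not stated here, so the sketch does not attempt to compute parallel efficiency.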

Fixing the node removed from cluster bug

Repro Script

import ray
from ray.cluster_utils import Cluster

cluster = Cluster()
cluster.add_node()
cluster.add_node()

	import ray
	import torch

	ray.init()


	@ray.remote(num_gpus=1)
	def foo():
	    # Element-wise arithmetic on the GPU assigned to this task.
	    a = torch.rand(100000, device='cuda')
	    b = torch.rand(100000, device='cuda')
	    c = a * b * a * b * a * b * a * b
	    return c

	"""
	MIT License

	Copyright (c) 2017 Cyrille Rossant

	Permission is hereby granted, free of charge, to any person obtaining a copy
	of this software and associated documentation files (the "Software"), to deal
	in the Software without restriction, including without limitation the rights
	to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
	copies of the Software, and to permit persons to whom the Software is

	for i, x in enumerate(np.linspace(S[0], S[2], w)):
	    for j, y in enumerate(np.linspace(S[1], S[3], h)):
	        ...
	        # Loop through initial and secondary rays.
	        while depth < depth_max:
	            traced = trace_ray(rayO, rayD)
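
The nested loops visit one screen coordinate per pixel. As a rough, dependency-free sketch of that step, the snippet below builds the grid and computes a unit direction for the ray through each point. The camera position `O`, the screen bounds `S`, the tiny image size, and the helper names `linspace` and `primary_ray_direction` are assumptions made for illustration, not values taken from the original script.

```python
import math
from typing import Dict, List, Tuple

Vec3 = Tuple[float, float, float]


def linspace(start: float, stop: float, n: int) -> List[float]:
    """Evenly spaced values from start to stop inclusive (like np.linspace)."""
    if n == 1:
        return [start]
    step = (stop - start) / (n - 1)
    return [start + i * step for i in range(n)]


def primary_ray_direction(origin: Vec3, x: float, y: float) -> Vec3:
    """Unit vector from the camera through the point (x, y, 0) on the screen plane."""
    dx, dy, dz = x - origin[0], y - origin[1], 0.0 - origin[2]
    length = math.sqrt(dx * dx + dy * dy + dz * dz)
    return (dx / length, dy / length, dz / length)


w, h = 4, 3                      # tiny image so the grid is easy to inspect
O: Vec3 = (0.0, 0.0, -1.0)       # assumed camera position
S = (-1.0, -0.75, 1.0, 0.75)     # assumed screen bounds: x0, y0, x1, y1

# Map each pixel index (i, j) to the direction of its primary ray.
rays: Dict[Tuple[int, int], Vec3] = {}
for i, x in enumerate(linspace(S[0], S[2], w)):
    for j, y in enumerate(linspace(S[1], S[3], h)):
        rays[(i, j)] = primary_ray_direction(O, x, y)
```

Each pixel's ray is independent of every other pixel's, which is what makes the work easy to split across processes.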

	@ray.remote
	def trace_rays_with_bounces(xys):
	    results = []
	    for (x, y) in xys:
	        # ... Snipped, same code as before ...
	        results.append(np.clip(col, 0, 1))
	    return results

	for i, x in enumerate(np.linspace(S[0], S[2], w)):
	    for j, y in enumerate(np.linspace(S[1], S[3], h)):
	        coords.append((i, j))
	        next_task_xys.append((x, y))
	        if len(next_task_xys) == CHUNK_SIZE:
	            results.append(
	                trace_rays_with_bounces.remote(next_task_xys)
	            )
	            next_task_xys = []
	if next_task_xys:
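
The batching pattern in the loop above can be exercised without a cluster. In this sketch a plain function, `trace_chunk`, stands in for the Ray task, and `chunked` is a hypothetical helper; both names and the value of `CHUNK_SIZE` are assumptions. The point it illustrates is that a final partial batch has to be flushed after the loop, which is presumably what the trailing `if next_task_xys:` check handles.

```python
from typing import Iterable, Iterator, List, Tuple

Point = Tuple[float, float]
CHUNK_SIZE = 4  # assumed value for illustration


def chunked(points: Iterable[Point], size: int) -> Iterator[List[Point]]:
    """Yield lists of up to `size` points, including a final partial list."""
    batch: List[Point] = []
    for p in points:
        batch.append(p)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:  # flush the leftover points that did not fill a whole chunk
        yield batch


def trace_chunk(xys: List[Point]) -> List[float]:
    """Stand-in for the remote task: returns one dummy value per point."""
    return [x + y for (x, y) in xys]


points = [(float(i), float(j)) for i in range(3) for j in range(3)]  # 9 points
chunks = list(chunked(points, CHUNK_SIZE))
results = [value for chunk in chunks for value in trace_chunk(chunk)]
```

With Ray, each batch would instead be submitted with `trace_rays_with_bounces.remote(batch)` and the returned object references resolved with `ray.get`. Larger chunks mean fewer tasks and less scheduling overhead, at the cost of coarser load balancing.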