Design Patterns, Anti-patterns and Best Practices
Ray has a myriad of scaling design patterns (opens in a new tab) for tasks & actors, including the Tree of Actors Pattern discussed earlier.
Some others Patterns include:
- Nested tasks to achieve nested parallelism
- Generators to reduce heap memory usage
ray.wait
to limit the number of pending tasks- Resources to limit the number of concurrently running tasks
- Asyncio to run actor methods concurrently
- An actor to synchronize other tasks and actors
- A supervisor actor to manage a tree of actors
- Pipelining to increase throughput
Ray also has a myriad of anti-patterns (opens in a new tab) to avoid, including:
- Returning
ray.put()
ObjectRefs from a task harms performance and fault tolerance - Calling
ray.get()
in a loop harms parallelism - Calling
ray.get()
unnecessarily harms performance - Processing results in submission order using
ray.get()
increases runtime - Fetching too many objects at once with
ray.get()
causes failure - Over-parallelizing with too fine-grained tasks harms speedup
- Redefining the same remote function or class harms performance
- Passing the same large argument by value repeatedly harms performance
- Closure capturing large objects harms performance
- Using global variables to share state between tasks and actors
Some Best Practices
Because Ray's core APIs are simple and flexible, first time users can trip upon certain API calls in Ray's usage patterns. This short tips & tricks will insure you against unexpected results. Below we briefly explore a handful of API calls and their best practices.
Numpy
TLDR: Use Numpy as much as possible.
Since Ray processes do not share memory space, data transferred between workers and nodes will need to serialized and deserialized. Ray uses the Plasma object store (opens in a new tab) to efficiently transfer objects across different processes and different nodes. Numpy arrays in the object store are shared between workers on the same node (zero-copy deserialization). Ray optimizes for numpy arrays by using Pickle protocol 5 with out-of-band data. Whenever possible, use numpy arrays or Python collections of numpy arrays for maximum performance.
Use @ray.remote
and @ray.method
to return multiple arguments
Often, you may wish to return more than a single argument from a Ray Task, or return more than a single value from an Ray Actor's method. Let's look at some examples how you do it.
@ray.remote(num_returns=3)
def tuple3(id: str, lst: List[float]) -> Tuple[str, int, float]:
one = id.capitalize()
two = random.randint(5, 10)
three = sum(lst)
return (one, two, three)
# Return three object references with three distinct values in each
x_ref, y_ref, z_ref = tuple3.remote("ray rocks!", [2.2, 4.4, 6.6])
# Fetch the list of references
x, y, z = ray.get([x_ref, y_ref, z_ref])
print(f'{x}, {y}, {z:.2f}')
A slight variation of the above example is pack all values in a single return, and then unpack them.
@ray.remote(num_returns=1)
def tuple3_packed(id: str, lst: List[float]) -> Tuple[str, int, float]:
one = id.capitalize()
two = random.randint(5, 10)
three = sum(lst)
return (one, two, three)
# Returns one object references with three values in it
xyz_ref = tuple3_packed.remote("ray rocks!", [2.2, 4.4, 6.6])
# Fetch from a single object ref and unpack into three values
x, y, z = ray.get(xyz_ref)
print(f'({x}, {y}, {z:.2f})')
Let's do the same for an Ray actor method, except here we are using a decorator @ray.method(num_returns=3)
to decorate a Ray actor's method:
@ray.remote
class TupleActor:
@ray.method(num_returns=3)
def tuple3(self, id: str, lst: List[float]) -> Tuple[str, int, float]:
one = id.capitalize()
two = random.randint(5, 10)
three = sum(lst)
return (one, two, three)
# Create an instance of an actor
actor = TupleActor.remote()
x_ref, y_ref, z_ref = actor.tuple3.remote("ray rocks!", [2.2, 4.4, 5.5])
x, y, z = ray.get([x_ref, y_ref, z_ref])
print(f'({x}, {y}, {z:.2f})')
Resources:
There is a advanced and comprehensive list of all Ray design patterns and anti-design patterns (opens in a new tab) you can explore at after the class at home.
Additional resources on best practices: