Ray use cases
Now that we have a sense of what Ray is in theory, it's worth discussing what makes Ray so useful in practice. On this page, we will look at the ways that individuals, organizations, and companies leverage Ray to build their AI applications.
First, let's explore how Ray scales common ML workloads. Then, we will look at some advanced implementations.
Scaling common ML workloads
Parallel training of many models
When each model you want to train fits on a single GPU, Ray can assign every training run to a separate Ray task. This way, all available workers run independent training jobs in parallel rather than a single worker running them sequentially.
Distributed training of large models
In contrast to training many models, model parallelism partitions a large model across many machines for training. Ray Train has built-in abstractions for distributing shards of models and running training in parallel.
Managing parallel hyperparameter tuning experiments
Running multiple hyperparameter tuning experiments is a natural fit for distributed computing because each experiment is independent of the others. Ray Tune handles the hard parts of distributing hyperparameter optimization and provides key features such as checkpointing the best result, optimized scheduling, specifying search algorithms, and more.
Reinforcement learning
Ray RLlib offers support for production-level, distributed reinforcement learning workloads while maintaining unified and simple APIs for a large variety of industry applications. Below, you can see the decentralized distributed proximal policy optimization (DD-PPO) architecture, supported by Ray RLlib, where sampling and training are done on worker GPUs.
![Decentralized distributed proximal policy optimization (DD-PPO) architecture, supported by Ray RLlib, where sampling and training are done on worker GPUs](/img/ray/rllib_use_case.png)
Batch inference on CPUs and GPUs
Performing inference on incoming batches of data can be parallelized by exporting the architecture and weights of a trained model to the shared object store. Using these model replicas, Ray scales predictions on batches across workers, for example with Ray AIR's BatchPredictor for batch inference.
Multi-model composition for model serving
Ray Serve supports complex model deployment patterns requiring the orchestration of multiple Ray actors, where different actors provide inference for different models. Serve handles both batch and online inference and can scale to thousands of models in production.
ML platform
Merlin is Shopify's ML platform built on Ray. It enables fast iteration and scaling of distributed applications such as product categorization and recommendations.
Spotify uses Ray for advanced applications that include personalizing content recommendations for home podcasts and personalizing Spotify Radio track sequencing.
Implementing advanced ML workloads
Alpa - training very large models with Ray
Alpa is a Ray-native library that automatically partitions, schedules, and executes the training and serving computation of very large deep learning models on hundreds of GPUs.
In Alpa's parallelization plans for a computational graph, A, B, C, and D are operators, and each color represents a different device (e.g., a GPU) executing a partition or the full operator, leveraging Ray actors.
Exoshuffle - large scale data shuffling
In Ray 2.0, Exoshuffle is integrated with Ray Data to provide an application-level shuffle system that outperforms Spark and achieves 82% of theoretical performance on a 100TB sort on 100 nodes.
Hamilton - feature engineering
Hamilton is an open source dataflow micro-framework that manages feature engineering for time series forecasting. Developed at StitchFix, this library provides scalable parallelism, where each Hamilton function is distributed and data is limited by machine memory.
Riot Games - reinforcement learning
Riot Games uses Ray to build bots that play new battles at various skill levels. These reinforcement learning agents provide additional signals to their designers to deliver the best experiences for players.