Custom Docker Images

How do we package and serve our own code inside a custom Docker image? We will need to:

  • Extend the official Ray Docker images with your own dependencies
  • Package your Serve application in a custom Docker image instead of a runtime_env
  • Use custom Docker images with KubeRay

Extending the Ray Docker image

The rayproject organization maintains Docker images with the dependencies needed to run Ray; the rayproject/ray and rayproject/ray-ml repos host the Docker images used throughout this doc. For instance, one sample RayService config uses the rayproject/ray:2.5.0 image hosted by rayproject/ray.

You can extend these images and add your own dependencies by using them as a base layer in a Dockerfile. In general, the rayproject/ray images contain only the dependencies needed to import Ray and the Ray libraries, while the rayproject/ray-ml images contain additional dependencies (e.g., PyTorch and HuggingFace) that are useful for machine learning. You can extend images from either repo to build your custom images.
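
For example, a minimal sketch that extends the ray-ml image instead of the base image (the pinned transformers version is just an illustrative extra dependency, not something this doc's app needs):

# extend the ML image when you want the heavier ML stack as a base
FROM rayproject/ray-ml:2.7.0
RUN pip install transformers==4.33.2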

Say we had a demo Serve application with dependencies like pydantic and fastapi:

main.py
from ray import serve
from fastapi import FastAPI
from pydantic import BaseModel
 
class PnLItem(BaseModel):
    pnl: float
 
# Define a FastAPI app & wrap it in a deployment with a route handler.
app = FastAPI(
    title="Test API",
    description="API Test",
    version="0.1.0"
)
 
@serve.deployment
@serve.ingress(app)
class TestAPI:
    @app.get("/")
    async def root(self):
        return "Welcome to the API"
 
    @app.post("/double")
    async def double(self, request: PnLItem):
        pnl_value = request.pnl
        return {"pnl_double": pnl_value * 2}
 
# Bind the deployment into a Serve application; the config's
# `import_path: main:app` refers to this object.
app = TestAPI.bind()
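
Before baking this into an image, you can sanity-check the app locally (assuming ray[serve], fastapi, and pydantic are installed):

serve run main:app
 
# in a second terminal; should return {"pnl_double": 42.0}
curl -X POST http://localhost:8000/double \
  -H "Content-Type: application/json" \
  -d '{"pnl": 21.0}'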

We can create a Dockerfile that extends the rayproject/ray image. Use the WORKDIR and COPY instructions in the Dockerfile to install the Serve application in the image, and add the dependencies needed to run it:

# pull official base image
FROM rayproject/ray:2.7.0
 
# set work directory
WORKDIR /serve_app
 
# set environment variables
ENV PYTHONDONTWRITEBYTECODE=1
ENV PYTHONUNBUFFERED=1
 
# copy application code
COPY main.py /serve_app/main.py
 
# install dependencies
RUN pip install --upgrade pip setuptools wheel \
    && pip install fastapi==0.103.1 pydantic==1.10.13

Then, you can build this image and push it to Docker Hub so it can be pulled later:

docker build . -t shavvimal/custom_ray:latest
docker image push shavvimal/custom_ray:latest
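
A quick sanity check that the image contains everything the app needs: since main.py sits in the WORKDIR, importing it inside the container exercises the ray, fastapi, and pydantic imports (an optional sketch):

docker run --rm shavvimal/custom_ray:latest python -c "import main"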

Using custom Docker images in KubeRay

KubeRay starts Ray with the ray start command inside the WORKDIR directory defined in the Dockerfile. All the Ray Serve actors can then import any modules in that directory. Because the Serve file is COPYed into the WORKDIR, the Serve deployments have access to the Serve code without needing a runtime_env. You can also add any other dependencies your Serve app needs to the WORKDIR directory.

Run these custom Docker images in KubeRay by adding them to the RayService config. Make the following changes:

  1. Set the rayVersion in the rayClusterConfig to the Ray version used in your custom Docker image.
  2. Set the ray-head container's image to the custom image's name on Docker Hub.
  3. Set the ray-worker container's image to the custom image's name on Docker Hub.
  4. Update the serveConfigV2 field to remove any runtime_env dependencies that are already in the container.

apiVersion: ray.io/v1alpha1
kind: RayService
metadata:
  name: rayservice-dummy
spec:
  serviceUnhealthySecondThreshold: 300
  deploymentUnhealthySecondThreshold: 300
  serveConfigV2: |
    applications:
      - name: app1
        route_prefix: /
        import_path: main:app
        deployments:
          - name: TestAPI
  rayClusterConfig:
    rayVersion: "2.7.0" # Should match Ray version in the containers
    headGroupSpec:
      rayStartParams:
        dashboard-host: "0.0.0.0"
      template:
        spec:
          containers:
            - name: ray-head
              image: shavvimal/custom_ray:latest
              resources:
                limits:
                  cpu: 2
                  memory: 2Gi
                requests:
                  cpu: 2
                  memory: 2Gi
              ports:
                - containerPort: 6379
                  name: gcs-server
                - containerPort: 8265 # Ray dashboard
                  name: dashboard
                - containerPort: 10001
                  name: client
                - containerPort: 8000
                  name: serve
    workerGroupSpecs:
      - replicas: 1
        minReplicas: 1
        maxReplicas: 1
        groupName: small-group
        rayStartParams: {}
        template:
          spec:
            containers:
              - name: ray-worker
                image: shavvimal/custom_ray:latest
                lifecycle:
                  preStop:
                    exec:
                      command: ["/bin/sh", "-c", "ray stop"]
                resources:
                  limits:
                    cpu: "1"
                    memory: "2Gi"
                  requests:
                    cpu: "500m"
                    memory: "2Gi"

Remember that you can use

serve build main:app -o config/serve-deployment.yaml

to generate the applications section of the serveConfigV2 YAML.
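
You can also smoke-test the generated config against a local Ray cluster before moving to KubeRay (assuming ray[serve] is installed locally):

ray start --head
serve deploy config/serve-deployment.yaml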

Adding Environment Variables

See Define Environment Variables for a Container in the Kubernetes docs. In general, you pass environment variables to the Pod's containers, not to the Deployment itself. For KubeRay, that means adding an env list to the spec.template.spec.containers section of the rayClusterConfig. For a plain Kubernetes Deployment, it looks like this:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  labels:
    app: nginx
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
        - name: nginx
          image: nginx:1.7.9
          env:
            - name: MY_VAR
              value: MY_VALUE
            - name: DEMO_GREETING
              value: "Hello from the environment"
            - name: DEMO_FAREWELL
              value: "Such a sweet sorrow"
          ports:
            - containerPort: 80
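
Applied to the RayService from above, the same env list goes under each Ray container, e.g. for the head node (a sketch; the variable is just a placeholder):

    headGroupSpec:
      template:
        spec:
          containers:
            - name: ray-head
              image: shavvimal/custom_ray:latest
              env:
                - name: DEMO_GREETING
                  value: "Hello from the environment"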

Running Local Docker Images

Sometimes you may not want to push an image to Docker Hub, and instead want to run a pod based on an image you just built on your PC. To do this, you need two things: make sure your Kubernetes cluster can access the image, and set the imagePullPolicy to Never in the rayClusterConfig.

If you use kind as a local Kubernetes playground, Docker images can be loaded into your cluster nodes with:

kind load docker-image my-custom-image-0 my-custom-image-1

For example, if I had just built the custom_ray:latest image, instead of pushing it up I can do:

kind load docker-image custom_ray:latest

You can get a list of images present on a cluster node by using docker exec:

docker exec -it my-node-name crictl images

Here my-node-name is the name of the Docker container acting as the node (e.g. kind-control-plane). The Kubernetes default pull policy is IfNotPresent, unless the image tag is :latest or omitted (which implies :latest), in which case the default policy is Always. IfNotPresent causes the kubelet to skip pulling an image that already exists on the node. For images loaded into a node to work as expected, either avoid the :latest tag or explicitly set imagePullPolicy: IfNotPresent or imagePullPolicy: Never on the container(s).

In the end, the adjusted YAML from above will look like:

apiVersion: ray.io/v1alpha1
kind: RayService
metadata:
  name: rayservice-dummy
spec:
  serviceUnhealthySecondThreshold: 300
  deploymentUnhealthySecondThreshold: 300
  serveConfigV2: |
    applications:
      - name: app1
        route_prefix: /
        import_path: main:app
        deployments:
          - name: TestAPI
  rayClusterConfig:
    rayVersion: "2.7.0" # Should match Ray version in the containers
    headGroupSpec:
      rayStartParams:
        dashboard-host: "0.0.0.0"
      template:
        spec:
          containers:
            - name: ray-head
              image: custom_ray:latest
              imagePullPolicy: Never
              resources:
                limits:
                  cpu: 2
                  memory: 2Gi
                requests:
                  cpu: 2
                  memory: 2Gi
              ports:
                - containerPort: 6379
                  name: gcs-server
                - containerPort: 8265 # Ray dashboard
                  name: dashboard
                - containerPort: 10001
                  name: client
                - containerPort: 8000
                  name: serve
    workerGroupSpecs:
      - replicas: 1
        minReplicas: 1
        maxReplicas: 1
        groupName: small-group
        rayStartParams: {}
        template:
          spec:
            containers:
              - name: ray-worker
                image: custom_ray:latest
                imagePullPolicy: Never
                lifecycle:
                  preStop:
                    exec:
                      command: ["/bin/sh", "-c", "ray stop"]
                resources:
                  limits:
                    cpu: "1"
                    memory: "2Gi"
                  requests:
                    cpu: "500m"
                    memory: "2Gi"
