Configure GPU Workers#

This guide explains how to assign one or more GPUs to Dioptra worker containers for GPU-accelerated machine learning workloads.

Prerequisites#

Note

GPU workers are configured during template creation using the num_tensorflow_gpu_workers and num_pytorch_gpu_workers variables. By default, each GPU worker is assigned a dedicated GPU.

GPU Configuration#

Step 1: Verify GPU Availability#

Verify that the host machine has GPUs available and the NVIDIA drivers are installed:

nvidia-smi

This should display information about your GPU(s). Note the GPU indices (0, 1, 2, etc.) for use in configuration.
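If you want to capture the indices programmatically, nvidia-smi's CSV query mode (`nvidia-smi --query-gpu=index,name --format=csv,noheader`) is easy to parse. The following is an illustrative sketch; the sample output is invented, not from a real run:

```python
# Parse output in the shape produced by:
#   nvidia-smi --query-gpu=index,name --format=csv,noheader
# The sample text below is illustrative, not from a real host.
sample_output = """0, NVIDIA A100-SXM4-40GB
1, NVIDIA A100-SXM4-40GB"""

def parse_gpu_indices(csv_text: str) -> list[int]:
    """Extract the GPU index (first CSV field) from each row."""
    indices = []
    for line in csv_text.strip().splitlines():
        index_field = line.split(",")[0].strip()
        indices.append(int(index_field))
    return indices

print(parse_gpu_indices(sample_output))  # [0, 1]
```

These indices are the values you will use for NVIDIA_VISIBLE_DEVICES in the next step.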

Step 2: Assign GPUs to Workers#

To customize GPU assignments beyond the default one-GPU-per-worker configuration, open docker-compose.override.yml and add override blocks for the GPU worker services. GPU worker service names include tfgpu or pytorchgpu.

Note

In the examples below, replace <deployment-name> with your deployment’s slugified name (default: dioptra-deployment).

To assign specific GPUs to a worker:

services:
  <deployment-name>-tfgpu-01:
    environment:
      NVIDIA_VISIBLE_DEVICES: 0

  <deployment-name>-pytorchgpu-01:
    environment:
      NVIDIA_VISIBLE_DEVICES: 1

To assign multiple GPUs to a single worker:

services:
  <deployment-name>-tfgpu-01:
    environment:
      NVIDIA_VISIBLE_DEVICES: 0,1

To allow a worker to use all available GPUs:

services:
  <deployment-name>-tfgpu-01:
    environment:
      NVIDIA_VISIBLE_DEVICES: all
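The three settings above differ only in how the NVIDIA_VISIBLE_DEVICES value expands to GPU indices. As a rough sketch of that expansion (the resolve_visible_devices helper is hypothetical; the real interpretation is performed by the NVIDIA container runtime, not by Dioptra):

```python
def resolve_visible_devices(value: str, host_gpu_count: int) -> list[int]:
    """Expand an NVIDIA_VISIBLE_DEVICES value into concrete GPU indices.

    Illustrative only: the NVIDIA container runtime does the real
    interpretation; this just mirrors the common cases.
    """
    if value == "all":
        return list(range(host_gpu_count))
    if value in ("none", ""):
        return []
    return [int(part.strip()) for part in value.split(",")]

# On a host with 4 GPUs:
print(resolve_visible_devices("0", 4))    # [0]
print(resolve_visible_devices("0,1", 4))  # [0, 1]
print(resolve_visible_devices("all", 4))  # [0, 1, 2, 3]
```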

Warning

The combined number of TensorFlow and PyTorch GPU workers must not exceed the number of GPUs available on the host machine (unless you assign the same GPU to multiple workers, which may cause resource contention).
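The constraint in the warning can be checked mechanically before you restart the deployment. The helper below is hypothetical (not part of the Dioptra tooling) and flags both over-subscription and GPUs shared between workers:

```python
from collections import Counter

def check_gpu_assignments(
    assignments: dict[str, list[int]], host_gpu_count: int
) -> list[str]:
    """Return warnings for over-subscription or shared GPUs.

    assignments maps worker name -> assigned GPU indices.
    Hypothetical helper for illustration only.
    """
    problems = []
    if len(assignments) > host_gpu_count:
        problems.append("more GPU workers than GPUs on the host")
    counts = Counter(idx for gpus in assignments.values() for idx in gpus)
    for idx, n in sorted(counts.items()):
        if n > 1:
            problems.append(f"GPU {idx} is assigned to {n} workers (possible contention)")
    return problems

# Two workers sharing GPU 0 on a 2-GPU host:
print(check_gpu_assignments({"tfgpu-01": [0], "pytorchgpu-01": [0]}, 2))
# ['GPU 0 is assigned to 2 workers (possible contention)']
```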

Step 3: Restart the Deployment#

Apply your changes by restarting the deployment:

docker compose down
docker compose up -d

Verify that the GPU workers started correctly:

docker compose ps

See Also#