Mount Data Volumes#

This guide explains how to mount data volumes (host directories or NFS shares) into Dioptra worker containers for accessing datasets and other artifacts.

Overview#

The docker-compose.yml file generated by the cookiecutter template supports mounting a single datasets directory from the host machine into worker containers via the datasets_directory variable. For more advanced configurations, use the docker-compose.override.yml file.

Common reasons for mounting additional folders:

  1. Your datasets are stored in a folder on your host machine or in an NFS share

  2. You want to make other artifacts available to the worker containers, such as pre-trained models

Option A: Mount a Host Directory#

Step A1: Verify Directory Permissions#

Worker containers run as a non-root user, so the directory and all of its files must be world-readable:

find <host-data-path> -type d -print0 | xargs -0 chmod o=rx
find <host-data-path> -type f -print0 | xargs -0 chmod o=r

Note

Replace <host-data-path> with the absolute path to your data directory on the host machine (e.g., /home/data, /mnt/datasets).
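As a quick sanity check, the sketch below applies the same commands to a throwaway temporary directory (standing in for <host-data-path>) and then searches for anything still not world-readable; empty output from the final find means the permissions are correct:

```shell
# Create a stand-in for <host-data-path> with deliberately restrictive permissions
data_dir=$(mktemp -d)
mkdir -p "$data_dir/images"
echo "sample" > "$data_dir/images/cat.png"
chmod 700 "$data_dir/images"
chmod 600 "$data_dir/images/cat.png"

# Apply the permission fixes from this step
find "$data_dir" -type d -print0 | xargs -0 chmod o=rx
find "$data_dir" -type f -print0 | xargs -0 chmod o=r

# Report anything still not world-readable; no output means success
find "$data_dir" \( -type d ! -perm -o=rx \) -o -type f ! -perm -o=r
```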

Step A2: Add Volume Mount to Workers#

Open docker-compose.override.yml in a text editor and add a block for each worker container that needs access to the data. Worker container names include tfcpu, tfgpu, pytorchcpu, and pytorchgpu.

This example mounts the host data directory to /dioptra/data in the container as read-only:

services:
  <deployment-name>-tfcpu-01:
    volumes:
      - "<host-data-path>:/dioptra/data:ro"

Note

Replace <deployment-name> with your deployment’s slugified name (default: dioptra-deployment) and <host-data-path> with the absolute path to your data directory.

Repeat for each worker container that needs access to the data.
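For instance, assuming the default deployment name (dioptra-deployment) and a hypothetical host path of /home/data, an override file granting both a TensorFlow and a PyTorch worker read-only access might look like:

```yaml
# docker-compose.override.yml (illustrative names and paths; adjust to your deployment)
services:
  dioptra-deployment-tfcpu-01:
    volumes:
      - "/home/data:/dioptra/data:ro"
  dioptra-deployment-pytorchcpu-01:
    volumes:
      - "/home/data:/dioptra/data:ro"
```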

Option B: Mount an NFS Share#

Step B1: Define the NFS Volume#

Open docker-compose.override.yml and add a top-level volumes: section (not nested under services:) with a named NFS volume definition:

volumes:
  dioptra-data:
    driver: local
    driver_opts:
      type: nfs
      o: "addr=<nfs-server-ip>,auto,rw,bg,nfsvers=4,intr,actimeo=1800"
      device: ":<exported-directory>"

Note

Replace <nfs-server-ip> with your NFS server’s IP address and <exported-directory> with the path to the exported directory on the NFS server.
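Filled in with illustrative values (the server address and export path below are hypothetical), the definition might read:

```yaml
# Top level of docker-compose.override.yml (not nested under services:)
volumes:
  dioptra-data:
    driver: local
    driver_opts:
      type: nfs
      o: "addr=192.0.2.10,auto,rw,bg,nfsvers=4,intr,actimeo=1800"
      device: ":/srv/nfs/dioptra-data"
```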

Step B2: Ensure Files Are World-Readable#

Worker containers run as a non-root user and require read access to the data files. The files and directories on the NFS share must be world-readable (o=r for files, o=rx for directories).

How you set these permissions depends on your access level:

If you have shell access to the NFS server:

Run the chmod commands directly on the server where the files are stored:

find <nfs-export-path> -type d -print0 | xargs -0 chmod o=rx
find <nfs-export-path> -type f -print0 | xargs -0 chmod o=r

Note

Replace <nfs-export-path> with the path to the exported directory on the NFS server (e.g., /srv/nfs/dioptra-data).

If the NFS share is mounted on a system where you have write access:

Run the chmod commands on the mounted path:

find <nfs-mount-path> -type d -print0 | xargs -0 chmod o=rx
find <nfs-mount-path> -type f -print0 | xargs -0 chmod o=r

Note

Replace <nfs-mount-path> with the local mount point of the NFS share (e.g., /mnt/nfs-data).

If you do not have write access to the files:

Coordinate with your system administrator to set the appropriate permissions on the NFS share.

Step B3: Add Volume Mount to Workers#

Add service blocks for each worker container that needs access to the data, using the named volume:

services:
  <deployment-name>-tfcpu-01:
    volumes:
      - "dioptra-data:/dioptra/data:ro"

Note

Replace <deployment-name> with your deployment’s slugified name (default: dioptra-deployment).

The :ro suffix mounts the NFS share as read-only to prevent jobs from accidentally modifying or deleting data.

Repeat for each worker container that needs access to the data.
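Putting Steps B1 and B3 together, a complete docker-compose.override.yml for an NFS-backed setup (using hypothetical server values and the default deployment name) might look like:

```yaml
# docker-compose.override.yml (illustrative values; adjust to your deployment)
services:
  dioptra-deployment-tfcpu-01:
    volumes:
      - "dioptra-data:/dioptra/data:ro"
  dioptra-deployment-pytorchcpu-01:
    volumes:
      - "dioptra-data:/dioptra/data:ro"

volumes:
  dioptra-data:
    driver: local
    driver_opts:
      type: nfs
      o: "addr=192.0.2.10,auto,rw,bg,nfsvers=4,intr,actimeo=1800"
      device: ":/srv/nfs/dioptra-data"
```

After editing, running docker compose config prints the merged configuration, which is a quick way to confirm that the override file is being picked up.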

See Also#