Docker Images List and Settings#

Note

See the Glossary for the meaning of the acronyms used in this guide.

Nginx#

Nginx is an open-source web server that serves as a reverse proxy in the Testbed architecture. It receives HTTP requests originating from outside the Testbed network and routes the traffic to the appropriate service.

Command#

Usage: docker run --rm -it IMAGE [OPTIONS]

Options

--ai-lab-host

AI Lab Service host (default: 'restapi')

--ai-lab-port

AI Lab Service port (default: '5000')

--mlflow-tracking-host

MLFlow Tracking service host (default: 'mlflow-tracking')

--mlflow-tracking-port

MLFlow Tracking service port (default: '5000')

--nginx-lab-port

Nginx listening port for traffic routed to the AI Lab (REST API) service (default: '30080')

--nginx-mlflow-port

Nginx listening port for traffic routed to the MLFlow Tracking service (default: '35000')
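As an illustrative sketch of how these options fit together, the following launches the proxy on a shared Docker network. The image name dioptra/nginx, the network name testbed, and the service hostnames are assumptions for this example, not fixed names; substitute the values used in your own deployment.

```sh
# Hypothetical image and network names; defaults shown explicitly for clarity.
docker run --rm -it \
  --network testbed \
  -p 30080:30080 -p 35000:35000 \
  dioptra/nginx \
  --ai-lab-host restapi --ai-lab-port 5000 \
  --mlflow-tracking-host mlflow-tracking --mlflow-tracking-port 5000 \
  --nginx-lab-port 30080 --nginx-mlflow-port 35000
```

The two published ports are the only entry points external clients need; all other services stay reachable only on the internal Docker network.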

MLFlow Tracking#

The MLFlow Tracking service is an API and UI for logging and querying parameters, metrics, and output files when running your experiments.

Environment Variables#

AWS_ACCESS_KEY_ID

The username for accessing S3 storage. Must match MINIO_ROOT_USER set for the Minio image.

AWS_SECRET_ACCESS_KEY

The password for accessing S3 storage. Must match MINIO_ROOT_PASSWORD set for the Minio image.

MLFLOW_S3_ENDPOINT_URL

The URL endpoint for accessing the S3 storage.

Command#

Usage: docker run --rm -it IMAGE [OPTIONS]

Options

--conda-env

Conda environment (default: 'dioptra')

--backend-store-uri

URI to which to persist experiment and run data. Acceptable URIs are SQLAlchemy-compatible database connection strings (e.g. 'sqlite:///path/to/file.db') or local filesystem URIs (e.g. 'file:///absolute/path/to/directory'). (default: 'sqlite:////work/mlruns/mlflow-tracking.db')

--default-artifact-root

Local or S3 URI for storing artifacts for new experiments. Note that this flag does not affect already-created experiments. If a file:/ URI is used for the backend store, artifacts default to a location within that file store; if a SQL backend is used, this option is required. (default: 'file:///work/artifacts')

--gunicorn-opts

Additional command line options forwarded to gunicorn processes. (no default)

--host

The network address to listen on. Use 0.0.0.0 to bind to all addresses if you want to access the tracking server from other machines. (default: '0.0.0.0')

--port

The port to listen on. (default: '5000')

--workers

Number of gunicorn worker processes to handle requests. (default: '4')
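A minimal sketch combining the environment variables and options above might look like the following. The image name dioptra/mlflow-tracking, the network name testbed, the credentials, and the minio hostname and port are assumptions for this example; the AWS_* values must match the MINIO_ROOT_USER and MINIO_ROOT_PASSWORD set on the Minio container.

```sh
# Hypothetical image name, network, and credentials; substitute your own.
docker run --rm -it \
  --network testbed \
  -e AWS_ACCESS_KEY_ID=minio \
  -e AWS_SECRET_ACCESS_KEY=minio-password \
  -e MLFLOW_S3_ENDPOINT_URL=http://minio:9000 \
  -v mlruns:/work/mlruns \
  dioptra/mlflow-tracking \
  --backend-store-uri sqlite:////work/mlruns/mlflow-tracking.db \
  --default-artifact-root file:///work/artifacts \
  --host 0.0.0.0 --port 5000 --workers 4
```

Mounting a named volume at /work/mlruns keeps the SQLite backend store across container restarts.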

REST API#

The REST API service is an API for registering experiments and submitting jobs to the Testbed.

Environment Variables#

DIOPTRA_RESTAPI_DATABASE_URI

The URI to use to connect to the REST API database. (default: '$(pwd)/dioptra.db')

DIOPTRA_RESTAPI_ENV

Selects a set of configurations for the Flask app to use. Must be 'prod', 'dev', or 'test'. (default: 'prod')

DIOPTRA_DEPLOY_SECRET_KEY

Secret key used by Flask to sign cookies. While cookies are not used when accessing the REST API, per best practices this should still be changed to a long, random value. (default: 'deploy123')

AWS_ACCESS_KEY_ID

The username for accessing S3 storage. Must match MINIO_ROOT_USER set for the Minio image.

AWS_SECRET_ACCESS_KEY

The password for accessing S3 storage. Must match MINIO_ROOT_PASSWORD set for the Minio image.

MLFLOW_TRACKING_URI

The URI to use for connecting to the MLFlow Tracking service.

MLFLOW_S3_ENDPOINT_URL

The URL endpoint for accessing the S3 storage.

RQ_REDIS_URI

The redis:// URI to the Redis queue.

Command#

Usage: docker run --rm -it IMAGE [OPTIONS]

Options

--app-module

Application module (default: 'wsgi:app')

--backend

Server backend (default: 'gunicorn')

--conda-env

Conda environment (default: 'dioptra')

--gunicorn-module

Python module used to start Gunicorn WSGI server (default: 'dioptra.restapi.cli.gunicorn')
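The following sketch wires the REST API to the other services via its environment variables. The image name dioptra/restapi, the network name testbed, the credentials, and the service hostnames are assumptions for this example; the AWS_* values must match those set for Minio, and the secret key should be replaced with your own long, random value.

```sh
# Hypothetical image name, network, and credentials; substitute your own.
docker run --rm -it \
  --network testbed \
  -e DIOPTRA_RESTAPI_ENV=prod \
  -e DIOPTRA_DEPLOY_SECRET_KEY="$(openssl rand -hex 32)" \
  -e AWS_ACCESS_KEY_ID=minio \
  -e AWS_SECRET_ACCESS_KEY=minio-password \
  -e MLFLOW_TRACKING_URI=http://mlflow-tracking:5000 \
  -e MLFLOW_S3_ENDPOINT_URL=http://minio:9000 \
  -e RQ_REDIS_URI=redis://redis:6379/0 \
  dioptra/restapi \
  --backend gunicorn --app-module wsgi:app
```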

Workers (PyTorch/TensorFlow)#

A Testbed Worker is a managed process within a Docker container that watches one or more Redis Queues for new jobs to handle. The Testbed Workers come in different flavors, with each one provisioned to support running jobs for different types of machine learning libraries.

Environment Variables#

DIOPTRA_PLUGIN_DIR

Directory to use for syncing the task plugins. (default: '/work/plugins')

DIOPTRA_PLUGINS_S3_URI

The S3 URI to the directory containing the built-in plugins.

DIOPTRA_RESTAPI_DATABASE_URI

The URI to use to connect to the REST API database. (default: '$(pwd)/dioptra.db')

AWS_ACCESS_KEY_ID

The username for accessing S3 storage. Must match MINIO_ROOT_USER set for the Minio image.

AWS_SECRET_ACCESS_KEY

The password for accessing S3 storage. Must match MINIO_ROOT_PASSWORD set for the Minio image.

MLFLOW_TRACKING_URI

The URI to use for connecting to the MLFlow Tracking service.

MLFLOW_S3_ENDPOINT_URL

The URL endpoint for accessing the S3 storage.

RQ_REDIS_URI

The redis:// URI to the Redis queue.

Command#

Usage: docker run --rm -it IMAGE [OPTIONS] [ARGS]...

Positional Arguments

...

Queues to watch

Options

--conda-env

Conda environment (default: 'dioptra')

--results-ttl

Job results will be kept for this number of seconds (default: '500')

--rq-worker-module

Python module used to start the RQ Worker (default: 'dioptra.rq.cli.rq')
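Putting the environment variables, options, and positional queue arguments together, a worker launch might be sketched as follows. The image name dioptra/pytorch-cpu, the network name testbed, the credentials, the S3 plugin URI, and the queue name pytorch_cpu are assumptions for this example; pass whichever queue names your deployment uses.

```sh
# Hypothetical image, network, credentials, and queue name; substitute your own.
docker run --rm -it \
  --network testbed \
  -e DIOPTRA_PLUGIN_DIR=/work/plugins \
  -e AWS_ACCESS_KEY_ID=minio \
  -e AWS_SECRET_ACCESS_KEY=minio-password \
  -e MLFLOW_TRACKING_URI=http://mlflow-tracking:5000 \
  -e MLFLOW_S3_ENDPOINT_URL=http://minio:9000 \
  -e RQ_REDIS_URI=redis://redis:6379/0 \
  dioptra/pytorch-cpu \
  --results-ttl 500 \
  pytorch_cpu
```

A worker blocks on the listed Redis queues and picks up jobs in the order they were submitted; run one container per flavor of worker you need.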

Minio#

Vendor image: https://hub.docker.com/r/minio/minio/

The Minio service provides distributed, S3-compatible storage for the Testbed.

Command#

Usage: docker run --rm -it IMAGE server [ARGS]...

Positional Arguments

...

A list of paths to data storage locations. For a single-machine deployment, the path should point to a bind-mounted directory or Docker volume, e.g. /data. For a distributed deployment, pass a list of URLs instead, e.g. http://minio{1...4}/data{1...2}. The ellipses syntax {1...4} expands into a list of URLs at runtime.

Environment Variables#

MINIO_ROOT_USER

Sets the username for logging into the Minio deployment.

MINIO_ROOT_PASSWORD

Sets the password for logging into the Minio deployment.
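For a single-machine deployment using the vendor image, a minimal sketch is shown below. The network name testbed, the credentials, and the volume name minio-data are assumptions for this example; the credentials chosen here are the ones the MLFlow Tracking, REST API, and Worker containers must reuse as AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY.

```sh
# Hypothetical network, credentials, and volume name; substitute your own.
docker run --rm -it \
  --network testbed \
  -e MINIO_ROOT_USER=minio \
  -e MINIO_ROOT_PASSWORD=minio-password \
  -v minio-data:/data \
  minio/minio server /data
```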

Redis#

Vendor image: https://hub.docker.com/_/redis

The Redis service is a fast in-memory database that is leveraged as a centralized queue to delegate jobs to the Testbed Workers.

Command#

Usage: docker run --rm -it IMAGE redis-server [OPTIONS]

Options

--appendonly

Persist data using an append-only file. Accepts 'yes' or 'no'. (default: 'no')

--appendfilename

The name of the append-only file. (default: 'appendonly.aof')
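A minimal sketch of launching the vendor image with append-only persistence enabled is shown below. The network name testbed and the volume name redis-data are assumptions for this example; mounting a volume at /data lets the append-only file survive container restarts.

```sh
# Hypothetical network and volume names; substitute your own.
docker run --rm -it \
  --network testbed \
  -v redis-data:/data \
  redis redis-server --appendonly yes --appendfilename appendonly.aof
```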