# Submission

Every solution submitted for evaluation must be containerized via Singularity (see this Singularity tutorial).

The submitted Singularity containers will be run by the NIST test and evaluation server using the specified API (see Container API (Round 5)) inside of a virtual machine which has no network capability.

The container submitted to NIST for evaluation must perform trojan detection for a single trained AI model file and output a single probability that the model was poisoned. The test and evaluation infrastructure will iterate over the N models for which your container must predict trojan presence.

Each test data point lacking an output poisoning probability (for example, because you ran out of compute time) will be treated as having probability 0.5 when computing your overall cross entropy loss for the test dataset. Your output logs (see Output Files) will indicate which data points were missing results and had the default value substituted.
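
To illustrate how a missing prediction affects the score, here is a minimal sketch of an average binary cross entropy with the 0.5 default substituted. The function name, the use of `None` to mark a missing result, the natural logarithm, and the clipping constant `eps` are assumptions of this sketch, not details of the NIST scoring code.

```python
import math

DEFAULT_PROBABILITY = 0.5  # value substituted for missing predictions, per the text above


def average_cross_entropy(predictions, targets, eps=1e-12):
    """Mean binary cross entropy; a prediction of None counts as missing."""
    total = 0.0
    for p, y in zip(predictions, targets):
        if p is None:
            p = DEFAULT_PROBABILITY
        # Clip so that log(0) never occurs; eps is an assumption of this sketch.
        p = min(max(p, eps), 1.0 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(targets)


# Three models: one poisoned, one clean, and one the container never scored.
loss = average_cross_entropy([0.9, 0.2, None], [1, 0, 1])
print(round(loss, 4))  # 0.3406
```

Under these assumptions each missing prediction contributes ln 2 (about 0.693) to the sum, regardless of the ground truth.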

When you submit a container for evaluation, it will be added to a processing queue. When it’s your container’s turn to be run, the container will be copied into a non-networked VM (Ubuntu 18.04 LTS) instance populated with read-only test data and a 1.5 TB SATA SSD read/write scratch space drive. The NIST test and evaluation harness will iterate over all elements of the sequestered test dataset and call your container once per data point. Each data point is a trained AI model. Execution will terminate either when the compute time limit is reached or when all the test data has been processed, whichever comes first. After your container terminates, NIST will compute the average cross entropy loss between your predictions and the ground truth answers. This score is then posted to the leaderboard website (see Results).

## Container Submission

Containers are to be submitted for evaluation by sharing them with a functional NIST Google Drive account (trojai@nist.gov) via a team Google Drive account. If you want multiple people to be able to submit for your team, it might be a good idea to create a new shared Google Drive account that all submitters can access.

1. Package your solution into a Singularity container.
2. Upload your packaged Singularity container to Google Drive using the account you registered with the NIST T&E Team.
• Files from a non-registered email address will be ignored.

• Container names that start with ‘test’ will be evaluated on the Smoke Test Server.

• Container names that do not start with ‘test’ will be evaluated on the Evaluation Server.

• You can share only one file per server, so your Drive account can have at most two files shared with the TrojAI Drive account: one whose name starts with ‘test’ and one whose name does not. This file count restriction allows the servers to be as file name agnostic as possible.

3. Share your container with the TrojAI Google Drive account (trojai@nist.gov).
4. Your container is now visible to the NIST trojai Drive account.
• Every few minutes (less than 15 minutes) the test and evaluation server will poll the NIST trojai@nist.gov Google Drive account for new submissions.

5. When your submission is detected, your container will be added to the evaluation queue.
• Your container will run as soon as resources are available after your timeout window has passed. The timeout window is used to rate limit submissions. If you overwrite the submitted container while your previous job is still in the queue (but has not yet been executed), the most recent container will be evaluated instead of the one that existed when the submission entered the queue. If the container file_id changes between the original submission and execution, the log file will note which file was actually executed.
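
The naming rule from step 2 can be checked locally before sharing files. The sketch below is only an illustration: the function names and the `.simg` file names are hypothetical, and treating the ‘test’ prefix as case-sensitive is an assumption.

```python
from collections import Counter


def target_server(container_name):
    """Return 'STS' (Smoke Test Server) or 'ES' (Evaluation Server) for a file name."""
    # Assumption: the 'test' prefix is matched case-sensitively.
    return "STS" if container_name.startswith("test") else "ES"


def servers_with_too_many_files(container_names):
    """Return the servers that would receive more than one shared file."""
    counts = Counter(target_server(name) for name in container_names)
    return sorted(server for server, count in counts.items() if count > 1)


print(target_server("test_detector.simg"))                            # STS
print(servers_with_too_many_files(["test_a.simg", "detector.simg"]))  # []
print(servers_with_too_many_files(["a.simg", "b.simg"]))              # ['ES']
```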

## Container API (Round 5)

Containers submitted to the NIST Test and Evaluation server will be launched using the following API.

• --model_filepath = The path to the model file to be evaluated.

• --cls_token_is_first = Whether the first embedding token (rather than the last token) should be used as the summary of the text sequence.

• --tokenizer_filepath = File path to the PyTorch model (.pt) file containing the correct tokenizer to be used with the model_filepath.

• --embedding_filepath = File path to the PyTorch model (.pt) file containing the correct embedding to be used with the model_filepath.

• --result_filepath = The path to the output result file where the probability [0, 1] (floating point value in the range of 0 to 1 inclusive) of the aforementioned model file being poisoned is to be written as text (not binary). For example, “0.75”. No other data should be written to this file. If the test server cannot parse your results file, the default probability of 0.5 will be substituted. If any parse errors occur, they will be listed on the leaderboard webpage.

• --scratch_dirpath = The path to a directory (empty folder) where temporary data can be written during the evaluation of the model file.

• --examples_dirpath = The path to a directory containing a few example text files for each of the classes the model is trained to classify. Names are of the format “class_2_example_35.txt”.
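
As a rough illustration of the arguments listed above, a container entry point could parse them with `argparse` and write its answer to the result file. This is a sketch, not the reference implementation: the helper names are made up, the constant 0.5 stands in for a real detector, and treating `--cls_token_is_first` as a bare flag is an assumption (the harness may pass an explicit value instead).

```python
import argparse
import os
import tempfile


def build_parser():
    parser = argparse.ArgumentParser(description="Trojan detector entry point (sketch)")
    parser.add_argument("--model_filepath", type=str, required=True)
    # Assumption: passed as a bare flag with no value.
    parser.add_argument("--cls_token_is_first", action="store_true")
    parser.add_argument("--tokenizer_filepath", type=str, required=True)
    parser.add_argument("--embedding_filepath", type=str, required=True)
    parser.add_argument("--result_filepath", type=str, required=True)
    parser.add_argument("--scratch_dirpath", type=str, required=True)
    parser.add_argument("--examples_dirpath", type=str, required=True)
    return parser


def write_result(result_filepath, probability):
    """Write a single probability in [0, 1] as plain text and nothing else."""
    probability = min(max(float(probability), 0.0), 1.0)
    with open(result_filepath, "w") as fh:
        fh.write(str(probability))


def main(argv=None):
    args = build_parser().parse_args(argv)
    # Placeholder: a real detector would load args.model_filepath and analyze it here.
    trojan_probability = 0.5
    write_result(args.result_filepath, trojan_probability)


# Local dry run with hypothetical paths; only the result file is actually touched.
with tempfile.TemporaryDirectory() as tmp:
    result_path = os.path.join(tmp, "result.txt")
    main([
        "--model_filepath", "model.pt",
        "--tokenizer_filepath", "tokenizer.pt",
        "--embedding_filepath", "embedding.pt",
        "--result_filepath", result_path,
        "--scratch_dirpath", tmp,
        "--examples_dirpath", tmp,
    ])
    with open(result_path) as fh:
        written = fh.read()
print(written)  # 0.5
```

Clamping the value before writing keeps the file within the documented [0, 1] range even if the detector produces a slightly out-of-range score.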

## Container API (Round 1-4)

Containers submitted to the NIST Test and Evaluation server will be launched using the following API.

• --model_filepath = The path to the model file to be evaluated.

• --result_filepath = The path to the output result file where the probability [0, 1] (floating point value in the range of 0 to 1 inclusive) of the aforementioned model file being poisoned is to be written as text (not binary). For example, “0.75”. No other data should be written to this file. If the test server cannot parse your results file, the default probability of 0.5 will be substituted. If any parse errors occur, they will be listed on the leaderboard webpage.

• --scratch_dirpath = The path to a directory (empty folder) where temporary data can be written during the evaluation of the model file.

• --examples_dirpath = The path to a directory containing a few example png images for each of the classes the model is trained to classify. Names are of the format “class_2_example_35.png”.
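
The example file names encode the class and the example number, so they can be grouped by class before use. The sketch below infers the pattern from the two names quoted in this document (“class_2_example_35.png” and “class_2_example_35.txt”); the regular expression and the function name are assumptions, and real directories may contain other files.

```python
import re
from collections import defaultdict

# Pattern inferred from names such as "class_2_example_35.png".
EXAMPLE_NAME = re.compile(r"^class_(\d+)_example_(\d+)\.(png|txt)$")


def group_examples_by_class(filenames):
    """Map class id -> sorted list of (example id, file name); skip other names."""
    groups = defaultdict(list)
    for name in filenames:
        match = EXAMPLE_NAME.match(name)
        if match:
            groups[int(match.group(1))].append((int(match.group(2)), name))
    return {cls: sorted(items) for cls, items in sorted(groups.items())}


names = [
    "class_2_example_35.png",
    "class_0_example_1.png",
    "class_2_example_4.png",
    "README.md",
]
grouped = group_examples_by_class(names)
print(grouped)
# {0: [(1, 'class_0_example_1.png')], 2: [(4, 'class_2_example_4.png'), (35, 'class_2_example_35.png')]}
```

In a container, the list of names would typically come from `os.listdir` on the directory passed as `--examples_dirpath`.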

## Output Files

When your submission is executed, the output logs are uploaded to the TrojAI NIST Google Drive upon completion. That log file is then shared with just your team email (the logs are not posted publicly).

Note: when running on the ES, all logging from your Singularity container will be suppressed to prevent data exfiltration from the sequestered dataset.

The log will be named <team name>.sts.log.txt or <team name>.es.log.txt depending on which server the job ran on (Smoke Test Server = STS, or Evaluation Server = ES).

Additionally, the full confusion matrix resulting from sweeping the detection threshold over [0.0, 1.0] with a step of 0.01 will be uploaded and shared with your submitting email. This confusion matrix can be plotted as a ROC curve, with FPR on the x axis and TPR on the y axis, and turned into the ROC AUC value using:

    import sklearn.metrics
    # FPR and TPR are the false and true positive rates from the threshold sweep
    roc_auc = sklearn.metrics.auc(FPR, TPR)
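
For readers who want to see what goes into that call, the sketch below derives FPR and TPR from per-threshold confusion counts and integrates them with the trapezoidal rule, which is the rule `sklearn.metrics.auc` applies. The (TP, FP, FN, TN) row layout and the example counts are hypothetical; the actual layout of the uploaded confusion matrix file may differ.

```python
def roc_points(rows):
    """rows: iterable of (TP, FP, FN, TN) counts, one tuple per threshold."""
    fpr, tpr = [], []
    for tp, fp, fn, tn in rows:
        tpr.append(tp / (tp + fn) if (tp + fn) else 0.0)
        fpr.append(fp / (fp + tn) if (fp + tn) else 0.0)
    return fpr, tpr


def trapezoid_auc(fpr, tpr):
    """Area under the curve by the trapezoidal rule, after sorting by FPR."""
    points = sorted(zip(fpr, tpr))
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Hypothetical counts at thresholds 0.0, 0.5 and 1.0 for 4 poisoned and 4 clean models.
rows = [(4, 4, 0, 0), (3, 1, 1, 3), (0, 0, 4, 4)]
fpr, tpr = roc_points(rows)
roc_auc = trapezoid_auc(fpr, tpr)
print(fpr, tpr, roc_auc)  # [1.0, 0.25, 0.0] [1.0, 0.75, 0.0] 0.75
```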


The log and confusion matrix files will be overwritten by subsequent submission evaluation runs. If you want a persistent copy, download and rename the files from Google Drive before your next submission.