# Submission

Every solution submitted for evaluation must be containerized with Singularity (see the Singularity tutorial).

The submitted Singularity containers will be run by the NIST test and evaluation server using the specified API (see Container API (Round 10)) inside a virtual machine that has no network access.

The container submitted to NIST for evaluation must perform trojan detection for a single trained AI model file and output a single probability that the model was poisoned. The test and evaluation infrastructure will iterate over the N models for which your container must predict trojan presence.

Each test data point lacking an output poisoning probability (for example, if you ran out of compute time) will be assigned a probability of 0.5 when computing your overall cross entropy loss on the test dataset. Your output logs (see Output Files) will indicate which data points were missing results and had the default value substituted.
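The scoring behavior described above can be sketched in a few lines of Python. This is a simplified illustration, not the actual harness code; the function name and the natural-log convention are assumptions:

```python
import math

def average_cross_entropy(predictions, ground_truth, default=0.5, eps=1e-12):
    """Average binary cross entropy; missing predictions fall back to `default`.

    predictions: dict mapping model id -> predicted poisoning probability
    ground_truth: dict mapping model id -> 1 (poisoned) or 0 (clean)
    """
    total = 0.0
    for model_id, label in ground_truth.items():
        p = predictions.get(model_id, default)  # missing result -> 0.5
        p = min(max(p, eps), 1.0 - eps)         # guard against log(0)
        total += -(label * math.log(p) + (1 - label) * math.log(1 - p))
    return total / len(ground_truth)
```

Note that a detector that emits no results at all scores ln(2) ≈ 0.693 under this scheme, so any informative prediction should beat that baseline.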

When you submit a container for evaluation, it will be added to a processing queue. When it's your container's turn to be run, the container will be copied into a non-networked VM instance (Ubuntu 20.04 LTS) populated with read-only test data and a 1.5 TB SATA SSD read/write scratch drive. The NIST test and evaluation harness will iterate over all elements of the sequestered test dataset and call your container once per data point; each data point is a trained AI model. Execution terminates either when the compute time limit is reached or when all test data has been processed, whichever comes first. After your container terminates, NIST will compute the average cross entropy loss between your predictions and the ground truth answers. This score is then posted to the leaderboard website (see Results).

## Container Submission

Containers are submitted for evaluation by sharing them with a functional NIST Google Drive account (trojai@nist.gov) from a team Google Drive account. If multiple people on your team need to submit, consider creating a shared team Google Drive account that all submitters can access.

1. Package your solution into a Singularity container.
2. Upload your packaged Singularity container to Google Drive using the account you registered with the NIST T&E Team.
• Files from a non-registered email address will be ignored.

• The leaderboard name is determined by the first tab on pages.nist.gov/trojai.

• The data split is shown in a second tab above the Teams/Jobs table.

• The required filename formats for each leaderboard and data split are shown above each Teams/Jobs table.

• You can only share one file per leaderboard and data split.

3. Share the uploaded container with the NIST account (trojai@nist.gov).
4. Your container is now visible to the NIST trojai Drive account.
• The test and evaluation server polls the NIST trojai@nist.gov Google Drive account for new submissions every few minutes (no more than 15 minutes between polls).

• Your container will run as soon as resources are available after your timeout window has passed; the timeout window is used to rate-limit submissions. If you overwrite the submitted container while your previous job is still in the queue (but has not yet been executed), your most recent container will be evaluated instead of the container that existed when the submission entered the queue. If the container file_id changes between submission and execution, the log file will note which file was actually executed.

## Configuration

Beginning with Round 9, containers must expose their internal parameters. This requires the implementation of two features.

Containers must load their parameters from configuration files.

One of these files is a metaparameters JSON file, which is meant to contain parameters which are tuned but not learned. The contents of this file must be described by a JSON schema file.

The metaparameters file should contain the parameters that effort was spent tuning, or that the developers intend to be tunable. The goal is to expose these parameters so that sensitivity analysis can be performed on them, and so that other users of the container could, in principle, adjust them to tune performance on their own data.

Containers may also load parameters from files in a learned parameters directory, which are meant to contain parameters that are learned in the configure mode, as described below.

In addition, containers must include the following files, which store the default parameter values. The values in these files are what will be used during testing and evaluation. Note that containers should load parameters from the paths given on the command line, not from these fixed locations.

• /metaparameters.json = Parameters which are tuned but not learned.

• /metaparameters_schema.json = Schema describing metaparameters.json.

• /learned_parameters/ = Directory containing arbitrary files with learned parameters.

Requirements for metaparameters_schema.json and metaparameters.json:

1. All values in the schema must have bounds:

• strings use enum to specify valid options

• integer and number use minimum and maximum

2. The root of the schema must include the following keys:

• $schema

• technique

• technique_description

• technique_changes

• technique_type

• commit_id

• repo_name

The technique_type should be a list with one or more of the following options: Weight Analysis, Trigger Inversion, Attribution Analysis, Jacobian Inspection, Other. If your technique type is missing from these options (or if you use Other), please let us know so that we can include any new technique types.

3. All parameters in metaparameters.json must be present in metaparameters_schema.json. To ensure this check is performed during testing, please include additionalProperties: false in your configuration files. You will also need to include this in any objects in $defs if that is used.

4. If a parameter is specified in metaparameters_schema.json, then it should also be used in metaparameters.json.

We have developed a jsonschema checker utility that is run against your container before it is executed. Please run it before submitting to the leaderboard to verify that your container will pass our checks: https://github.com/usnistgov/trojai-test-harness/blob/master/actor_executor/jsonschema_checker.py

For a working example please see our trojai-example repo: https://github.com/usnistgov/trojai-example

https://github.com/usnistgov/trojai-example/blob/master/metaparameters_schema.json

https://github.com/usnistgov/trojai-example/blob/master/metaparameters.json

Description of required fields:

• technique: A text label indicating which specific trojan detection technique is used within the submitted container. Keep this label the same across containers that share the same general detection technique/pipeline so that T&E can track progress for that technique; it is also used to sort different techniques submitted by the same team. E.g., "K-Arm Bandit Detection Pipeline"

• technique_description: A short description of how the technique works and what it leverages for trojan detection. E.g., "This technique uses differences in the model attention weights to determine whether a model is trojaned."

• technique_changes: A short blurb indicating what changed from the previous container submitted with the same technique label. Please be specific. E.g., "The technique was updated to allow for better initialization of the triggers for reverse engineering by applying _______ algorithm."

• commit_id: The Git commit ID of the codebase used by the container. If this does not work for your workflow, please message trojai@nist.gov directly to discuss. E.g., 5b078e7fa98ac4c3d1393376d8db1ffa4e976e52

• repo_name: The text name of the repository, for instance the URL to the GitHub repo, i.e., https://github.com/usnistgov/trojai-example

### Configure Mode

Containers must implement a configure mode.

When this mode is active, containers will not detect Trojans. Instead, they will use a provided set of training models to learn/update the set of parameters that will allow the container to detect Trojans in similar models. These parameters will be written to files in the learned parameters directory.

There are two purposes for this mode. The first is that some of the tunable metaparameters may control the way in which the learned parameters are generated, so the learning process must be re-run for these metaparameters to be applied.

The second purpose is that, eventually, Trojan detectors may be completely self-configuring, requiring users not to tune any parameters manually, but only to provide examples of the models on which the detectors will be run. Although we do not expect current submissions to be this robustly self-configuring, this feature is a step toward that goal.

## Container API (Round 10)

Containers submitted to the NIST Test and Evaluation server will be launched using the following API.

• --model_filepath = File path to the pytorch model file to be evaluated.

• --source_dataset_dirpath = File path to a directory containing the original clean dataset into which triggers were injected during training.

• --features_filepath = File path to the file where intermediate detector features may be written. After execution this CSV file should contain two rows: the first row contains the feature names (keep these consistent across your detectors), and the second row contains the value for each column.

• --result_filepath = File path to the file where output result should be written. After execution this file should contain a single line with a single floating point trojan probability.

• --scratch_dirpath = File path to the folder where scratch disk space exists. This folder will be empty at execution start and will be deleted at completion of execution.

• --examples_dirpath = File path to the directory containing JSON file(s) with examples that might be useful for determining whether a model is poisoned.

• --round_training_dataset_dirpath = File path to the directory containing the id-xxxxxxxx models of the current round's training dataset.

• --metaparameters_filepath = Path to the JSON file containing values of tunable parameters to be used when evaluating models.

• --schema_filepath = Path to a schema file in JSON Schema format against which to validate the config file.

• --learned_parameters_dirpath = Path to a directory containing parameter data (model weights, etc.) to be used when evaluating models. If --configure_mode is set, these will instead be overwritten with the newly-configured parameters.

• --configure_mode = Instead of detecting Trojans, set values of tunable parameters and write them to a given location.

• --configure_models_dirpath = Path to a directory containing models to use when in configure mode.

In addition, containers must include the following files which store default values of parameters, as described in the Configuration section. The parameter values in these files are what will be used during testing and evaluation. Note that containers should load these files from the paths given on the command line, not from these locations.

• /metaparameters.json = Parameters which are tuned but not learned.

• /metaparameters_schema.json = Schema describing metaparameters.json.

• /learned_parameters/ = Directory containing arbitrary files with learned parameters.

## Container API (Round 9)

Containers submitted to the NIST Test and Evaluation server will be launched using the following API.

• --model_filepath = File path to the pytorch model file to be evaluated.

• --tokenizer_filepath = File path to the pytorch model (.pt) file containing the correct tokenizer to be used with the model_filepath.

• --features_filepath = File path to the file where intermediate detector features may be written. After execution this CSV file should contain two rows: the first row contains the feature names (keep these consistent across your detectors), and the second row contains the value for each column.

• --result_filepath = File path to the file where output result should be written. After execution this file should contain a single line with a single floating point trojan probability.

• --scratch_dirpath = File path to the folder where scratch disk space exists. This folder will be empty at execution start and will be deleted at completion of execution.

• --examples_dirpath = File path to the directory containing JSON file(s) with examples that might be useful for determining whether a model is poisoned.

• --round_training_dataset_dirpath = File path to the directory containing the id-xxxxxxxx models of the current round's training dataset.

• --metaparameters_filepath = Path to the JSON file containing values of tunable parameters to be used when evaluating models.

• --schema_filepath = Path to a schema file in JSON Schema format against which to validate the config file.

• --learned_parameters_dirpath = Path to a directory containing parameter data (model weights, etc.) to be used when evaluating models. If --configure_mode is set, these will instead be overwritten with the newly-configured parameters.

• --configure_mode = Instead of detecting Trojans, set values of tunable parameters and write them to a given location.

• --configure_models_dirpath = Path to a directory containing models to use when in configure mode.

In addition, containers must include the following files which store default values of parameters, as described in the Configuration section. The parameter values in these files are what will be used during testing and evaluation. Note that containers should load these files from the paths given on the command line, not from these locations.

• /metaparameters.json = Parameters which are tuned but not learned.

• /metaparameters_schema.json = Schema describing metaparameters.json.

• /learned_parameters/ = Directory containing arbitrary files with learned parameters.

## Container API (Round 5-8)

Containers submitted to the NIST Test and Evaluation server will be launched using the following API.

• --model_filepath = The path to the model file to be evaluated.

• --cls_token_is_first = Whether the first or the last embedding token should be used as the summary of the text sequence.

• --tokenizer_filepath = File path to the pytorch model (.pt) file containing the correct tokenizer to be used with the model_filepath.

• --embedding_filepath = File path to the pytorch model (.pt) file containing the correct embedding to be used with the model_filepath.

• --result_filepath = The path to the output result file. The probability that the aforementioned model file is poisoned, a floating point value in the range [0, 1] inclusive, must be written to this file as text (not binary), for example "0.75". No other data should be written to this file. If the test server cannot parse your results file, the default probability of 0.5 will be substituted, and any parse errors will be listed on the leaderboard webpage.

• --scratch_dirpath = The path to a directory (empty folder) where temporary data can be written during the evaluation of the model file.

• --examples_dirpath = The path to a directory containing a few example text files for each of the classes the model is trained to classify. Names are of the format “class_2_example_35.txt”.

## Container API (Round 1-4)

Containers submitted to the NIST Test and Evaluation server will be launched using the following API.

• --model_filepath = The path to the model file to be evaluated.

• --result_filepath = The path to the output result file. The probability that the aforementioned model file is poisoned, a floating point value in the range [0, 1] inclusive, must be written to this file as text (not binary), for example "0.75". No other data should be written to this file. If the test server cannot parse your results file, the default probability of 0.5 will be substituted, and any parse errors will be listed on the leaderboard webpage.

• --scratch_dirpath = The path to a directory (empty folder) where temporary data can be written during the evaluation of the model file.

• --examples_dirpath = The path to a directory containing a few example png images for each of the classes the model is trained to classify. Names are of the format “class_2_example_35.png”.

## Output Files

When your submission finishes executing, the output logs are uploaded to the TrojAI NIST Google Drive. That log file is then shared only with your team email (the logs are not posted publicly).

Note: when running on the ES all logging from your Singularity container will be suppressed to prevent data exfiltration from the sequestered dataset.

The log will be named <team name>.sts.log.txt or <team name>.es.log.txt depending on which server the job ran on (Smoke Test Server = STS, or Evaluation Server = ES).

Additionally, the full confusion matrix resulting from sweeping the detection threshold over [0.0, 1.0] with a step of 0.01 will be uploaded and shared with your submitting email. This confusion matrix can be plotted as a ROC curve (FPR on the x axis, TPR on the y axis) and converted to the ROC AUC value using:

import sklearn.metrics
roc_auc = sklearn.metrics.auc(FPR, TPR)
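If you prefer to avoid the sklearn dependency, the TPR/FPR points and the trapezoidal AUC can also be computed directly from the per-threshold confusion counts. This sketch assumes each row holds (TP, FP, FN, TN) counts, which may differ from the actual column layout of the shared file:

```python
def roc_auc_from_confusion(rows):
    """Compute ROC AUC via the trapezoid rule from a list of
    (TP, FP, FN, TN) tuples, one per detection threshold.
    The column order is an assumption; adapt it to the actual file."""
    points = []
    for tp, fp, fn, tn in rows:
        tpr = tp / (tp + fn) if (tp + fn) else 0.0
        fpr = fp / (fp + tn) if (fp + tn) else 0.0
        points.append((fpr, tpr))
    points.sort()
    auc = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        auc += (x1 - x0) * (y0 + y1) / 2.0  # trapezoid rule
    return auc
```

A perfect detector sweeps through the point (FPR=0, TPR=1) and yields an AUC of 1.0; chance performance along the diagonal yields 0.5.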


The log and confusion matrix files will be overwritten by subsequent submission evaluation runs, so if you want a persistent copy, download and rename the files from Google Drive before your next submission.