Declarative Experiment Description#

This document describes the data structure used to represent an experiment.


The goal of this declarative description is to describe a task graph in a directed acyclic graph (DAG) topology. The nodes of this graph correspond to simple Python functions, where the outputs of some functions may be fed into others as inputs. The operation of the overall graph may be parameterized via a set of name-value pairs. This allows the invoker some control over what the graph of functions does.

Structure and Format#

The structure described below is something which can be easily represented in a file format like JSON or YAML, but one need not use either of those formats, or any textual format at all. The only requirement is that the structure be obtained somehow. It consists of a nesting of basic types, including lists, dicts, ints, etc, in the structure prescribed below. Nevertheless, to describe the structure textually in this document, a familiar format is chosen: YAML. Implementations may make different choices.

Notes on YAML#

A full description of YAML is out of scope for this document, but to help readers less familiar with the syntax, below are some brief notes about how it works.

YAML can represent a data structure similar to JSON. Whereas JSON’s container types include “array” and “object”, in YAML these are called “sequence” and “mapping”. YAML allows flexibility in how these types are serialized (represented textually); these choices are called styles. There are two broad YAML styles, called block style and flow style. The YAML snippets given in this document will use either style, depending on what results in the clearest presentation of the content.

Example YAML mapping, block style:

key1: value1
key2: value2

Example YAML mapping, flow style:

{key1: value1, key2: value2}

Example YAML sequence, block style:

- value1
- value2

Example YAML sequence, flow style:

[value1, value2]

Structure Description#

The top level of the data structure is a mapping with a few prescribed keys, which provide the basic ingredients for the experiment: types, parameters, tasks, and a graph which links task invocations together:

    "<type definitions here>"

    "<parameters here>"

    "<tasks here>"

    "<graph here>"

The rest of the structural description describes what goes in each of those four places.


This section is used to define a set of types. Types describe the inputs and outputs of task plugins, and global parameters. They allow an additional kind of validation of the experiment: that the inputs passed to task plugins are compatible with their parameter types.

The top-level structure of this section is a mapping from type name to type definition:

    type_name_a: type_definition_a
    type_name_b: type_definition_b

There are three kinds of types: simple, structured, and union. These are discussed in the following subsections.

There are a handful of builtin types as well, whose names are reserved: authors must not try to redefine them. These include string, integer, number, boolean, null, and any.

Simple Types#

A simple type is just a name. The name can mean anything you want. Simple types are suitable when you want to express an opaque type, i.e. one where the inner structure is unimportant with regards to type checking. The actual type used by the task plugin may be complex or simple, but the type system will not know anything about it. Simple types are the only types to support single inheritance. A simple type may be given a null definition, or a mapping with an is_a key which maps to the name of its super-type:

        is_a: my_simple_type

Structured Types#

Structured types support the definition of a few kinds of internal structure. This type is necessary when more complex values with internal structure are given, within task plugin invocations and as global parameter values. The type system needs to be able to evaluate the structure of these values for compatibility with task plugin requirements. So it is necessary to be able to associate with a type, a description of its proper structure. The three supported structures are list, tuple, and mapping.

List Structured Types#

A list is conceptually a sequence of values of homogenous type and arbitrary length. To define a list structure, one need only give an element type:

        list: my_elt_type
Tuple Structured Types#

A tuple is conceptually a sequence of values of heterogenous type and fixed length, which may be zero. To define a tuple structure, one needs to list all the element types:

        tuple: [my_elt_type1, my_elt_type2, my_elt_type3]
Mapping Structured Types#

A mapping conceptually associates keys with values. Mapping types come in a couple variants: enumerated and key/value type.

Enumerated Mapping Types#

In an enumerated mapping type, a fixed set of property names and types is given, which may be empty. All values of this type are mappings with exactly that set of properties, i.e. all of the listed properties are required. This implies that keys must be strings, and value types may be heterogenous:

            prop_name1: prop_type1
            prop_name2: prop_type2
Key/Value-Type Mapping Types#

In a key/value type mapping, a key type and value type are given. Values of this type must have keys and values of the given types, but the keys and values themselves are unrestricted. This type of mapping is appropriate when requirements are more open:

        mapping: [string, my_value_type]

There is a special requirement here that the key type must be either the string or integer builtin type.

Union Types#

A union type is a merger of other types. To define a union type, one simply lists the union member types:

        union: [my_type1, my_type2]

Named and Anonymous Types#

A name is optional, for most types. If a type does not have a name, we call it anonymous. Anonymous types arise in two contexts: inline types and type inference. These are the subjects of following sections, but there are some principles and consequences to bring up here.

  • A name is a unique identifier for a type, within the scope of an experiment description. A particular type name always refers to a unique type. If two types have different names, they are different types.

  • A simple type is only a name, therefore it cannot be anonymous.

  • The only way to give a type a name is to map its name to its definition, directly under the top-level types key. Therefore, a simple type definition cannot occur in any other place.

Inline Type Definitions#

The various examples of type definitions given in previous sections were kept simple, in that the elements of those types (where applicable) were references to named types whose definitions were present at the top level. This is the only way to use a simple type, but it is not the only way to use the other kinds of types. Definitions of non-simple types may be nested inside other non-simple type definitions. This enables more convenient definition of complex types. An inline type which is nested within a complex type definition has no means to assign the type a name; it is therefore always an anonymous type.

The following example defines a list of lists:

            list: elt_type

This is a nested mapping, string -> string -> list:

            - string
            - mapping:
                - string
                - list: elt_type


The value of the top-level parameters key must be a mapping. The keys of the mapping are the global parameter names. They may map to any value, where mapping values are treated specially. If not a mapping, the value is taken as the default value for the parameter. If a mapping, it may have keys “type” and/or “default”. The mapping form allows explicitly assigning a type to the parameter. If a type is not explicitly given, it will be inferred from the default. For example:

    global_b: "foo"
        type: integer
        default: 5

Here, global_a has a null default value and null inferred type, global_b defaults to “foo” and is of string type, and global_c defaults to 5 and is explicitly given the integer type.

Additional rules include:

  • If both a default and type are given, they must be compatible.

  • Each global parameter must have a type, therefore it must have either a default value (from which a type can be inferred), or an explicitly given type.


This section describes task plugins. It gives them and their inputs and outputs short names so they may be referred to, and to make subsequent usage easier to write and understand. Inputs and outputs are also assigned types, to enable type-based validation of their invocations.

The value of the top-level tasks key is a mapping whose keys are the short names of the task plugins, and whose values describe the task plugin. Each plugin description supports up to three keys: plugin, inputs, and outputs. The latter two keys are optional.

For example:

        plugin: org.example.plugin1
        inputs: "<input_definitions>"
        outputs: "<output_definitions>"
        plugin: org.example.plugin2
        inputs: "<input_definitions>"
        outputs: "<output_definitions>"

Task Plugin#

The plugin key is required, and describes a Python module plus a function name, separated by dots. This is mostly the same as what you would see in a Python import statement, but with the addition of the function name.


Our implementation will accept plugin name with two components minimum. Giving fewer than two components will produce an error.

Task Inputs#

A task plugin may or may not require any input. If it requires input, that input must be defined under the inputs key. Every input must be given a name and a type. Inputs are defined in a list, which also gives them an implicit ordering. The ordering is important for positional invocations.

Each task plugin input may be defined in one of two ways: either a length 1 mapping which maps input name to type, or a longer form which allows giving other information along with the name and type.

Short form input example:

        plugin: org.example.plugin
            - input_name: type_name

Long form input example:

        plugin: org.example.plugin
            - name: input_name
              type: type_name
              required: false

The long form is distinguished by the presence of the name key. That means if a task plugin input is named “name”, the long form must be used. The above example also shows the required key. Usage of this key is optional and defaults to true, i.e. all defined task plugin inputs are required by default. Long form must be used in order to define an input as optional.

Task Outputs#

A task plugin may or may not produce any output. If it does, and its output(s) will be needed to feed other task(s), then the output(s) must be defined. They require names so they may be referred to, and must also be assigned types so that their usage can be validated against other types. The value of the outputs property may be either a length 1 mapping or list of such. Each mapping maps an output name to a type.

If a list of mappings is given, then the plugin’s output must be iterable (e.g. like a list), and values from the iterable will be extracted and stored under the given names. If a list is given for outputs, but the plugin’s output is not iterable, an error will occur.

If the lengths of the output names and plugin return value are not equal, then the number of values which may be subsequently referred to is the shorter of the two lengths. If the number of defined outputs is less than the length of the output iterable, this allows authors to extract only the first N outputs from the iterable; they need not define all outputs. On the other hand, if there are more output definitions than outputs, then some output names will not be defined (because there are no values to assign to them), and subsequent attempts to refer to them will produce an error.

For example:

        plugin: org.example.plugin
            result: number


The graph section is where you describe invocations of the aforementioned task plugins, and connect the outputs of some to the inputs of others, creating the graph structure.

Graphs are composed of steps, and the value of the graph property is a mapping from a step name to a description of the step. Each step invokes a task plugin, so the step description describes which plugin to invoke and how to invoke it:

        "step 1 description"
        "step 2 description"


Each step invokes a task plugin, which is a Python function, and the way you invoke Python functions depends on how they are written. For example, they may have positional-only or keyword-only arguments, or a combination of the two. A step description supports all of these styles.


It is recommended that of the invocation styles described below, the keyword style be used since the meaning of the argument values is clearer due to the naming of each argument. It is also structurally simpler than the mixed style.

Positional Style Invocation#

A simple way to describe a positional style task plugin invocation is to map the plugin short name to a list of positional arguments:

        plugin1: [arg1, arg2]

The above will lookup “plugin1” in the tasks section to find the plugin, and invoke it with positional parameters “arg1” and “arg2”. In order to enable simple structures, one is permitted to use a simple non-map, non-list value as well, and this will be interpreted the same as a length-one list.

Keyword Style Invocation#

Similarly to positional style, the keyword arg style maps a plugin short name to a mapping of keyword arg names to values:

            keyword1: arg1
            keyword2: arg2

The above will lookup “plugin1” in the tasks section to find the plugin, and invoke it with keyword arguments keyword1=arg1 and keyword2=arg2.

Mixed Style Invocation#

There is a longer form invocation description which supports both styles at the same time, for a mixed style invocation. It uses a mapping with prescribed keys task, args, kwargs:

        task: plugin1
        args: [posarg1, posarg2]
            keyword1: arg1
            keyword2: arg2

This is differentiated from the keyword arg style by the presence of the task key. This will invoke plugin1 with positional args posarg1 and posarg2, and keyword args keyword1=arg1 and keyword2=arg2.

Describing Argument Values#

In the invocation examples above, argument values were given as simple strings to keep the presented data structures clear and free of clutter. However, one can use more than simple values as task plugin argument values. It is possible to use any kind of nested structure you like. For example, a positional argument could itself be a mapping, or the value of a keyword argument could be a list:

            - prop1: value1
              prop2: value2

The above example maps the plugin short name to a list, therefore this is a positional invocation. The list has one value in it, which is itself a mapping: {prop1: value1, prop2: value2}. The task plugin will be invoked positionally, where its one positional argument will be a Python dict.


An important aspect of describing an argument value is being able to refer to another part of the experiment description. This is how we make use of global parameters, and refer to the outputs of other steps.

A reference is a string value which begins with a dollar sign. To refer to a global parameter, follow the dollar sign with the parameter name. To refer to the output of another step, follow the dollar sign with the step name, and if necessary, the output name, separated from the step name by a dot. For example:

    global: string

            - in: string
            value: number
            - in: number
            - name: string
            - age: integer

        plugin1: [$global]
        plugin2: $step1.value

The above example demonstrates a reference to a parameter ($global), and to an output of a step ($step1.value). The value of global will be passed in as a positional arg in step1, and the value output of step1 will be passed in as a positional arg in step2. Note that in step2, we take advantage of the ability to treat a simple value the same as a length-one list.

If a step produces only one output, the output name may be omitted. For example, because step1 only produces one output, that output could have been referred to more simply as $step1.

If one wanted to use a string which starts with a dollar sign as an argument value, the dollar sign may be escaped by doubling it, e.g. $$foo. This need only be done when the dollar sign is the first character. If it is in the middle of the value, e.g. foo$bar, the dollar sign should not be escaped.

References may occur anywhere in the description of an argument value. The whole structure is searched, and all references will be replaced with the appropriate values before the task plugin is invoked.


The references described above can imply dependencies among the steps. If step B requires as input the output of step A, then A must run first so that there is an output to pass to B. The task graph runner can work out for itself a run order for the steps on that basis.

But sometimes there are order requirements which exist for other reasons. A task graph runner will not be able to infer these requirements by itself, so a facility exists for describing them explicitly. To express explicit dependencies in a graph, add a dependencies key to a step description, which maps to a list of step names. For example:

        plugin1: 1
        plugin2: foo
        dependencies: [step1]

This will force step1 to run before step2.

Type Validation#

This section describes how types are used. The larger goal of validation is to ensure that the experiment definition is sensible and correct. The goal of type validation is to ensure that task plugins are always passed types of values they know how to handle, i.e. is sensible with respect to data types. This is not sufficient for correctness of course, but it helps.

All type validation is static, i.e. it is not based on any runtime information about the task plugins. Task plugin output values are not known. It uses only the information given in the various definitions (tasks, types, etc).

Type validation operates on types. Where values are given, types must therefore be inferred from them. The resulting types are then subject to a compatibility check to ensure each task plugin invocation makes sense. Type inference and compatibility checking are the basic foundation of type validation. These two operations are described in the following subsections.

Type Inference#

Input values to a task plugin can come from a few different places: outputs of other task plugins, global parameter values, and literal values given directly in a task plugin invocation. In all cases, the input values must be assigned types, which are then compared to task plugin input requirements for compatibility. Types are statically obtained for invocation inputs as follows:

  • Global parameter: explicitly declared type, if present. If not present, a default value must be given, and a type is inferred from that.

  • Task plugin output: the declared type of the output.

  • Literal value: inferred from the value

Because task invocations support compositions of values and references, type information may be obtained from multiple sources and combined into a single argument type.

When types are given explicitly in task and global parameter definitions, a type name is used which must refer to a type defined in the types section of the experiment description. That is a simple case. Inferring a type from a literal value is more complex, and is described in the next section.

Type Inference from Literals#

Type inference from literals is based on Python types. When experiment description content comes from a file (e.g. YAML or JSON), there is a second translation that can happen from the file format to Python. In that case, it is a two-step process: file -> Python value -> Type. The first step is beyond the scope of this document, but we will include YAML-specific caveats. We will focus only on the second step.

Before we can perform type inference, we need a fixed stable of simple types which we will use as the targets of our inferences. These are built into the system and can’t be overridden.

The following shows the simple types which are pre-defined in Dioptra, and their corresponding Python types:

Type Name

Python Type







integers, subtype of number



floating point numbers



true or false


None (type)

the null value



any value

These types may be referred to using the above names (from column 1) without defining the types first. The any type is a special type which is used as the inference result if nothing more specific could be inferred.


In YAML, null is interpreted as the null value. Therefore, it does not name the null type! To refer to the null type, use double quotes so that the YAML parser knows a string value is intended: "null".

In addition, types can be inferred from common Python data structures which are comprised of values of the above types. This results in anonymous structured types as follows:


Python Type


anonymous mapping


A mapping

anonymous tuple


An iterable of values

A tuple is inferred for iterables which are not one of the other listed iterable types (e.g. strings and dicts). This enables a sensible inference result from Python lists and tuples. List structured types are never inferred, but tuples can be compatible with lists, so it is not a problem in practice.

The element types of the inferred type are inferred from the element types of the mapping/iterable value. Type inference will recurse throughout a complex value, so it is possible to infer a complex type. Mapping inference is more complex and is described separately in the next section.

Inferring Mapping Types#

In Python, mapping keys can be heterogenous and of any hashable type, and anything can be a value. Type inference tries to infer the most specific mapping type it can, as follows:

  1. A common base type is inferred from all key types. If this type is string, an enumerated mapping type is inferred. The value type corresponding to each key type is inferred from the mapped value.

  2. If the common base type of the keys is integer, a key/value type mapping is inferred where the key type is integer. The value type used is the union of all value types inferred from the mapping, or if all inferred value types were the same, then that type.

  3. If the common base key type is neither string nor integer, any is inferred.

  4. If the mapping is empty, an empty enumerated mapping type is inferred.

Type Compatibility#

Type compatibility refers to whether a task plugin invocation argument of one type may be legally passed as the value of a task parameter of a possibly different type. If A “is compatible with” B, then values of type A may be passed to task parameters of type B. This is not a symmetric concept: if A is compatible with B, that does not mean that B is compatible with A.

There is a fixed set of compatibility rules, which hopefully agrees with intuition. To begin with:

  • All types are compatible with any

  • any is incompatible with all structured types, and all simple types except any

  • All types are compatible with themselves

  • A simple type is compatible with another simple type iff the first is the same as or a subtype of the second

The following subsections describe more complex compatibility criteria related to structured and union types.

Union Compatibility#

Compatibility rules related to union types are as follows:

  • A union type is compatible with another type iff all member types of the union are compatible with the latter type

  • A non-union type is compatible with a union type iff the non-union type is compatible with at least one member of the union

  • The empty union is compatible with all types

  • Only the empty union is compatible with the empty union

Structured Type Compatibility#

Compatibility rules related to structured types are as follows:

  • Named structured types with different names are incompatible. This is because the types could mean wildly different things, regardless of whether they are structurally compatible. A structural inspection is only necessary if at least one of the types is anonymous. (If the types are non-anonymous with equal names, the types are the same, so trivially compatible.)

  • Lists are incompatible with tuples and mappings

  • Tuples are incompatible with mappings

  • Mappings are incompatible with tuples and lists

  • Key/value type mappings are incompatible with enumerated mappings

  • A list is compatible with another list iff the element type of the first list is compatible with the element type of the second

  • A tuple type is compatible with another tuple type iff the number of element types of each tuple is equal, and corresponding element types are compatible

  • A tuple type is compatible with a list type iff all tuple element types are compatible with the list element type

  • An enumerated mapping type is compatible with another enumerated mapping type iff they have exactly the same property names, and corresponding property types are compatible

  • A key/value type mapping is compatible with another key/value type mapping iff key types are compatible and value types are compatible

  • An enumerated mapping type is compatible with a key/value type mapping iff the key type of the latter mapping is string, and all value types of the former mapping are compatible with the value type of the latter mapping

  • The empty enumerated mapping type is compatible with all string-keyed key/value type mappings.

Type Variance#

“Type variance” refers to whether and how types are allowed to differ in various contexts (typically related to programming languages). An experiment is analogous to a program in that functions are called, which are passed values as arguments, and return values. The relevant context for these experiments is the task plugin invocations, and how task plugin invocation argument types are allowed to differ from the parameter types.

An important question which comes up is usually expressed in terms of container types: if type B is a subtype of A, should we allow a value of type list<B> (i.e. “list of B”) to be passed to a parameter of type list<A>? This is not always allowed in programming languages due to type safety: the callee, using the list according to the list<A> type, could put instances of A or instances of other subtypes of A (which are not B) into the list. If the caller expects their list to still be a list<B> after the call returns, there could be trouble down the road.

In other words, there is the potential for the callee to side-effect its input in ways which are unsafe. If a task plugin changes an input and that input was to be passed to multiple other task plugins, the other task plugins which are subsequently passed that input will see the change. One might in fact expect that any change to an input would be unexpected to an experiment designer using your plugin, who assumes that a plugin output would be passed undisturbed to all downstream plugins.

We have adopted the assumption that task plugins will never side-effect their inputs. Therefore the covariant compatibility decision is safe under that assumption. It is not possible to enforce immutability of task inputs, since task plugins are ordinary Python functions using ordinary Python types. But we encourage task plugin authors to treat their inputs immutably. Instead of changing an input, make a changed copy of the input and return that from the plugin.