Artifacts#
Summary: What is an Artifact?#
An artifact is a stored output of a job that is managed by Dioptra. A job can produce multiple artifacts, and these artifacts can be used as inputs to other jobs. Artifact usage is specified in the entrypoint associated with the job, but the artifact itself is provided to the job at runtime.
Artifacts are used in entrypoints through Artifact Parameters. When an artifact is designated as
an input parameter, it can be referenced in the task graph exactly like any regular
entrypoint parameter or task output. The artifact is loaded into memory at the start of
job execution and is then available to any task that references it.
To use an artifact as an entrypoint input parameter, the output type from the deserialize
method of the Plugin Artifact Task associated with the artifact must match
the type for the artifact input parameter. An artifact cannot be selected for a
job if these types don’t align.
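The type-matching rule above can be illustrated with a small sketch. It uses Python return annotations as a stand-in for Dioptra's registered types, and the names `TextArtifactTask`, `deserialize_output_type`, and `artifact_is_selectable` are hypothetical, so this shows only the idea of the check, not how Dioptra performs it.

```python
from typing import Any, get_type_hints


class TextArtifactTask:
    """Hypothetical artifact task whose deserialize method returns a str."""

    @staticmethod
    def deserialize(working_dir: str, path: str) -> str:
        # Body omitted: only the declared output type matters for this sketch.
        raise NotImplementedError


def deserialize_output_type(task_cls: type) -> Any:
    """Return the declared output type of the task's deserialize method."""
    return get_type_hints(task_cls.deserialize).get("return")


def artifact_is_selectable(task_cls: type, parameter_type: Any) -> bool:
    """An artifact is selectable only if the two types are the same."""
    return deserialize_output_type(task_cls) == parameter_type


matches = artifact_is_selectable(TextArtifactTask, str)
mismatch = artifact_is_selectable(TextArtifactTask, bytes)
```

Here `matches` is true because the artifact input parameter expects a `str`, while `mismatch` is false, which corresponds to an artifact that could not be selected for the job.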
Plugin Artifact Tasks#
Plugin artifact tasks are a type of plugin task that defines how a given artifact type is serialized and deserialized. When an output of a function task is designated to be saved as an artifact, the output is passed to the serialize method of the chosen plugin artifact task. Conversely, when an artifact is loaded, the deserialize method of the plugin artifact task loads the artifact as an object in memory.
After an artifact is serialized, Dioptra stores the resulting file in the backend data storage (S3 by default). The URL of the artifact is stored in the database and is retrieved by ID when the artifact is used or downloaded from the GUI.
An artifact task consists of a class which implements the ArtifactTaskInterface interface.
This interface supports three methods:
serialize - creates an artifact with a given name in the specified directory from the provided source input
deserialize - reads the contents of an artifact at a given path relative to the specified directory
validation - validates any keyword arguments passed into serialize()
The serialize function supports passing additional keyword arguments. This can be useful
for selecting file types or configuring other settings when saving an artifact.
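As a minimal sketch of the three methods described above, the following class saves a dictionary as JSON. The `ArtifactTaskInterface` defined here is a local stand-in so the example runs on its own; the method signatures, the `JsonDictArtifactTask` class, and the `indent` keyword argument are assumptions for illustration and may differ from Dioptra's actual interface.

```python
import json
import tempfile
from abc import ABC, abstractmethod
from pathlib import Path
from typing import Any


class ArtifactTaskInterface(ABC):
    """Local stand-in for the real interface; signatures are assumed."""

    @abstractmethod
    def serialize(self, working_dir: Path, name: str, contents: Any, **kwargs: Any) -> Path: ...

    @abstractmethod
    def deserialize(self, working_dir: Path, path: str, **kwargs: Any) -> Any: ...

    @abstractmethod
    def validation(self, **kwargs: Any) -> None: ...


class JsonDictArtifactTask(ArtifactTaskInterface):
    """Hypothetical artifact task that stores a dict as a JSON file."""

    def validation(self, **kwargs: Any) -> None:
        # Reject keyword arguments that serialize() does not understand.
        unknown = set(kwargs) - {"indent"}
        if unknown:
            raise ValueError(f"unsupported keyword arguments: {sorted(unknown)}")
        indent = kwargs.get("indent")
        if indent is not None and (not isinstance(indent, int) or indent < 0):
            raise ValueError("indent must be a non-negative integer")

    def serialize(self, working_dir: Path, name: str, contents: Any, **kwargs: Any) -> Path:
        # Write the source input to <working_dir>/<name>.json.
        self.validation(**kwargs)
        path = working_dir / f"{name}.json"
        path.write_text(json.dumps(contents, indent=kwargs.get("indent")))
        return path

    def deserialize(self, working_dir: Path, path: str, **kwargs: Any) -> Any:
        # Read the artifact at a path relative to the working directory.
        return json.loads((working_dir / path).read_text())


with tempfile.TemporaryDirectory() as tmp:
    task = JsonDictArtifactTask()
    saved = task.serialize(Path(tmp), "metrics", {"accuracy": 0.93}, indent=2)
    restored = task.deserialize(Path(tmp), saved.name)
```

Passing `indent=2` shows how an extra keyword argument can configure the saved file, and an unrecognized keyword argument is rejected by `validation` before anything is written.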
Entrypoints can designate the output of a function task as an artifact by referencing an artifact task for serialization. When used as a parameter to another entrypoint, the deserialization function of that artifact task will be used.
Note that when artifacts are created, they are associated with a snapshot of the artifact task that they were created with. Since the artifact task contains both serialization and deserialization, the same snapshot is used for deserialization.
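The snapshot behavior can be pictured with a toy model. Everything in this sketch (`ArtifactTaskSnapshot`, `ArtifactRecord`, the in-memory `snapshots` table, and both helper functions) is hypothetical and is not Dioptra's data model; it only illustrates why an artifact keeps loading correctly after its artifact task has been edited.

```python
import json
from dataclasses import dataclass
from typing import Any, Callable, Dict


@dataclass(frozen=True)
class ArtifactTaskSnapshot:
    """One immutable version of an artifact task (hypothetical model)."""
    snapshot_id: int
    serialize: Callable[[Any], str]
    deserialize: Callable[[str], Any]


@dataclass(frozen=True)
class ArtifactRecord:
    """A stored artifact plus the snapshot that created it."""
    artifact_id: int
    payload: str
    task_snapshot_id: int


# Snapshot 1 writes comma-separated integers; snapshot 2 is a later edit
# of the same task that switched the format to JSON.
snapshots: Dict[int, ArtifactTaskSnapshot] = {
    1: ArtifactTaskSnapshot(
        1,
        serialize=lambda values: ",".join(str(v) for v in values),
        deserialize=lambda text: [int(x) for x in text.split(",")],
    ),
    2: ArtifactTaskSnapshot(2, serialize=json.dumps, deserialize=json.loads),
}


def create_artifact(artifact_id: int, value: Any, snapshot_id: int) -> ArtifactRecord:
    """Serialize with a given snapshot and remember which one was used."""
    snapshot = snapshots[snapshot_id]
    return ArtifactRecord(artifact_id, snapshot.serialize(value), snapshot_id)


def load_artifact(record: ArtifactRecord) -> Any:
    """Deserialize with the snapshot recorded at creation time."""
    return snapshots[record.task_snapshot_id].deserialize(record.payload)


old = create_artifact(10, [1, 2, 3], snapshot_id=1)
loaded = load_artifact(old)  # uses snapshot 1 even though snapshot 2 exists
```

Because the record carries the snapshot it was created with, the comma-separated payload is never handed to the newer JSON deserializer, which would not be able to parse it.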
Artifact tasks are registered similarly to function tasks. See Plugins for more details.
See Also#
Entrypoints Explanation - Explanation of Entrypoints, including the Artifact Task Graph
Entrypoints Reference - Complete YAML syntax guide for entrypoint files and task graphs
Artifacts Reference - Complete reference for Artifacts
Artifact Output Graph Reference - Reference for artifact output graphs
Plugins Explanation - Explanation of Plugins, Function Tasks and Artifact Tasks