Open In Colab

Reading and Writing Pipelines#

This guide demonstrates how to save and load data processing pipelines in AFL. You’ll learn how to:

  • Create and save pipelines to JSON format

  • Load existing pipelines from saved files

  • Use the Pipeline Builder to export pipeline configurations

  • Manage pipeline templates and reusable workflows

Pipelines in AFL can be saved as JSON files that contain all the operation configurations, parameters, and connections. This allows you to share workflows, version control your analysis procedures, and quickly reproduce results.

Google Colab Setup#

Only uncomment and run the next cell if you are running this notebook in Google Colab or if you don’t already have the AFL-agent package installed.

[ ]:
# !pip install git+https://github.com/usnistgov/AFL-agent.git

Writing A Pipeline#

To begin, let’s load the necessary libraries and define a short pipeline.

[1]:
from AFL.double_agent import *

with Pipeline('MyPipeline') as my_important_pipeline:

    SavgolFilter(
        input_variable='measurement',
        output_variable='derivative',
        dim='x',
        derivative=1,
    )

    Similarity(
        input_variable='derivative',
        output_variable='similarity',
        sample_dim='sample',
        params={'metric': 'laplacian', 'gamma': 1e-4},
    )

    SpectralClustering(
        input_variable='similarity',
        output_variable='labels',
        dim='sample',
        params={'n_phases': 2},
    )

my_important_pipeline.print()
PipelineOp                               input_variable ---> output_variable
----------                               -----------------------------------
0  ) <SavgolFilter>                      measurement ---> derivative
1  ) <SimilarityMetric>                  derivative ---> similarity
2  ) <SpectralClustering>                similarity ---> labels

Input Variables
---------------
0) measurement

Output Variables
----------------
0) labels

Now, we can write the pipeline to disk by calling the .write_json() method:

[2]:
my_important_pipeline.write_json('pipeline.json')
Pipeline successfully written to pipeline.json.

Excellent! Let’s take a look at the JSON file that was written. We can inspect it using the json module:

[4]:
import json

with open('pipeline.json','r') as f:
    display(json.load(f))
{'name': 'MyPipeline',
 'date': '03/04/25 19:59:19-576491',
 'ops': [{'class': 'AFL.double_agent.Preprocessor.SavgolFilter',
   'args': {'input_variable': 'measurement',
    'output_variable': 'derivative',
    'dim': 'x',
    'xlo': None,
    'xhi': None,
    'xlo_isel': None,
    'xhi_isel': None,
    'pedestal': None,
    'npts': 250,
    'derivative': 1,
    'window_length': 31,
    'polyorder': 2,
    'apply_log_scale': True,
    'name': 'SavgolFilter'}},
  {'class': 'AFL.double_agent.PairMetric.Similarity',
   'args': {'input_variable': 'derivative',
    'output_variable': 'similarity',
    'sample_dim': 'sample',
    'params': {'metric': 'laplacian', 'gamma': 0.0001},
    'constrain_same': [],
    'constrain_different': [],
    'name': 'SimilarityMetric'}},
  {'class': 'AFL.double_agent.Labeler.SpectralClustering',
   'args': {'input_variable': 'similarity',
    'output_variable': 'labels',
    'dim': 'sample',
    'params': {'n_phases': 2},
    'name': 'SpectralClustering',
    'use_silhouette': False}}]}

With this, we can see that all of the PipelineOps are stored under the ops key with the keyword arguments we specified above. Also included are the default values of any arguments we didn’t explicitly set.
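Because the file is plain JSON, a saved pipeline can also be tweaked programmatically before reloading it. The sketch below uses only the standard json module and assumes the structure shown above; the helper function `set_op_arg` and the inlined stand-in dictionary are illustrative, not part of the AFL API.

```python
import json

def set_op_arg(pipeline_dict, op_name, arg, value):
    """Update one argument of the named op in a pipeline dict (in place)."""
    for op in pipeline_dict['ops']:
        if op['args'].get('name') == op_name:
            op['args'][arg] = value
            return pipeline_dict
    raise KeyError(f"No op named {op_name!r} in pipeline {pipeline_dict['name']!r}")

# Minimal stand-in for the structure produced by write_json() above
cfg = {
    'name': 'MyPipeline',
    'ops': [
        {'class': 'AFL.double_agent.Labeler.SpectralClustering',
         'args': {'name': 'SpectralClustering',
                  'params': {'n_phases': 2}}},
    ],
}

# Bump the number of phases from 2 to 3, then serialize the variant
set_op_arg(cfg, 'SpectralClustering', 'params', {'n_phases': 3})
print(json.dumps(cfg, indent=1))
```

In practice you would `json.load` the file written above, edit it, and `json.dump` it to a new filename before reading it back in with Pipeline.read_json().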

Reading a Pipeline#

The final step is to load the pipeline back from disk:

[6]:
loaded_pipeline = Pipeline.read_json('pipeline.json')
loaded_pipeline.print()
PipelineOp                               input_variable ---> output_variable
----------                               -----------------------------------
0  ) <SavgolFilter>                      measurement ---> derivative
1  ) <SimilarityMetric>                  derivative ---> similarity
2  ) <SpectralClustering>                similarity ---> labels

Input Variables
---------------
0) measurement

Output Variables
----------------
0) labels

Success!
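When sharing pipeline files with collaborators, a quick structural check before calling Pipeline.read_json() can give clearer errors than a failed load. The sketch below uses only the standard library; the required keys are inferred from the file shown above and may not cover every AFL version, and `validate_pipeline_dict` is a hypothetical helper, not part of the AFL API.

```python
def validate_pipeline_dict(cfg):
    """Check that a dict has the pipeline-JSON shape shown above.
    Returns a list of problems; an empty list means the shape looks OK."""
    problems = []
    for key in ('name', 'ops'):
        if key not in cfg:
            problems.append(f"missing top-level key: {key!r}")
    for i, op in enumerate(cfg.get('ops', [])):
        if 'class' not in op:
            problems.append(f"op {i}: missing 'class'")
        if 'args' not in op:
            problems.append(f"op {i}: missing 'args'")
    return problems

good = {'name': 'MyPipeline',
        'ops': [{'class': 'AFL.double_agent.Preprocessor.SavgolFilter',
                 'args': {'input_variable': 'measurement'}}]}
bad = {'ops': [{'args': {}}]}

print(validate_pipeline_dict(good))  # []
print(validate_pipeline_dict(bad))
```

A check like this is most useful in automated workflows that load many pipeline files, where a malformed file should be reported rather than crashing the run.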

Conclusion#

In this guide, we’ve learned how to save and load AFL pipelines using JSON files. This capability is essential for:

  • Reproducibility: Save your analysis workflows to ensure consistent results

  • Sharing: Distribute pipelines to colleagues or collaborators

  • Version Control: Track changes to your analysis methods over time

  • Automation: Load pre-built pipelines in scripts or applications

The JSON format preserves all pipeline operations, their parameters, and the connections between them, making it a robust way to persist your data processing workflows. Combined with the Pipeline Builder GUI, you can create, save, and reuse complex analysis pipelines with ease.