6. Settings File

MOSAIC stores its settings in the JSON format. When using the graphical interface, a settings file is generated automatically upon starting an analysis, or by clicking Save Settings in the File menu (see MOSAIC GUI).

6.1. Settings Layout

JSON is a human-readable file format. A MOSAIC settings file is a single JSON object divided into sections, where each section consists of a section name and a list of string key-value pairs.

{
        "<section name>" : {
                "key1" : "value1",
                "key2" : "value2",
                ...
        }

}
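
The layout above can be read in Python with the standard json module. The snippet below is a minimal sketch: the section name "sectionA" and its keys are placeholders mirroring the layout, not real MOSAIC settings.

```python
import json

# Placeholder settings text mirroring the layout above (not real MOSAIC keys).
settings_text = """
{
    "sectionA": {
        "key1": "value1",
        "key2": "value2"
    }
}
"""

settings = json.loads(settings_text)

# Each top-level key is a section name; each section is a dict of key-value pairs.
for section_name, section in settings.items():
    for key, value in section.items():
        print(f"{section_name}.{key} = {value}")
```

Because values are stored as strings, any numeric conversion is left to the code that consumes the section.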

MOSAIC settings define a new section for each class, with key-value pairs corresponding to class attributes that are set upon initialization. This is illustrated below for the adept2State class. The adept2State section in the settings file holds parameters corresponding to the adept2State class. Note that the section name in the settings file is identical to the corresponding class name. Three parameters that control the behavior of the class are then defined within the section.

{
        "adept2State" : {
                "FitTol"                        : "1.e-7",
                "FitIters"                      : "50000",
                "BlockRejectRatio"              : "0.9"
        }
}

Finally, adept2State is initialized by defining class attributes corresponding to the key-value pairs in the settings file.

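# Read each setting from the settings dictionary, falling back to a default
# value when the key is absent, and convert it to the expected type.
try:
        self.FitTol=float(self.settingsDict.pop("FitTol", 1.e-7))
        self.FitIters=int(self.settingsDict.pop("FitIters", 5000))
        self.BlockRejectRatio=float(self.settingsDict.pop("BlockRejectRatio", 0.8))
except ValueError as err:
        raise commonExceptions.SettingsTypeError( err )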
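
The pop-with-default pattern above can be tried outside of MOSAIC with the following self-contained sketch. The function parse_adept2state and the local SettingsTypeError class are hypothetical stand-ins introduced for illustration; only the keys and default values are taken from the snippet above.

```python
class SettingsTypeError(Exception):
    """Hypothetical stand-in for commonExceptions.SettingsTypeError."""


def parse_adept2state(settings_dict):
    """Convert string settings to typed values, falling back to defaults."""
    remaining = dict(settings_dict)  # copy so the caller's dict is untouched
    try:
        parsed = {
            "FitTol": float(remaining.pop("FitTol", 1.e-7)),
            "FitIters": int(remaining.pop("FitIters", 5000)),
            "BlockRejectRatio": float(remaining.pop("BlockRejectRatio", 0.8)),
        }
    except ValueError as err:
        # A value that cannot be converted is reported as a settings error.
        raise SettingsTypeError(err) from err
    return parsed, remaining


# BlockRejectRatio is missing here, so its default is used.
parsed, leftover = parse_adept2state({"FitTol": "1.e-7", "FitIters": "50000"})
```

Keys that are popped are removed from the working copy, so anything left over afterwards is a setting the class did not recognize.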

6.2. Trajectory Settings

6.2.1. Common Settings (metaTrajIO)

class mosaic.trajio.metaTrajIO.metaTrajIO(**kwargs)[source]

Warning

This metaclass must be sub-classed. All abstract methods within this metaclass must be implemented.

Initialize a TrajIO object. The object can load all the data in a directory, N files from a directory or from an explicit list of filenames. In addition to the arguments defined below, implementations of this meta class may require the definition of additional arguments. See the documentation of those classes for what those may be. For example, the qdfTrajIO implementation of metaTrajIO also requires the feedback resistance (Rfb) and feedback capacitance (Cfb) to be passed at initialization.

Parameters:
  • dirname : all files from a directory (‘<full path to data directory>’)

  • nfiles : if requesting N files (in addition to dirname) from a specified directory

  • fnames : explicit list of filenames ([file1, file2,…]). This argument cannot be used in conjunction with dirname/nfiles. The filter argument is ignored when used in combination with fnames.

  • filter : ‘<wildcard filter>’ (optional, filter is ‘*’ if not specified)

  • start : Data start point in seconds.

  • end : Data end point in seconds.

  • datafilter : Handle to the algorithm to use to filter the data. If no algorithm is specified, datafilter is None and no filtering is performed.

  • dcOffset : Subtract a DC offset from the ionic current data.

  • filtersettings: Dict containing low pass filter settings (optional: if not provided filter settings will be loaded from the settings file. If no settings are found, datafilter will be turned off.)

Properties:
  • FsHz : sampling frequency in Hz. If the data was decimated, this property will hold the sampling frequency after decimation.

  • LastFileProcessed : return the data file that was last processed.

  • ElapsedTimeSeconds : return the analysis time in sec.

Errors:
  • IncompatibleArgumentsError : when conflicting arguments are used.

  • EmptyDataPipeError : when out of data.

  • FileNotFoundError : when data files do not exist in the specified path.

  • InsufficientArgumentsError : when required arguments are missing.
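
The rules for combining dirname, nfiles, fnames and filter can be sketched as below. This is an illustrative helper and not MOSAIC's implementation: resolve_data_files and the two locally defined exception classes are hypothetical, and sorting the matched files is an assumption.

```python
import glob
import os


class IncompatibleArgumentsError(Exception):
    """Hypothetical stand-in for the MOSAIC exception of the same name."""


class InsufficientArgumentsError(Exception):
    """Hypothetical stand-in for the MOSAIC exception of the same name."""


def resolve_data_files(dirname=None, nfiles=None, fnames=None, filter="*"):
    """Return the list of data files selected by the arguments.

    The parameter name `filter` shadows the Python builtin on purpose,
    to match the argument name used in the documentation.
    """
    if fnames is not None:
        if dirname is not None or nfiles is not None:
            raise IncompatibleArgumentsError(
                "fnames cannot be combined with dirname/nfiles")
        return list(fnames)  # filter is ignored with an explicit list
    if dirname is None:
        raise InsufficientArgumentsError("either dirname or fnames is required")
    files = sorted(glob.glob(os.path.join(dirname, filter)))
    if not files:
        raise FileNotFoundError("no files match %r in %r" % (filter, dirname))
    # Without nfiles, all matching files are used.
    return files if nfiles is None else files[:nfiles]


selected = resolve_data_files(fnames=["file1.abf", "file2.abf"])
```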

6.2.2. QDF Files (qdfTrajIO)

class mosaic.trajio.qdfTrajIO.qdfTrajIO(**kwargs)[source]

Use the readqdf module from EBS to read individual QDF files.

In addition to metaTrajIO args, check if the feedback resistance (Rfb) and feedback capacitance (Cfb) are defined to convert qdf binary data into pA.

A typical settings section to read QDF files is shown below. Note that the values for Rfb and Cfb are specific to the amplifier used.

"qdfTrajIO": {
        "Rfb"                           : 9.1e+12,
        "Cfb"                           : 1.07e-12,
        "dcOffset"                      : 0.0,
        "filter"                        : "*.qdf",
        "start"                         : 0.0
}

Parameters:
In addition to metaTrajIO.__init__ args,
  • Rfb : feedback resistance of amplifier

  • Cfb : feedback capacitance of amplifier

  • format : ‘V’ for voltage or ‘pA’ for current. Default is ‘V’

Returns:

None

Errors:
  • InsufficientArgumentsError : if the mandatory arguments Rfb and Cfb are not set.

6.2.3. ABF Files (abfTrajIO)

class mosaic.trajio.abfTrajIO.abfTrajIO(**kwargs)[source]

Read ABF1 and ABF2 file formats. Currently, only gap-free mode and single channel recordings are supported.

A typical settings section to read ABF files is shown below.

"abfTrajIO" : {
        "filter"                        : "*.abf",
        "start"                         : 0.0,
        "dcOffset"                      : 0.0,
        "sweepNumber"                   : 0,
        "channel"                       : 0
}

Parameters:
In addition to metaTrajIO args,

None

6.2.4. Binary Files (binTrajIO)

class mosaic.trajio.binTrajIO.binTrajIO(**kwargs)[source]

Read a file that contains interleaved binary data, ordered by column. Only a single column that holds ionic current data is read. The current in pA is returned after scaling by the amplifier scale factor (AmplifierScale) and removing any offsets (AmplifierOffset) if provided.

Usage and Assumptions:

Binary data is interleaved by column. For three columns (a, b, and c) and N rows, binary data is assumed to be of the form:

[ a_1, b_1, c_1, a_2, b_2, c_2, … … …, a_N, b_N, c_N ]

The column layout is specified with the ColumnTypes parameter, which accepts a list of tuples. For the example above, if column a is the ionic current in a 64-bit floating point format, column b is the ionic current representation in 16-bit integer format and column c is an index in 16-bit integer format, the ColumnTypes parameter is a list with three tuples, one for each column, as shown below:

[('curr_pA', 'float64'), ('AD_V', 'int16'), ('index', 'int16')]

The first element of each tuple is an arbitrary text label and the second element is a valid Numpy type.

Finally, the IonicCurrentColumn parameter holds the name (text label defined above) of the column that holds the ionic current time-series. Note that if an integer column is selected, the AmplifierScale and AmplifierOffset parameters can be used to convert the voltage from the A/D to a current.

Assuming that we use a floating point representation of the ionic current, and a sampling rate of 50 kHz, a settings section that will read the binary file format defined above is:

"binTrajIO": {
        "AmplifierScale" : "1",
        "AmplifierOffset" : "0",
        "SamplingFrequency" : "50000",
        "ColumnTypes" : "[('curr_pA', 'float64'), ('AD_V', 'int16'), ('index', 'int16')]",
        "IonicCurrentColumn" : "curr_pA",
        "dcOffset": "0.0", 
        "filter": "*.bin", 
        "start": "0.0",
        "HeaderOffset": 0 
}
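
The interleaved layout and the ColumnTypes list map directly onto a NumPy structured dtype. The sketch below builds three rows of synthetic data and reads them back. It is not MOSAIC's reader: parsing the ColumnTypes string with ast.literal_eval and applying the scale before subtracting the offset are both assumptions made for illustration.

```python
import ast

import numpy as np

# Settings values as they appear in the section above.
column_types = "[('curr_pA', 'float64'), ('AD_V', 'int16'), ('index', 'int16')]"
ionic_current_column = "curr_pA"
amplifier_scale = 1.0
amplifier_offset = 0.0

# Parse the ColumnTypes string into a list of (label, type) tuples and
# build a packed structured dtype: one record per row of the file.
dtype = np.dtype(ast.literal_eval(column_types))

# Synthetic interleaved data: a_1, b_1, c_1, a_2, b_2, c_2, a_3, b_3, c_3.
rows = np.array([(10.5, 1, 0), (11.5, 2, 1), (12.5, 3, 2)], dtype=dtype)
raw_bytes = rows.tobytes()

# Reading: interpret the byte stream with the same dtype and keep only
# the ionic current column.
data = np.frombuffer(raw_bytes, dtype=dtype)
current_pA = data[ionic_current_column] * amplifier_scale - amplifier_offset
```

Each record occupies 8 + 2 + 2 = 12 bytes, so a file with N rows holds 12 N bytes of data after any header.
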
Settings Examples:

Read 16-bit signed integers (big endian) with a 512 byte header offset. Set the amplifier scale to 400 pA, sampling rate to 200 kHz.

"binTrajIO": {
        "AmplifierOffset": "0.0", 
        "SamplingFrequency": 200000, 
        "AmplifierScale": "400./2**16", 
        "ColumnTypes": "[('curr_pA', '>i2')]", 
        "dcOffset": 0.0, 
        "filter": "*.dat", 
        "start": 0.0, 
        "HeaderOffset": 512, 
        "IonicCurrentColumn": "curr_pA"
}
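
As a rough check of the settings above, the following sketch decodes big-endian 16-bit samples that follow a 512-byte header and scales them by 400/2^16 pA per count. The file contents are synthetic and the code is an illustration with NumPy, not MOSAIC's reader.

```python
import numpy as np

header_offset = 512                 # HeaderOffset, in bytes
amplifier_scale = 400.0 / 2**16     # AmplifierScale: pA per A/D count
dtype = np.dtype([("curr_pA", ">i2")])

# Synthetic file contents: a 512-byte header followed by three
# big-endian signed 16-bit samples.
counts = np.array([-32768, 0, 16384], dtype=">i2")
file_bytes = bytes(header_offset) + counts.tobytes()

# Skip the header, then decode the remaining bytes and scale to pA.
data = np.frombuffer(file_bytes, dtype=dtype, offset=header_offset)
current_pA = data["curr_pA"] * amplifier_scale
```

With this scale the full 16-bit range spans 400 pA, from -200 pA at the most negative count to just under +200 pA.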

Read a two-column file: 64-bit floating point and 64-bit integers, and no header offset. Set the amplifier scale to 1 and sampling rate to 200 kHz.

"binTrajIO": {
        "AmplifierOffset": "0.0", 
        "SamplingFrequency": 200000, 
        "AmplifierScale": "1.0", 
        "ColumnTypes" : "[('curr_pA', 'float64'), ('AD_V', 'int64')]",
        "dcOffset": 0.0, 
        "filter": "*.bin", 
        "start": 0.0, 
        "HeaderOffset": 0, 
        "IonicCurrentColumn": "curr_pA"
}
Parameters:
In addition to metaTrajIO args,
  • AmplifierScale : Full scale of amplifier (pA/2^nbits) that varies with the gain (default: 1.0).

  • AmplifierOffset : Current offset in the recorded data in pA (default: 0.0).

  • SamplingFrequency : Sampling rate of data in the file in Hz.

  • HeaderOffset : Ignore first n bytes of the file for header (default: 0 bytes).

  • ColumnTypes : A list of tuples with column names and types (see Numpy types). Note only integer and floating point numbers are supported.

  • IonicCurrentColumn : Column name that holds ionic current data.

Returns:

None

Errors:

None

6.2.5. Chimera Files (chimeraTrajIO)

class mosaic.trajio.chimeraTrajIO.chimeraTrajIO(**kwargs)[source]

Read a file generated by the Chimera VC100. The current in pA is returned after scaling by the amplifier scale factors.

Usage and Assumptions:

Binary data is in a single column of unsigned 16-bit integers. The column layout is specified with the ColumnTypes parameter, which accepts a list of tuples:

[('curr_pA', '<u2')]

The first element of each tuple is an arbitrary text label and the second element is a valid Numpy type. This option is retained in case of future changes to the platform, but can be left unchanged in the settings file for now.

Chimera gain settings are used to convert the integers stored by the ADC to current values. These values are automatically read in from matched MAT files generated by the Chimera software.

"chimeraTrajIO": {
        "filter": "*.log", 
        "start": "0.0",
        "HeaderOffset": "0"
}
Parameters:

In addition to metaTrajIO args,

  • HeaderOffset : Ignore first n bytes of the file for header (currently fixed at: 0 bytes).

Returns:

None

Errors:

None

6.2.6. TSV Files (tsvTrajIO)

class mosaic.trajio.tsvTrajIO.tsvTrajIO(**kwargs)[source]

Read tab-separated value (TSV) files.

Parameters:
In addition to metaTrajIO args,
  • headers : If True, the first row is ignored (default: True)

  • separator : set the data separator (default: '\t')

  • scale : set the data scale (default: 1). For example, to convert from A to pA, set scale=1e12.

Either:
  • Fs : Sampling frequency in Hz. If set, all other options are ignored and the first column in the file is assumed to be the current in pA.

Or:
  • nCols : number of columns in TSV file (default:2, first column is time in ms and second is current in pA)

  • timeCol : explicitly set the time column (default: 0, first col)

  • currCol : explicitly set the position of the current column (default: 1)

If neither Fs nor {nCols, timeCol, currCol} are set then the latter is assumed with the listed default values.
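
The two ways of describing a TSV file can be sketched with the standard csv module. The function read_tsv below is a hypothetical illustration, not the tsvTrajIO implementation. It assumes that scale is applied in both modes and that the sampling frequency is inferred from the first two entries of the time column when Fs is not given.

```python
import csv
import io


def read_tsv(text, headers=True, separator="\t", scale=1.0,
             Fs=None, timeCol=0, currCol=1):
    """Return (sampling frequency in Hz, list of scaled current values)."""
    rows = list(csv.reader(io.StringIO(text), delimiter=separator))
    if headers:
        rows = rows[1:]                 # ignore the first row
    rows = [r for r in rows if r]       # drop blank lines
    if Fs is not None:
        # Fs given: the first column is taken to be the current.
        return float(Fs), [float(r[0]) * scale for r in rows]
    # Otherwise infer Fs from the time column, which is in ms.
    times_ms = [float(r[timeCol]) for r in rows]
    dt_ms = times_ms[1] - times_ms[0]
    return 1000.0 / dt_ms, [float(r[currCol]) * scale for r in rows]


# Two-column file with a header row: time in ms, current in pA.
sample = "t_ms\ti_pA\n0.0\t100.0\n0.5\t99.5\n1.0\t100.5\n"
fs_hz, current = read_tsv(sample)

# Single-column file without a header, sampled at 50 kHz.
fs_fixed, current_fixed = read_tsv("100.0\n99.0\n", headers=False, Fs=50000)
```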

6.3. Optimizing Settings

MOSAIC classes are controlled through the JSON settings files as defined above. In most cases, running MOSAIC through the GUI (see MOSAIC GUI) should generate satisfactory results. However, settings can be further optimized either by editing a file named .settings stored within the data directory, or by clicking on the Advanced Settings check-box in the Panel A: Analysis Setup section of the GUI.
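
Since the settings file is plain JSON, it can also be edited programmatically. The sketch below uses a temporary directory as a stand-in for the data directory and changes a single value. The starting contents are a small excerpt chosen for illustration, and the new threshold value is arbitrary.

```python
import json
import os
import tempfile

# A temporary directory stands in for the data directory.
data_dir = tempfile.mkdtemp()
settings_path = os.path.join(data_dir, ".settings")

# Start from a small settings file (an excerpt, for illustration only).
with open(settings_path, "w") as f:
    json.dump({"eventSegment": {"eventThreshold": "6.0", "eventPad": "50"}},
              f, indent=4)

# Load the file, change one value, and write it back.
with open(settings_path) as f:
    settings = json.load(f)
settings["eventSegment"]["eventThreshold"] = "4.5"
with open(settings_path, "w") as f:
    json.dump(settings, f, indent=4)

# Read the file again to confirm the change was saved.
with open(settings_path) as f:
    saved = json.load(f)
```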

6.3.1. Initial Event Detection (eventSegment)

The first step when analyzing an ionic-current time series is to perform a quick partition to identify events. This is accomplished by sub-classing the eventPartition class. Currently, the only implementation of event partitioning is the eventSegment algorithm. This algorithm uses a thresholding technique to detect the start and end of an event. When an event is detected, the ionic current time-series associated with that event is passed to a processing algorithm for fitting. Settings that can be passed to eventSegment are given below, followed by their descriptions.

"eventSegment" : {
        "blockSizeSec"                  : "0.5",
        "eventPad"                      : "50",
        "minEventLength"                : "5",
        "eventThreshold"                : "6.0",
        "driftThreshold"                : "999.0",
        "maxDriftRate"                  : "999.0",
        "meanOpenCurr"                  : "-1",
        "sdOpenCurr"                    : "-1",
        "slopeOpenCurr"                 : "-1",
        "writeEventTS"                  : "1",
        "parallelProc"                  : "0",
        "reserveNCPU"                   : "2"
}

Setting           Description

blockSizeSec      Time-series length (in sec) for block operations.
eventPad          Pad an event with the specified number of points.
minEventLength    Discard events with fewer than the specified points.
eventThreshold    Event detection threshold.
meanOpenCurr      Set the mean open channel current (i0) in pA. -1 computes i0 automatically.
sdOpenCurr        Set the open channel std. dev. in pA. -1 computes SD automatically.
slopeOpenCurr     Set the open channel drift in pA/ms. -1 automatically computes the slope.
driftThreshold    Aborts the analysis when the open channel drift exceeds the specified value.
maxDriftRate      Aborts the analysis when the open channel slope exceeds the specified value (pA/ms).
writeEventTS      Write the event time-series to the output database.
parallelProc      Enable parallel processing.
reserveNCPU       Use N-reserveNCPU for parallel processing.
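
To make the roles of eventThreshold, minEventLength and eventPad concrete, here is a simplified thresholding sketch. It is not the eventSegment implementation: find_events is a hypothetical function, and treating eventThreshold as a multiple of the open channel standard deviation is an assumption made for this illustration.

```python
def find_events(current, mean_open, sd_open, event_threshold=6.0,
                min_event_length=5, event_pad=50):
    """Return (start, end) index pairs of padded events in `current`."""
    cutoff = event_threshold * sd_open
    events, start = [], None
    for i, value in enumerate(current):
        blocked = abs(mean_open - value) > cutoff
        if blocked and start is None:
            start = i                       # event begins
        elif not blocked and start is not None:
            # Event ends: keep it only if it is long enough, then pad it.
            if i - start >= min_event_length:
                events.append((max(0, start - event_pad),
                               min(len(current), i + event_pad)))
            start = None
    # An event still open at the end of the trace is not reported.
    return events


# Synthetic trace: open channel at 100 pA with one 10-point blockade at 40 pA.
trace = [100.0] * 100 + [40.0] * 10 + [100.0] * 100
events = find_events(trace, mean_open=100.0, sd_open=2.0)

# A 3-point blockade is shorter than min_event_length and is discarded.
short_trace = [100.0] * 100 + [40.0] * 3 + [100.0] * 100
short_events = find_events(short_trace, mean_open=100.0, sd_open=2.0)
```

In this sketch the blockade occupies indices 100 to 109, and the padded event returned to the caller covers 50 points on either side.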

6.3.2. Two-State Identification (adept2State)

Once the time-series is partitioned, individual events are processed by a processing algorithm. For simple event patterns (e.g. homopolymers of DNA, PEG, etc.), one can use the ADEPT 2-State algorithm. Settings that can be passed to this algorithm are shown below. In the vast majority of cases, these settings can be used without modification.

"adept2State" : {
        "FitTol"                        : "1.e-7",
        "FitIters"                      : "50000"
}

6.3.3. Multi-State Identification (adept)

For more complex signals with multiple states, the ADEPT algorithm yields better results. The settings passed to this algorithm (described below) are largely similar to Two-State Identification (adept2State).

"adept" : {
        "FitTol"                        : "1.e-7",
        "FitIters"                      : "50000",
        "InitThreshold"                 : "3.0"
}

Hint

The parameter InitThreshold is used for preliminary state identification within multi-state events. As a rule of thumb, this value should be set to roughly half that of eventThreshold in the Initial Event Detection (eventSegment) section. However, the final value may be adjusted further for optimal results.
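
The rule of thumb in the hint amounts to a one-line calculation. The snippet below is a sketch that derives a starting value for InitThreshold from eventThreshold in a settings dictionary. The dictionary is an excerpt for illustration, and the result is only a starting point for further tuning.

```python
# Excerpt of a settings dictionary, with values stored as strings.
settings = {
    "eventSegment": {"eventThreshold": "6.0"},
    "adept": {"FitTol": "1.e-7", "FitIters": "50000"},
}

# Rule of thumb: InitThreshold is roughly half of eventThreshold.
event_threshold = float(settings["eventSegment"]["eventThreshold"])
settings["adept"]["InitThreshold"] = str(event_threshold / 2)
```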

6.4. Default Settings

{
        "eventSegment" : {
                "blockSizeSec"                  : "0.5",
                "eventPad"                              : "50",
                "minEventLength"                : "5",
                "eventThreshold"                : "6.0",
                "driftThreshold"                : "999.0",
                "maxDriftRate"                  : "999.0",
                "meanOpenCurr"                  : "-1",
                "sdOpenCurr"                    : "-1",
                "slopeOpenCurr"                 : "-1",
                "writeEventTS"                  : "1",
                "parallelProc"                  : "0",
                "reserveNCPU"                   : "2"
        },
        "singleStepEvent" : {
                "binSize"                               : "1.0",
                "histPad"                               : "10",
                "maxFitIters"                   : "5000",
                "a12Ratio"                              : "1.e4",
                "minEvntTime"                   : "10.e-6",
                "minDataPad"                    : "75"
        },
        "adept2State" : {
                "FitTol"                                : "1.e-7",
                "FitIters"                              : "50000"
        },
        "adept" : {
                "FitTol"                                : "1.e-7",
                "FitIters"                              : "50000",
                "InitThreshold"                         : "3.0"
        },
        "cusumPlus": {
                "StepSize"                              : 3.0,
                "Threshold"                             : 3.0
        },
        "besselLowpassFilter" : {
                "filterOrder"                   : "6",
                "filterCutoff"                  : "10000",
                "decimate"                              : "1"
        },
        "waveletDenoiseFilter" : {
                "wavelet"                               : "sym5",
                "level"                                 : "5",
                "thresholdType"                 : "soft",
                "thresholdSubType"              : "sqtwolog"
        },
        "abfTrajIO" : {
                "filter"                                : "*.abf",
                "start"                                 : 0.0,
                "dcOffset"                              : 0.0
        },
        "qdfTrajIO": {
                "Rfb": 9.1e+12,
                "Cfb": 1.07e-12,
                "dcOffset": 0.0,
                "filter": "*.qdf",
                "start": 0.0
        },
        "binTrajIO": {
                "AmplifierScale": "1.0",
                "AmplifierOffset": "0.0",
                "SamplingFrequency": "50000",
                "HeaderOffset": "0",
                "ColumnTypes": "[('curr_pA', 'float64')]",
                "IonicCurrentColumn" : "curr_pA",
                "dcOffset": "0.0",
                "filter": "*.bin",
                "start": "0.0"
        }
}
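
When only a few values need to differ from the defaults, one option is to overlay them section by section. The helper merge_settings below is a hypothetical convenience for preparing a settings file and is not part of MOSAIC. The defaults shown are an excerpt of the listing above.

```python
import copy


def merge_settings(defaults, overrides):
    """Return a copy of `defaults` with `overrides` applied per section."""
    merged = copy.deepcopy(defaults)     # leave the defaults untouched
    for section, values in overrides.items():
        merged.setdefault(section, {}).update(values)
    return merged


# Excerpt of the default settings above.
defaults = {
    "eventSegment": {"eventThreshold": "6.0", "eventPad": "50"},
    "adept": {"FitTol": "1.e-7", "FitIters": "50000", "InitThreshold": "3.0"},
}

# Change one value and leave everything else at its default.
merged = merge_settings(defaults, {"adept": {"InitThreshold": "2.5"}})
```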