2. Data Processing Algorithms in MOSAIC
MOSAIC provides three primary algorithms to process time-series data from single-molecule nanopore experiments. The fitting-based approaches outlined in the Introduction are implemented in MOSAIC as two separate algorithms: i) StepResponseAnalysis, for events that exhibit a single state, and ii) MultistateAnalysis, for N-state events. In addition, the CUSUM algorithm is available for N-state events.
2.1. ADEPT 2-State
This algorithm restricts the generalized algorithm for state detection to events with a single state, as seen in the figure below. The simplified approach speeds up the analysis considerably and is appropriate for many applications, for example the detection of PEG, small molecules, or DNA homopolymers. The adept2State class uses a simplified form of the expression for the ionic current across a nanopore. This functional form is fit to the time-series of a single event to recover optimal parameters for the model. Settings that control the fit are defined through the settings file and are described in more detail in the Optimizing Settings section.
The figure below shows the results of the fit (the meta-data) superimposed on the time-series of a single two-state event, in this case a PEG event.
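As a rough illustration of the kind of model being fit, the sketch below evaluates a two-step response in which the current relaxes exponentially to the blocked level at the start of the event and back to the open-channel level at its end. The functional form, the parameter names (`i0`, `a`, `mu1`, `mu2`, `tau1`, `tau2`) and their mapping onto the meta-data columns are assumptions made for this example; this is not the adept2State implementation.

```python
import math


def relax(t, mu, tau):
    """Return 0 before the step at mu, then 1 - exp(-(t - mu)/tau)."""
    if t < mu:
        return 0.0
    return 1.0 - math.exp(-(t - mu) / tau)


def two_state_current(t, i0, a, mu1, mu2, tau1, tau2):
    """Assumed two-state step response (times in ms, currents in pA).

    i0: open channel current, a: size of the current step,
    mu1/mu2: event start/end, tau1/tau2: downstroke/upstroke RC constants.
    """
    return i0 - a * relax(t, mu1, tau1) + a * relax(t, mu2, tau2)


def derived_metadata(i0, a, mu1, mu2):
    """Quantities that would follow from the fit parameters (assumed mapping)."""
    blocked = i0 - a
    return {
        "BlockedCurrent": blocked,
        "BlockDepth": blocked / i0,   # BlockedCurrent/OpenChCurrent
        "ResTime": mu2 - mu1,         # EventEnd-EventStart
    }


# Hypothetical parameters for a single event.
params = dict(i0=100.0, a=60.0, mu1=0.1, mu2=0.5, tau1=0.01, tau2=0.01)
before = two_state_current(0.0, **params)
during = two_state_current(0.4, **params)
after = two_state_current(2.0, **params)
meta = derived_metadata(100.0, 60.0, 0.1, 0.5)
```

In a fit, a least-squares routine would adjust these six parameters until the curve matches the measured event; the optimizer is left out here to keep the sketch self-contained.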

2.1.1. Algorithm Settings
2.1.2. Metadata Output
Meta-data for individual events generated by adept2State can be queried using SQLite as described in the Database Structure and Query Syntax section. A list of meta-data stored by the step response algorithm is given below.
Column Name | Column Type | Description |
---|---|---|
recIDX | INTEGER | Record index. |
ProcessingStatus | TEXT | Status of the analysis. |
OpenChCurrent | REAL | Open channel current in pA. |
BlockedCurrent | REAL | Blocked state current in pA. |
EventStart | REAL | Event start in ms. |
EventEnd | REAL | Event end in ms. |
BlockDepth | REAL | BlockedCurrent/OpenChCurrent. |
ResTime | REAL | EventEnd-EventStart in ms. |
RCConstant1 | REAL | Downstroke RC constant in ms. |
RCConstant2 | REAL | Upstroke RC constant in ms. |
AbsEventStart | REAL | Global event start time in ms. |
ReducedChiSquared | REAL | Reduced Chi-squared of fit. |
ProcessTime | REAL | Event processing time in ms. |
TimeSeries | REAL_LIST | (OPTIONAL) Event time-series. |
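A query against these columns might look like the sketch below, which uses Python's built-in `sqlite3` module. The in-memory database, the table name `metadata`, the status strings and the inserted rows are all assumptions made for illustration; consult the Database Structure and Query Syntax section for the actual layout of a MOSAIC database.

```python
import sqlite3

# In-memory stand-in for an output database (hypothetical table and rows).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE metadata ("
    "recIDX INTEGER, ProcessingStatus TEXT, OpenChCurrent REAL, "
    "BlockedCurrent REAL, BlockDepth REAL, ResTime REAL)"
)
conn.executemany(
    "INSERT INTO metadata VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1, "normal", 100.0, 40.0, 0.40, 0.35),
        (2, "normal", 100.0, 25.0, 0.25, 0.80),
        (3, "failed", 100.0, 0.0, 0.00, 0.00),   # made-up status string
        (4, "normal", 100.0, 40.0, 0.40, 0.05),
    ],
)

# Keep successfully processed events whose residence time exceeds 0.1 ms.
results = conn.execute(
    "SELECT recIDX, BlockDepth, ResTime FROM metadata "
    "WHERE ProcessingStatus = ? AND ResTime > ? ORDER BY recIDX",
    ("normal", 0.1),
).fetchall()
conn.close()
```

Filtering on ProcessingStatus first keeps events that were not analyzed successfully out of any downstream histogram of BlockDepth or ResTime.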
2.2. ADEPT
The multistate algorithm implements the general case for identifying states in nanopore data. In the general form of the equation used by this algorithm, N is the number of states. This functional form is fit to the time-series of a single event to recover optimal parameters for the model.
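One way to picture an N-state model is as a sum of N exponentially relaxing current steps added to the open channel current. The sketch below evaluates such a sum. The functional form, the use of one RC constant per step and all parameter names are assumptions for illustration and may differ from what adept actually fits.

```python
import math


def multistate_current(t, i0, steps, delays, taus):
    """Assumed N-step response (times in ms, currents in pA).

    i0: open channel current
    steps: signed current change of each step
    delays: time at which each step begins
    taus: RC constant governing each step
    """
    total = i0
    for a_j, mu_j, tau_j in zip(steps, delays, taus):
        if t >= mu_j:
            # Each step switches on at mu_j and relaxes with time constant tau_j.
            total += a_j * (1.0 - math.exp(-(t - mu_j) / tau_j))
    return total


# Hypothetical event: a drop to 40 pA, a partial recovery to 60 pA, then a
# return to the 100 pA open channel level.
steps = [-60.0, 20.0, 40.0]
delays = [0.1, 0.4, 0.8]
taus = [0.01, 0.01, 0.01]

level_1 = multistate_current(0.3, 100.0, steps, delays, taus)
level_2 = multistate_current(0.7, 100.0, steps, delays, taus)
baseline = multistate_current(2.0, 100.0, steps, delays, taus)
```

With only two steps of equal size and opposite sign, this sum reduces to a single blockade that returns to the open channel level, which is the two-state case.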
Settings that control the fit are defined through the settings file and are described in more detail in the Optimizing Settings section. Upon successfully fitting the model to an event, adept generates meta-data that describes the individual states in the event. A representative example of one such event is shown in the figure below.

2.2.1. Algorithm Settings
2.2.2. Metadata Output
The adept algorithm outputs meta-data that characterizes every processed event. As with the ADEPT 2-State algorithm, this information is stored in a SQLite database and is available for further processing (see Database Structure and Query Syntax). The data output by adept differs from that of adept2State in one important way: because the number of states (NStates) detected in each event is not pre-determined, key meta-data (e.g. BlockDepth, EventDelay, etc.) are stored as arrays of real numbers with length equal to NStates.
Column Name | Column Type | Description |
---|---|---|
recIDX | INTEGER | Record index. |
ProcessingStatus | TEXT | Status of the analysis. |
OpenChCurrent | REAL | Open channel current in pA. |
NStates | INTEGER | Number of detected states. |
CurrentStep | REAL_LIST | Blocked current steps in pA. |
BlockDepth | REAL_LIST | BlockedCurrent/OpenChCurrent for each state. |
EventStart | REAL | Event start in ms. |
EventEnd | REAL | Event end in ms. |
EventDelay | REAL_LIST | Start time of each state in ms. |
StateResTime | REAL_LIST | Residence time of each state in ms. |
ResTime | REAL | EventEnd-EventStart in ms. |
RCConstant | REAL_LIST | System RC constant in ms. |
AbsEventStart | REAL | Global event start time in ms. |
ReducedChiSquared | REAL | Reduced Chi-squared of fit. |
ProcessTime | REAL | Event processing time in ms. |
TimeSeries | REAL_LIST | (OPTIONAL) Event time-series. |
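Because several columns hold one value per state, it can be useful to sanity-check a record before analyzing it. The sketch below assumes the REAL_LIST columns have already been decoded into Python lists and that a record is held in a plain dictionary keyed by column name; both the record layout and the `check_event` helper are hypothetical.

```python
def check_event(event, tol=1e-9):
    """Return a list of inconsistencies found in one event record."""
    problems = []
    # Per-state columns should hold exactly NStates entries.
    for name in ("BlockDepth", "EventDelay", "StateResTime"):
        n = len(event[name])
        if n != event["NStates"]:
            problems.append(f"{name} has {n} entries, expected {event['NStates']}")
    # ResTime is documented as EventEnd-EventStart.
    if abs(event["ResTime"] - (event["EventEnd"] - event["EventStart"])) > tol:
        problems.append("ResTime differs from EventEnd - EventStart")
    return problems


# Hypothetical records (times in ms).
good = {
    "NStates": 2, "EventStart": 0.1, "EventEnd": 0.8, "ResTime": 0.7,
    "BlockDepth": [0.4, 0.6], "EventDelay": [0.1, 0.4],
    "StateResTime": [0.3, 0.4],
}
bad = dict(good, NStates=3, ResTime=0.5)

good_report = check_event(good)
bad_report = check_event(bad)
```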
2.3. CUSUM+
The CUSUM algorithm (used by OpenNanopore, for example) is available in MOSAIC. In contrast with the other algorithms available in MOSAIC, this approach does not leverage system information in the analysis. As a result, it estimates single- and multi-level events faster than ADEPT 2-State and ADEPT. You can read about the CUSUM algorithm here.
Some known issues with CUSUM:
If the duration of a sub-event is shorter than five RC constants, the averaging will underestimate the extent of the current change. For longer events, CUSUM should achieve very similar output to the fitting employed elsewhere in MOSAIC.
CUSUM assumes an instantaneous transition between current states. As a result, if the RC rise time of the system is large, CUSUM can trigger and detect intermediate states. This can usually be mitigated by optimizing the algorithm sensitivity settings.
If an event is very long, CUSUM will detect a state transition even if there is no real change, leading to an artificially high number of states. This is a consequence of false positives from using a statistical t-test. In some cases this can be mitigated by reducing the sensitivity.
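To make the behavior described above more concrete, the sketch below implements a textbook two-sided CUSUM detector for Gaussian mean shifts and then averages each detected segment. It is not the cusumPlus code: the function names, the parameters (`delta`, `sigma`, `threshold`) and the synthetic trace are assumptions chosen for illustration. Because each level is estimated by averaging, samples taken while the current is still relaxing are included in the mean, which is why short sub-events appear shallower than they are.

```python
def cusum_detect(samples, delta, sigma, threshold):
    """Return the indices at which a mean shift of about +/-delta is flagged."""
    changes = []
    mean, n = samples[0], 1
    g_pos = g_neg = 0.0
    for k in range(1, len(samples)):
        x = samples[k]
        # Log-likelihood ratio increments for an upward and a downward shift.
        s_pos = (delta / sigma**2) * (x - mean - delta / 2.0)
        s_neg = -(delta / sigma**2) * (x - mean + delta / 2.0)
        g_pos = max(0.0, g_pos + s_pos)
        g_neg = max(0.0, g_neg + s_neg)
        if g_pos > threshold or g_neg > threshold:
            # Flag a transition and restart the statistics for the new level.
            changes.append(k)
            mean, n = x, 1
            g_pos = g_neg = 0.0
        else:
            n += 1
            mean += (x - mean) / n
    return changes


def segment_means(samples, changes):
    """Average the samples between consecutive change points."""
    edges = [0] + list(changes) + [len(samples)]
    return [sum(samples[a:b]) / (b - a) for a, b in zip(edges[:-1], edges[1:])]


# Synthetic trace in pA: open channel, one blockade, open channel again,
# with a small deterministic ripple standing in for noise.
levels = [100.0] * 20 + [40.0] * 20 + [100.0] * 20
trace = [v + (1.0 if k % 2 else -1.0) for k, v in enumerate(levels)]

changes = cusum_detect(trace, delta=30.0, sigma=2.0, threshold=10.0)
means = segment_means(trace, changes)
```

Lowering `threshold` or `delta` makes this detector more sensitive, which mirrors the trade-off noted above: higher sensitivity catches smaller steps but also flags slow RC-limited transitions and long quiet stretches as new states.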
Settings that control the algorithm are defined through the settings file, as described in the Optimizing Settings section. Upon successfully analyzing an event, cusumPlus generates meta-data that describes the individual states in the event. A representative example of one such event is shown in the figure below.

2.3.1. Algorithm Settings
2.3.2. Metadata Output
The cusumPlus algorithm outputs meta-data that characterizes every processed event. As with the ADEPT algorithm, this information is stored in a SQLite database and is available for further processing (see Database Structure and Query Syntax).
Column Name | Column Type | Description |
---|---|---|
recIDX | INTEGER | Record index. |
ProcessingStatus | TEXT | Status of the analysis. |
OpenChCurrent | REAL | Open channel current in pA. |
NStates | INTEGER | Number of detected states. |
CurrentStep | REAL_LIST | Blocked current steps in pA. |
BlockDepth | REAL_LIST | BlockedCurrent/OpenChCurrent for each state. |
EventStart | REAL | Event start in ms. |
EventEnd | REAL | Event end in ms. |
EventDelay | REAL_LIST | Start time of each state in ms. |
StateResTime | REAL_LIST | Residence time of each state in ms. |
ResTime | REAL | EventEnd-EventStart in ms. |
AbsEventStart | REAL | Global event start time in ms. |
ProcessTime | REAL | Event processing time in ms. |
TimeSeries | REAL_LIST | (OPTIONAL) Event time-series. |