7. Database Structure and Query Syntax

MOSAIC stores the output of an analysis in a SQLite database. Database files are stored in the same directory as the data being processed. Each analysis creates a new database file named eventMD-<date>-<time>.sqlite, where <date> is the date the analysis was performed (e.g. 20140929 for Sep 29, 2014) and <time> is the analysis start time (e.g. 112937 for 11:29:37 AM).

SQLite databases store data in tables similar to spreadsheets, where each table is analogous to a sheet in an Excel spreadsheet. Databased generated by MOSAIC can be inspected using a database viewer, for example the open source DB browser for SQLite. MOSAIC outputs databases with multiple tables as seen from the figure below. Databases output by MOSAIC contain four tables: i) analysisinfo contains general information about the anlysis such as the data path, analysis algorithm etc., ii) analysissettings contains a JSON formatted string with the analysis settings, iii) metadata holds the output of the analysis, and iv) metadata_t lists the data types for each column in metadata. Two tables most relevant to the analysis (metadata and analysissettings) are discussed in detail below.

../_images/sqlstructFig1.png

7.1. Metadata Table

The metadata table contains the primary output of the analysis. MOSAIC processes individual blockade events from a time-series of ionic current. The parameters describing each event (or metadata) are stored in individual rows of the metadata table in the database file. The column names describe the metadata and are unique to the processing algorithm used. For example, the column names for the ADEPT 2-State algorithm are shown below. The column names for ADEPT differ from this list.

{
    ProcessingStatus,
    OpenChCurrent,
    BlockedCurrent,
    EventStart,
    EventEnd,
    BlockDepth,
    ResTime,
    RiseTime,
    AbsEventStart,
    RedChiSq,
    TimeSeries
}

Note that the column names can be used in constructing queries passed to SQLite, and is described in more detail in the Work with SQLite section and the Scripting and Advanced Features section. The first example SQL query below returns the BlockDepth column (ratio of BlockedCurrent to OpenChCurrent). One can imagine assembling more complex queries for example restricting the results to events whose residence time is greater than 0.2 ms as seen from the second example query below.

select BlockDepth from metadata where ProcessingStatus='normal'

select BlockDepth from metadata where ProcessingStatus='normal' and ResTime > 0.2

A typical metadata table for the ADEPT 2-State algorithm is shown below. The ProcessingStatus column is a text field that should read normal if the fit for a particular event was successful. If a failure occurred the corresponding error code (e.g. eInvalidFitParameters) is stored and all other columns (except TimeSeries) are set to -1. If event time-series storage was requested, then the TimeSeries column will store the ionic current data for that entry in binary format.

../_images/sqlstructFig2.png

7.2. Analysis Settings Table

The analysissettings table contains a single text entry that stores the settings file for the analysis. This allows any database opened with the MOSAIC GUI to retrieve settings that correspond to the analysis results in the file. As seen from the figure below, the settings file is in the JSON format as described in the Settings File documentation.

../_images/sqlstructFig3.png

7.3. Work with SQLite

MOSAIC stores the output of an analysis in a SQLite database as described in the Database Structure and Query Syntax section. Interacting with the data through the Structured Query Language (SQL) is a flexible approach to further analyze or plot the output. Here we provide a few detailed examples of the common ways in which the output of MOSAIC can be queried for further processing. While this section is not a comprehensive SQL tutorial, it provides common use cases to allow you to get started.

One way to retrieve data from a SQLite database is to use the select command. In its simplest form, a select query can return the entire contents of a table using the syntax below. The statement below selects all columns (select *) from the table specified by <tablename>.

select * from <tablename>

The power of SQL lies in its ability to restrict results to match specific criteria. This is accomplished with the where clause described next. SQL queries can be very fast event for large databases. It is often desirable to only include events that were successfully fit in a plot or other analysis. All eventprocess-page algorithms implemented in MOSAIC store a ProcessingStatus column in the output database. This enables one to easily query events that were successfully processed. This is easily accomplished with the query below, which returns all columns for events that were successfully processed (ProcessingStatus=normal).

select * from metadata where ProcessingStatus='normal'

It is not always necessary to retrieve every column for events that fit a certain criteria. For example, gui-blockdepth-sec in the GUI displays a histogram of the blockade depths that match a user specified criteria. This is accomplished within the GUI by a query similar to the one shown below. There are two important differences between the query below and previous examples: i) by replacing * with BlockDepth, we only retrieve the BlockDepth column for events that meet the criteria specified after the where clause, and ii) selection criteria specified after where can be compound statements or even nested as seen in the examples below.

select BlockDepth from metadata where ProcessingStatus='normal' and ResTime > 0.2
select BlockDepth from metadata where ProcessingStatus='normal' and ResTime > 0.2
and BlockDepth between 0.1 and 0.5

Multiple columns can be retrieved from a table by providing a comma separated list of column names after the select clause. As in previous cases, only events that meet a specified criteria are returned. The results can be ordered using order. In this example we sort the results in ascending order by the AbsEventStart column.

select BlockDepth, ResTime, AbsEventStart from metadata where ProcessingStatus='normal'
order by AbsEventStart ASC

Finally, SQL allows the number of results returned to be limited using the limit clause. In this example, we limit the query results to the first 500 rows that meet our criteria.

select AbsEventStart from metadata where ProcessingStatus='normal'
order by AbsEventStart ASC limit 500

7.4. Export to CSV

../_images/dbExport.png
../_images/dbExportSQL.png