11. Advanced Analysis¶
We provide packaged functions for advanced analysis such as calculating the capture rate of molecules, determining the residene time, etc as described below.
11.1. Capture Rate¶
Estimate the capture rate of molecules partitioning into a nanopore.
:Created: 12/27/2015
:Author: Arvind Balijepalli <arvind.balijepalli@nist.gov>
:License: See LICENSE.TXT
:ChangeLog:
12/27/15 AB Initial version
import numpy as np
from mosaicscripts.analysis.kinetics import CaptureRate
11.1.1. Wrapper Function to Estimate the Capture Rate¶
The capture rate can be estimated directly by calling the
CaptureRate
function in mosaicscripts.analysis.kinetics
. The
function returns a list with two elements: the capture rate
(s:math:^{-1}), and the standard error of the capture rate
(s:math:^{-1}).
np.round(
CaptureRate(
"../data/eventMD-P28-bin.sqlite",
"select AbsEventStart from metadata where ProcessingStatus='normal' and ResTime > 0.02 order by AbsEventStart ASC"
),
decimals=1
)
array([ 27.9, 0.2])
11.1.2. Capture Rate Details¶
from scipy.optimize import curve_fit
import matplotlib.pyplot as plt
from mosaicscripts.analysis.kinetics import query1Col
import mosaicscripts.plots.mplformat as mplformat
from mosaic.utilities.fit_funcs import singleExponential
mplformat.update_rcParams()
Continue reading to dig deeper into how the capture rate is estimated
within the CaptureRate
function.
The first step is to read in the start times for each event. This is
easily done with a query to the MOSAIC database as shown below. The
start times are stored in the AbsEventStart
column. We limit the
events we use to estimate the capture rate to ones that were
successfully fit (ProcessingStatus
=’normal’) and those whose
residence times (ResTime
) in the pore are longer than 20
\(\mu\)s.
Finally, we sort the AbsEventStart
to ensure the event start times
are in ascending order.
start_times=query1Col(
"../data/eventMD-P28-bin.sqlite",
"select AbsEventStart from metadata where ProcessingStatus='normal' and ResTime > 0.02 order by AbsEventStart ASC"
)
Next, we calculate the arrival times, i.e. the time between the start of
successive events. This is done with the Numpy
diff function. Note
that AbsEventStart
is stored in milliseconds within the database.
Here, we also convert the arrival times to seconds.
arrival_times=np.diff(start_times)/1000.
The partitioning of molecules into the pore is a stochastic process.
There are however a couple properties related to stochastic process that
we will leverage that makes the estimation of the capture rate more
robust. With randomly occuring events that have some mean rate, the
number of events scales linearly with time. Therefore, the distribution
of these events follows a single exponential form. We can easily test
this by calculating the probability density function (PDF) using the
Numpy
histogram function. Note that the density
=True
argument normalizes the histogram resulting in a PDF.
density,bins=np.histogram(arrival_times, bins=100, density=True)
Plot the resulting PDF with Matplotlib
to verify the distribution.
Sure enough on a semilog scale, the resulting distribution appears
linear suggesting an exponential form.
plt.semilogy(
bins[:len(density)], density,
linestyle='None',
marker='o',
markersize=8,
markeredgecolor='blue',
markerfacecolor='None'
)
plt.xlim(0.005,0.3)
plt.ylim(0.006,25)
plt.xticks([0.05,0.15,0.25])
plt.yticks([1e-2,0.1,1,1e1])
plt.axes().set_xlabel("Arrival Times (s)")
plt.axes().set_ylabel("Density (s$^{-1}$)")
plt.show()
Next we fit the PDF to a single exponential function of the form
\(a\ e^{-t/\tau}\), where a is a scaling factor and \(\tau\) is
the mean time of the distribution (with a rate of 1/\(\tau\)).
This is accomplished with the curve_fit
function within Scipy
.
popt, pcov = curve_fit(singleExponential, bins[:len(density)], density, p0=[1, np.mean(arrival_times)])
We then visually check the fit, by superimposing the resulting fit function over the PDF.
plt.semilogy(
bins[:len(density)], density,
linestyle='None',
marker='o',
markersize=8,
markeredgecolor='blue',
markerfacecolor='None'
)
plt.semilogy(
np.arange(0.001,0.4,0.02),
singleExponential(np.arange(0.001,0.4,0.02), *popt),
color='blue'
)
plt.xlim(0.005,0.3)
plt.ylim(0.006,25)
plt.xticks([0.05,0.15,0.25])
plt.yticks([1e-2,0.1,1,1e1])
plt.axes().set_xlabel("Arrival Times (s)")
plt.axes().set_ylabel("Density (s$^{-1}$)")
plt.show()
Finally, we can extract the capture rate (1/\(\tau\)) from the optimal fit parameters.
np.round([1/popt[1]], decimals=1 )
array([ 27.9])