MATS Summary

In multi-spectrum fitting, a collection of spectra is modeled by the same line-by-line spectroscopic parameters, but each spectrum may differ in pressure, temperature, and sample composition.

The MATS program is based on Spectrum objects, which are defined not only by their wavenumber and absorption data, but also by information on the spectrum pressure, temperature, baseline characteristics, and sample composition. In addition to utilizing real spectra, MATS has a simulate_spectrum() function, which returns a Spectrum object calculated from input simulation parameters. This makes it possible to perform error analysis in the same framework as the primary data analysis by simply switching from experimental to simulated Spectrum objects.

These objects are combined to form a Dataset object, which is the collection of spectra that are being analyzed together in the multi-spectrum analysis.

The analysis of spectra in MATS requires an initial spectroscopic line list. Details about the format of this line list and how to generate it using the HITRAN Application Programming Interface are outlined on the Generating Parameter Line lists page. A few example line lists are provided in the Line list folder, which can be accessed using the LoadLineListData() helper function. These line lists are provided for use with the examples and to illustrate line list formatting; they should not be used as reference data.

There are two files that contain parameters that are fit in this model, one for spectrum dependent parameters (polynomial baseline parameters, etalons, sample composition, and x-shift term) and the other for line-by-line spectroscopic parameters that are common across all spectra. These files are saved as .csv files with a column for each parameter and with rows corresponding to either the spectrum number or spectral line number. In addition to the columns containing the values for the fit parameters, there are two additional columns for each fittable parameter called param_vary and param_err. The param_vary column is a boolean (True/False) flag that is toggled to indicate whether a given parameter will be varied in the fit. The param_err column will be set to zero initially and replaced with the standard error for the parameter determined by the fit results. Calls of the Generate_FitParam_File class not only make these input files, but also set the line shape and define whether a parameter should be varied in the fit and if a parameter should be constrained across all spectra or allowed to vary by spectrum.
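
As a minimal sketch of the file layout described above (this is not MATS output), the snippet below writes one row of a hypothetical line parameter table using only the Python standard library. The column names gamma0_air, gamma0_air_vary, and gamma0_air_err are illustrative assumptions that follow the param, param_vary, param_err pattern; the numerical value is not reference data.

```python
import csv
import io

# Hypothetical row of a line-by-line parameter file: each fit parameter
# has a value column plus a *_vary flag column and a *_err column.
rows = [
    {
        "line": 0,                # spectral line number (illustrative)
        "gamma0_air": 0.07,       # illustrative value, not reference data
        "gamma0_air_vary": True,  # True: the parameter is varied in the fit
        "gamma0_air_err": 0.0,    # zero until a fit reports a standard error
    }
]

# Write the table as CSV text (an in-memory buffer stands in for a file).
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()))
writer.writeheader()
writer.writerows(rows)
csv_text = buffer.getvalue()
print(csv_text)
```

Toggling the gamma0_air_vary entry to False would correspond to holding that parameter fixed during the fit.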

Finally, the Fit_DataSet class fits the spectra. Additionally, it allows the user to impose constraints on the parameters (min and max values), impose convergence criteria, update background and parameter line lists, and plot fit results.

Below is the sparse documentation for each of the classes and main functions in the MATS project with links to the full documentation provided.

Spectrum Class and Objects

Spectrum(filename[, molefraction, ...])

Spectrum class provides all information describing experimental or simulated spectrum.

simulate_spectrum(parameter_linelist[, ...])

Generates a synthetic spectrum, where the output is a spectrum object that can be used in MATS classes.

Spectrum.calculate_QF()

Calculates the quality of fit factor (QF) for a spectrum - QF = (maximum alpha - minimum alpha) / std(residuals).
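
The QF definition above can be sketched in a few lines. This is an illustration of the formula only, not the MATS implementation; in particular, the use of the population standard deviation is an assumption, and the input values are made up.

```python
import statistics

def quality_of_fit(alpha, residuals):
    """QF = (maximum alpha - minimum alpha) / std(residuals)."""
    # Assumption: population standard deviation (no degrees-of-freedom correction).
    spread = max(alpha) - min(alpha)
    return spread / statistics.pstdev(residuals)

# Illustrative absorption (ppm/cm) and fit residuals (ppm/cm).
alpha = [0.0, 50.0, 100.0, 50.0]
residuals = [0.1, -0.1, 0.1, -0.1]
qf = quality_of_fit(alpha, residuals)
print(qf)
```

A larger QF indicates that the residuals are small relative to the dynamic range of the absorption signal.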

Spectrum.fft_spectrum()

Takes the FFT of the residuals of the spectrum, generates a plot of frequency (cm-1) versus amplitude (ppm/cm), and prints a dataframe with the 20 highest amplitude frequencies with the FFT frequency (period), amplitude, FFT phase, and frequency (cm-1).

Spectrum.plot_freq_tau()

Generates a plot of tau (us) as a function of frequency (MHz).

Spectrum.plot_model_residuals()

Generates a plot of alpha and the model (ppm/cm) as a function of wavenumber (cm-1), with a lower panel showing the residuals (ppm/cm) as a function of wavenumber (cm-1).

Spectrum.plot_wave_alpha()

Generates a plot of alpha (ppm/cm) as a function of wavenumber (cm-1).

Spectrum.save_spectrum_info([save_file])

Saves spectrum information to a pandas dataframe with the option to also save it as a csv file.

Spectrum.segment_wave_alpha()

Defines the wavenumber, alpha, and indices of spectrum that correspond to a given spectrum segment.

Line-by-line Model

The line-by-line model is based on the HTP code provided in the HITRAN Application Programming Interface (HAPI). For the most part, the conventions and definitions used by HITRAN are used in the MATS program. However, for some of the advanced line profile parameters, the naming convention and temperature dependence are different. In the sections below, the temperature and pressure dependence of the various parameters is outlined for clarity. Additionally, MATS uses the CODATA values for calculations.

The Hartmann-Tran profile (HTP) has limiting cases that correspond to several commonly used line profiles. These limiting cases are achieved by setting line shape parameters equal to 0. The list below indicates the parameters that are not fixed to zero in each of the HTP limiting-case line shapes. For more information about the HTP, see the following references: Recommended isolated-line profile for representing high-resolution spectroscopic transitions (IUPAC Technical Report) and An isolated line-shape model to go beyond the Voigt profile in spectroscopic databases and radiative transfer codes.

Voigt Profile (VP): \(\Gamma_{D}, \Gamma_{0}, \Delta_{0}\)

Nelkin-Ghatak Profile (NGP): \(\Gamma_{D}, \Gamma_{0}, \Delta_{0}, \nu_{VC}\)

speed-dependent Voigt Profile (SDVP): \(\Gamma_{D}, \Gamma_{0}, \Delta_{0}, \Gamma_{2}, \Delta_{2}\)

speed-dependent Nelkin-Ghatak Profile (SDNGP): \(\Gamma_{D}, \Gamma_{0}, \Delta_{0}, \nu_{VC}, \Gamma_{2}, \Delta_{2}\)

Hartmann-Tran (HTP): \(\Gamma_{D}, \Gamma_{0}, \Delta_{0}, \nu_{VC}, \Gamma_{2}, \Delta_{2}, \eta\)
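
The list above can be restated as a small lookup table, which makes it easy to see which parameters are fixed to zero for a given limiting case. This is a sketch for illustration only; the string labels such as "Gamma_0" and "nu_VC" are shorthand for the symbols above and are not MATS parameter names.

```python
# Parameters that are NOT fixed to zero in each limiting case of the HTP.
HTP_LIMITING_CASES = {
    "VP": {"Gamma_D", "Gamma_0", "Delta_0"},
    "NGP": {"Gamma_D", "Gamma_0", "Delta_0", "nu_VC"},
    "SDVP": {"Gamma_D", "Gamma_0", "Delta_0", "Gamma_2", "Delta_2"},
    "SDNGP": {"Gamma_D", "Gamma_0", "Delta_0", "nu_VC", "Gamma_2", "Delta_2"},
    "HTP": {"Gamma_D", "Gamma_0", "Delta_0", "nu_VC", "Gamma_2", "Delta_2", "eta"},
}

def parameters_fixed_to_zero(profile):
    """Return the HTP parameters that are set to 0 for the given profile."""
    return HTP_LIMITING_CASES["HTP"] - HTP_LIMITING_CASES[profile]

print(sorted(parameters_fixed_to_zero("SDVP")))
```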

Line Intensity

The line intensity for each line at the experimental temperature is calculated using the EnvironmentDependency_Intensity function in HAPI. This function takes as arguments the line intensity at 296 K (\(S(T_{ref})\)), the experimental temperature (\(T\)), the reference temperature 296 K (\(T_{ref}\)), the partition function at the experimental temperature (\(Q(T)\)), the partition function at the reference temperature (\(Q(T_{ref})\)), the lower state energy (\(E"\)), the line center (\(\nu\)), and the constant (\(c2 = hc/k\)). The partition functions are calculated using TIPS, with the option for the user to select among the three versions available in HAPI: TIPS2011, TIPS2017, and TIPS2021. Constants are defined by CODATA values.

\[S(T) = S(T_{ref}) \frac{Q(T_{ref})}{Q(T)}\frac{e^{-c2E"/T}}{e^{-c2E"/T_{ref}}} \frac{1 - e^{-c2\nu / T}}{1 - e^{-c2\nu / T_{ref}}}\]
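
The temperature scaling above can be written out directly. The sketch below is a plain transcription of the equation, not the HAPI function itself; the function name and argument names are hypothetical, the partition functions are passed in by the caller (in MATS they would come from TIPS), and the numerical value of c2 is the CODATA 2018 second radiation constant expressed in cm K.

```python
import math

C2 = 1.438776877  # hc/k in cm K (CODATA 2018 second radiation constant)

def line_intensity(s_ref, t, q_t, q_ref, e_lower, nu, t_ref=296.0):
    """Scale a line intensity from t_ref to t following the equation above.

    s_ref   : line intensity at t_ref
    q_t     : partition function at t (supplied by the caller)
    q_ref   : partition function at t_ref (supplied by the caller)
    e_lower : lower state energy (cm-1)
    nu      : line center (cm-1)
    """
    # Ratio of lower-state Boltzmann factors.
    boltzmann = math.exp(-C2 * e_lower / t) / math.exp(-C2 * e_lower / t_ref)
    # Ratio of stimulated-emission corrections.
    stimulated = (1 - math.exp(-C2 * nu / t)) / (1 - math.exp(-C2 * nu / t_ref))
    return s_ref * (q_ref / q_t) * boltzmann * stimulated

# At the reference temperature every ratio is 1, so S(T) = S(T_ref).
s_at_ref = line_intensity(1.0e-23, 296.0, 100.0, 100.0, 500.0, 6000.0)
print(s_at_ref)
```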

Doppler Broadening

In MATS, the Doppler broadening (\(\Gamma_{D}\)) is not a floatable parameter and is calculated based on the experimental temperature (\(T\)), line center (\(\nu\)), and molecular mass (\(m\)). Constants are defined by CODATA values. The Doppler width is calculated as:

\[ \begin{align}\begin{aligned}\Gamma_{D} = \sqrt{\frac{2kT \cdot \ln(2)}{cMassMol\cdot mc^{2}}} \cdot \nu\\k = 1.380648813 \times 10^{-16} \text{ erg K}^{-1}\\cMassMol = 1.66053873 \times 10^{-24} \text{ mol}\end{aligned}\end{align} \]
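
As a rough numerical check of the expression above, the sketch below evaluates the Doppler width with the quoted constants. It is an illustration, not MATS code: the function name is hypothetical, the speed of light is given in cm/s to match the CGS value of k, and the molecular mass m is assumed to be in g/mol.

```python
import math

K_BOLTZMANN = 1.380648813e-16  # erg K^-1, as quoted above
C_MASS_MOL = 1.66053873e-24    # as quoted above
C_LIGHT = 2.99792458e10        # speed of light in cm s^-1 (CGS, assumed unit system)

def doppler_width(t, nu, m):
    """Doppler width (cm-1) at temperature t (K), line center nu (cm-1),
    and molecular mass m (assumed to be in g/mol)."""
    ratio = 2 * K_BOLTZMANN * t * math.log(2) / (C_MASS_MOL * m * C_LIGHT**2)
    return math.sqrt(ratio) * nu

# Illustrative inputs: a molecule of mass 44 g/mol near 6000 cm-1 at 296 K.
gamma_d = doppler_width(296.0, 6000.0, 44.0)
print(gamma_d)
```

Because the width scales as the square root of temperature and linearly with line center, quadrupling the temperature doubles the result.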

Collisional Half-Width

The collisional half-width (\(\Gamma_{0}\)) is a function of both the experimental pressure (\(P\)) and temperature (\(T\)) referenced to \(P_{ref}\) (1 atm) and \(T_{ref}\) (296 K). The contributions from each diluent (\(k\)) can be scaled by the diluent composition fraction (\(abun\)) and summed to model the ensemble collisional broadening. The temperature dependence is modeled as a power law, where \(n\) is the temperature exponent for the collisional width. The collisional half-width for each line at experimental temperature and pressure can be represented as:

\[\Gamma_{0} (P,T) = \sum_{k} abun_{k} \left(\Gamma_{0}^{k} \cdot \frac{P}{P_{ref}} \cdot \left(\frac{T_{ref}}{T}\right)^{n_{\Gamma_{0}^{k}}}\right)\]

In the MATS nomenclature, the collisional half-width is called gamma0_diluent and the temperature dependence of the collisional half-width is called n_gamma0_diluent.
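
The diluent-weighted power law above can be sketched as follows. This is a transcription of the equation rather than MATS code; the function name, the dictionary layout, and the numerical values for the two diluents are illustrative assumptions.

```python
def collisional_half_width(p, t, diluents, p_ref=1.0, t_ref=296.0):
    """Sum the abundance-weighted collisional half-widths over all diluents.

    diluents maps a diluent name to a dict with keys
    'abun' (composition fraction), 'gamma0', and 'n_gamma0'.
    """
    return sum(
        d["abun"] * d["gamma0"] * (p / p_ref) * (t_ref / t) ** d["n_gamma0"]
        for d in diluents.values()
    )

# Illustrative two-diluent sample (values are not reference data).
air = {
    "N2": {"abun": 0.79, "gamma0": 0.08, "n_gamma0": 0.7},
    "O2": {"abun": 0.21, "gamma0": 0.07, "n_gamma0": 0.7},
}
gamma0_ref = collisional_half_width(1.0, 296.0, air)   # reference conditions
gamma0_half = collisional_half_width(0.5, 296.0, air)  # half the pressure
print(gamma0_ref, gamma0_half)
```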

Pressure Shift

Just like the collisional half-width, the pressure shift (\(\Delta_{0}\)) is a function of both the experimental pressure (\(P\)) and temperature (\(T\)) referenced to \(P_{ref}\) (1 atm) and \(T_{ref}\) (296 K). The contributions from each diluent (\(k\)) can be scaled by the diluent composition fraction (\(abun\)) and summed to model the ensemble pressure shift. Unlike the collisional half-width, the pressure shift has a linear temperature dependence, where \(n\) represents the temperature dependence of the pressure shift. The pressure shift for each line at experimental pressure and temperature can be represented as:

\[\Delta_{0} (P,T) = \sum abun_{k} (\Delta_{0}^{k} + n_{\Delta_{0}^{k}}\cdot (T - T_{ref}) )\frac{P}{P_{ref}}\]

In the MATS nomenclature, the pressure shift is called delta0_diluent and the temperature dependence of the pressure shift is called n_delta0_diluent.
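
The linear temperature dependence of the pressure shift can be sketched in the same way. The function name, dictionary keys, and numerical values below are illustrative assumptions, not MATS code or reference data.

```python
def pressure_shift(p, t, diluents, p_ref=1.0, t_ref=296.0):
    """Sum the abundance-weighted pressure shifts over all diluents.

    diluents maps a diluent name to a dict with keys
    'abun' (composition fraction), 'delta0', and 'n_delta0'.
    """
    total = sum(
        d["abun"] * (d["delta0"] + d["n_delta0"] * (t - t_ref))
        for d in diluents.values()
    )
    return total * (p / p_ref)

# Illustrative single-diluent sample (values are not reference data).
sample = {"air": {"abun": 1.0, "delta0": -0.005, "n_delta0": 1.0e-5}}
shift = pressure_shift(0.5, 306.0, sample)
print(shift)
```

Unlike the power law used for the collisional half-width, the linear form allows the shift to change sign with temperature.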

Speed-Dependent Broadening

The speed-dependent mechanism accounts for the speed-dependence of relaxation rates and is parameterized in the speed-dependent Voigt (SDVP), speed-dependent Nelkin-Ghatak (SDNGP), and Hartmann-Tran (HTP) profiles. The speed-dependent broadening in MATS is tabulated as the ratio \(a_{w} = \frac{\Gamma_{2}}{\Gamma_{0}}\), but the actual fitted parameter is \(\Gamma_{2}\). The temperature dependence of the speed-dependent broadening is modeled as a power law in \(\Gamma_{2}\). Currently in HITRAN, it is assumed that \(n_{\Gamma_{0}} = n_{\Gamma_{2}}\), such that \(a_{w}\) is temperature independent. The introduction of \(n_{\Gamma_{2}}\) as a parameter in MATS allows this assumption to be imposed, while retaining the flexibility to explore non-equivalent temperature dependences between the speed-dependent and collisional broadening terms. The contributions from each diluent (\(k\)) can be scaled by the diluent composition fraction (\(abun\)) and summed to model the ensemble speed-dependent broadening.

\[\Gamma_{2} (P,T) = \sum_{k} abun_{k} \left(a_{w}^{k} \cdot \Gamma_{0}^{k} \cdot \frac{P}{P_{ref}} \cdot \left(\frac{T_{ref}}{T}\right)^{n_{\Gamma_{2}^{k}}}\right)\]

In the MATS nomenclature, the ratio of the speed-dependent broadening to the collisional broadening (\(a_{w}\)) is called SD_gamma_diluent and the temperature dependence of the speed-dependent broadening is called n_gamma2_diluent. The difference in the naming structure (SD_gamma vs gamma2) is chosen to emphasize the difference between the speed-dependent width being parameterized as a ratio versus as an absolute value.
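
The sketch below illustrates the point made above about the two temperature exponents: when the exponents for the collisional and speed-dependent broadening are equal, the ratio of the two widths is independent of temperature. The function names, dictionary keys, and values are illustrative assumptions, not MATS code.

```python
def power_law(value, p, t, n, p_ref=1.0, t_ref=296.0):
    """Scale a reference value to pressure p and temperature t with exponent n."""
    return value * (p / p_ref) * (t_ref / t) ** n

def gamma0_total(p, t, diluents):
    """Ensemble collisional half-width."""
    return sum(
        d["abun"] * power_law(d["gamma0"], p, t, d["n_gamma0"])
        for d in diluents.values()
    )

def gamma2_total(p, t, diluents):
    """Ensemble speed-dependent broadening, with a_w tabulated as a ratio."""
    return sum(
        d["abun"] * power_law(d["a_w"] * d["gamma0"], p, t, d["n_gamma2"])
        for d in diluents.values()
    )

# Illustrative single diluent with equal exponents (not reference data).
single = {"air": {"abun": 1.0, "gamma0": 0.08, "n_gamma0": 0.7,
                  "a_w": 0.1, "n_gamma2": 0.7}}
ratio_cold = gamma2_total(0.5, 250.0, single) / gamma0_total(0.5, 250.0, single)
ratio_warm = gamma2_total(0.5, 320.0, single) / gamma0_total(0.5, 320.0, single)
print(ratio_cold, ratio_warm)
```

Setting n_gamma2 to a value different from n_gamma0 makes the two ratios diverge, which is the additional flexibility described above.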

Speed-Dependent Shifting

The speed-dependent mechanism accounts for the speed-dependence of relaxation rates and is parameterized in the speed-dependent Voigt (SDVP), speed-dependent Nelkin-Ghatak (SDNGP), and Hartmann-Tran (HTP) profiles. The speed-dependent shift in MATS is tabulated as the ratio \(a_{s} = \frac{\Delta_{2}}{\Delta_{0}}\), but the actual fitted parameter is \(\Delta_{2}\). The temperature dependence of the speed-dependent shift is modeled with a linear dependence. Currently, the temperature dependence of the speed-dependent shift is not parameterized in HITRAN. The contributions from each diluent (\(k\)) can be scaled by the diluent composition fraction (\(abun\)) and summed to model the ensemble speed-dependent shift.

\[\Delta_{2} (P,T) = \sum_{k} abun_{k} \left(a_{s}^{k} \cdot \Delta_{0}^{k} + n_{\Delta_{2}^{k}}\cdot (T - T_{ref}) \right)\frac{P}{P_{ref}}\]

In MATS nomenclature, the ratio of the speed-dependent shift to the pressure shift (\(a_{s}\)) is called SD_delta_diluent and the temperature dependence of the speed-dependent shift is called n_delta2_diluent. The difference in the naming structure (SD_delta vs delta2) is chosen to emphasize the difference between the speed-dependent shift being parameterized as a ratio versus as an absolute value.
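
A short sketch of the speed-dependent shift follows. It transcribes the equation above, with the ratio a_s multiplying the pressure shift and a linear temperature term added; the function name, dictionary keys, and values are illustrative assumptions, not MATS code.

```python
def speed_dependent_shift(p, t, diluents, p_ref=1.0, t_ref=296.0):
    """Sum the abundance-weighted speed-dependent shifts over all diluents.

    diluents maps a diluent name to a dict with keys
    'abun', 'a_s' (ratio Delta_2 / Delta_0), 'delta0', and 'n_delta2'.
    """
    total = sum(
        d["abun"] * (d["a_s"] * d["delta0"] + d["n_delta2"] * (t - t_ref))
        for d in diluents.values()
    )
    return total * (p / p_ref)

# Illustrative single-diluent sample (values are not reference data).
sample = {"air": {"abun": 1.0, "a_s": 0.1, "delta0": -0.005, "n_delta2": 1.0e-6}}
delta2 = speed_dependent_shift(0.5, 306.0, sample)
print(delta2)
```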

Dicke Narrowing

The Dicke narrowing mechanism models collision-induced velocity changes and is parameterized in the Nelkin-Ghatak (NGP), speed-dependent Nelkin-Ghatak (SDNGP), and Hartmann-Tran (HTP) profiles by the term \(\nu_{VC}\). The temperature dependence is modeled as a power law, where \(n\) represents the temperature dependence of the Dicke narrowing term. If the Dicke narrowing is assumed to behave like the diffusion coefficient, then the temperature dependence theoretically should be 1. The contributions from each diluent (\(k\)) can be scaled by the diluent composition fraction (\(abun\)) and summed to model the ensemble Dicke narrowing.

\[\nu_{VC} (P,T) = \sum_{k} abun_{k} \left(\nu_{VC}^{k} \cdot \frac{P}{P_{ref}} \cdot \left(\frac{T_{ref}}{T}\right)^{n_{\nu_{VC}^{k}}}\right)\]

In MATS nomenclature, the Dicke narrowing is referred to as nuVC_diluent and the temperature exponent is n_nuVC_diluent. This differs from the naming convention in HAPI, which changes based on the origin of the Dicke narrowing term (Galatry profile versus HTP). For simplicity, MATS has adopted a self-consistent naming convention.
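
To make the diffusion-like case concrete, the sketch below evaluates the Dicke narrowing term with a temperature exponent of 1, for which doubling the temperature at fixed pressure halves the term. The function name, dictionary keys, and values are illustrative assumptions, not MATS code.

```python
def dicke_narrowing(p, t, diluents, p_ref=1.0, t_ref=296.0):
    """Sum the abundance-weighted Dicke narrowing terms over all diluents.

    diluents maps a diluent name to a dict with keys
    'abun', 'nuVC', and 'n_nuVC'.
    """
    return sum(
        d["abun"] * d["nuVC"] * (p / p_ref) * (t_ref / t) ** d["n_nuVC"]
        for d in diluents.values()
    )

# Illustrative single diluent with the diffusion-like exponent n = 1.
sample = {"air": {"abun": 1.0, "nuVC": 0.02, "n_nuVC": 1.0}}
nu_vc_ref = dicke_narrowing(1.0, 296.0, sample)
nu_vc_hot = dicke_narrowing(1.0, 592.0, sample)  # twice the reference temperature
print(nu_vc_ref, nu_vc_hot)
```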

Correlation Parameter

The correlation parameter (\(\eta\)) models the correlation between velocity and rotation state changes due to collisions and is only parameterized in the Hartmann-Tran profile (HTP). Currently, MATS has no temperature or pressure dependence associated with the correlation parameter. However, contributions from each diluent (\(k\)) can be scaled by the diluent composition fraction (\(abun\)) and summed to model the ensemble correlation parameter.

\[\eta = \sum_{k} abun_{k} \cdot \eta^{k}\]

In MATS nomenclature, the correlation parameter is referred to as eta_diluent.

Line Mixing

The first order Rosenkranz line mixing (\(Y\)) can be calculated from the imaginary portion of any of the HTP derivative line profiles. The temperature dependence is modeled as a power law, where \(n\) represents the temperature dependence of the Rosenkranz line-mixing term. The contributions from each diluent (\(k\)) can be scaled by the diluent composition fraction (\(abun\)) and summed to model the ensemble line mixing.

\[Y(P,T) = \sum_{k} abun_{k} \left(Y^{k} \cdot \frac{P}{P_{ref}} \cdot \left(\frac{T_{ref}}{T}\right)^{n_{Y^{k}}}\right)\]

The line mixing is implemented as:

\[\alpha = I \cdot \left(\mathrm{Re}\{HTP(\Gamma_{D}, \Gamma_{0}, \Delta_{0}, \Gamma_{2}, \Delta_{2}, \nu_{VC}, \eta, \nu)\} + Y \cdot \mathrm{Im}\{HTP(\Gamma_{D}, \Gamma_{0}, \Delta_{0}, \Gamma_{2}, \Delta_{2}, \nu_{VC}, \eta, \nu)\}\right)\]

In MATS nomenclature, the line mixing parameter is referred to as y_diluent and the temperature exponent is n_y_diluent. This differs from the naming convention in HAPI, where the parameter name contains information about the corresponding line shape. MATS does not contain this information in the parameter name.
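
The way the line-mixing term combines the real and imaginary parts of a complex profile can be sketched as below. To keep the example self-contained, a complex Lorentzian is used as a stand-in for the complex HTP; this substitution, the sign convention of its imaginary part, and all function names and values are assumptions for illustration only.

```python
import math

def complex_lorentzian(nu, nu0, gamma0):
    """Stand-in complex profile (NOT the HTP): 1 / (pi * (gamma0 - i*(nu - nu0)))."""
    return 1.0 / (math.pi * complex(gamma0, -(nu - nu0)))

def absorption_with_line_mixing(intensity, profile_value, y):
    """alpha = I * (Re{profile} + Y * Im{profile})."""
    return intensity * (profile_value.real + y * profile_value.imag)

# Evaluate the stand-in profile slightly above line center (illustrative values).
profile = complex_lorentzian(6000.05, 6000.0, 0.05)
no_mixing = absorption_with_line_mixing(1.0, profile, 0.0)
with_mixing = absorption_with_line_mixing(1.0, profile, 0.01)
print(no_mixing, with_mixing)
```

With Y = 0 the expression reduces to the real part of the profile, so the line mixing contribution is an additive, dispersive correction that vanishes at line center for this stand-in profile.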

Line-by-line Models

HTP_from_DF_select(linelist, waves[, ...])

Calculates the absorbance (ppm/cm) based on input line list, wavenumbers, and spectrum environmental parameters.

HTP_wBeta_from_DF_select(linelist, waves[, ...])

Calculates the absorbance (ppm/cm) based on input line list, wavenumbers, and spectrum environmental parameters with capability of incorporating the beta correction to the Dicke Narrowing proposed in Analytical-function correction to the Hartmann–Tran profile for more reliable representation of the Dicke-narrowed molecular spectra.

Dataset Class

Dataset(spectra, dataset_name, param_linelist)

Combines spectrum objects into a Dataset object to enable multi-spectrum fitting.

Dataset.generate_baseline_paramlist()

Generates a csv file called dataset_name + _baseline_paramlist, which will be used to generate another csv file that is used for fitting spectrum dependent parameters with columns for spectrum number, segment number, x_shift, concentration for each molecule in the dataset, baseline terms (a = 0th order term, b = 1st order, etc), and etalon terms (set an amplitude, period, and phase for the number of etalons listed for each spectrum in the Dataset).

Dataset.generate_summary_file([save_file])

Generates a summary file combining spectral information from all spectra in the Dataset.

Dataset.get_spectra_extremes()

Dataset.get_spectrum_extremes()

Gets the minimum and maximum wavenumber for the entire Dataset.

Dataset.average_QF()

Calculates the average QF from all spectra.

Dataset.plot_model_residuals()

Generates a plot showing both the model and experimental data as a function of wavenumber in the main plot with a subplot showing the residuals as function of wavenumber.

Generate FitParam File Class

Generate_FitParam_File(dataset, ...[, ...])

Class generates the parameter files used in fitting and sets up fit settings.

Generate_FitParam_File.generate_fit_baseline_linelist([...])

Generates the baseline line list used in fitting and updates the fitting booleans to desired settings.

Generate_FitParam_File.generate_fit_param_linelist_from_linelist([...])

Generates the parameter line list used in fitting and updates the fitting booleans to desired settings.

Fit DataSet Class

Fit_DataSet(dataset, base_linelist_file, ...)

Provides the fitting functionality for a Dataset.

Fit_DataSet.constrained_baseline(params[, ...])

Imposes baseline constraints when using multiple segments per spectrum, i.e., all baseline parameters can be the same across the entire spectrum except for the etalon phase, which is allowed to vary per segment.

Fit_DataSet.fit_data(params[, wing_cutoff, ...])

Uses the lmfit minimizer to do the fitting through the simulation model function.

Fit_DataSet.generate_params()

Generates the lmfit parameter object that will be used in fitting.

Fit_DataSet.residual_analysis(result[, ...])

Updates the model and residual arrays in each spectrum object with the results of the fit and gives the option of generating the combined absorption and residual plot for each spectrum.

Fit_DataSet.simulation_model(params[, ...])

This is the model used for fitting that includes baseline, resonant absorption, and CIA models.

Fit_DataSet.update_params(result[, ...])

Updates the baseline and line parameter files based on fit results with the option to write over the file (default) or save as a new file and updates baseline values in the spectrum objects.

Support Modules

etalon(x, amp, period, phase)

Etalon definition

molecularMass(M, I[, isotope_list])

Molecular mass look-up based on the HAPI definition, adapted to allow the user to specify the isotope list. Input parameters: M (HITRAN molecule number) and I (HITRAN isotopologue number). Output parameter: MolMass (molecular mass). Returns the molecular mass of the HITRAN isotopologue.

isotope_list_molecules_isotopes([isotope_list])

Given the HITRAN-style isotope list in the format (M,I), this function creates a dictionary with M as the keys and lists of I as values.
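
The grouping described above can be sketched as follows. This is not the MATS implementation; the function name and the small isotope list are hypothetical, and only the (M, I) keys of the list are used.

```python
def group_isotopes_by_molecule(isotope_list):
    """Build {M: [I, ...]} from an iterable of (M, I) pairs.

    Iterating over a dict keyed by (M, I) tuples yields those keys,
    so a HITRAN-style isotope dictionary can be passed directly.
    """
    grouped = {}
    for molecule, isotopologue in isotope_list:
        grouped.setdefault(molecule, []).append(isotopologue)
    return grouped

# Hypothetical isotope list keyed by (M, I); the values are placeholders.
example_isotope_list = {(1, 1): "entry", (1, 2): "entry", (2, 1): "entry"}
molecules = group_isotopes_by_molecule(example_isotope_list)
print(molecules)
```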

add_to_HITRANstyle_isotope_list([...])

Allows the user to add to an existing isotope line list in the HITRAN format.

LoadLineListData([paths])

Helper class to read in supplied LineList DataFrames