masskit.spectra package

Submodules

masskit.spectra.ions module

class masskit.spectra.ions.HiResIons(*args, **kwargs)

Bases: Ions

for containing high mass resolution ions

static cast_intensity(intensity)

cast a single mz value

static cast_mz(mz)

cast a single mz value

change_mass_info(mass_info, inplace=False, take_max=True)

given a new mass info, recalculate tolerance bins

Parameters:
  • mass_info – the MassInfo structure to change to

  • inplace – if true, change in place, otherwise return copy

  • take_max – for each bin take the maximum intensity ion, otherwise sum all ions mapping to the bin

create_tolerance()

create start and stop arrays

intersect(comparison_ions, tiebreaker=None)

find the intersections between two high resolution ion series. calls standalone function to allow use of numba

Parameters:
  • comparison_ions – the ion series to compare to

  • tiebreaker – how to deal with one to multiple matches to peaks in spectra1. mz is closest mz value, intensity is closest intensity, None is report multiple matches

Returns:

matched peak indexes in self, matched peak indexes in comparison_ions

property starts
Returns:

the start positions for each peak bin

property stops
Returns:

the stop positions for each peak bin

property tolerance
Returns:

the mass tolerance for each peak bin

class masskit.spectra.ions.Ions(mz=None, intensity=None, stddev=None, annotations=None, mass_info: MassInfo = None, jitter=0, copy_arrays=True, tolerance=None)

Bases: ABC

base class for a series of ions

property annotations

per peak annotations

static cast_intensity(intensity)

cast a single mz value

static cast_mz(mz)

cast a single mz value

clear_and_intersect(ion2, index1, index2, tiebreaker=None)

_ if indices are not provided, clear the ions of both self and ion2 of zero intensity peaks then intersect

Parameters:
  • ion2 – comparison ions

  • index1 – intersection indices for self (can be None)

  • index2 – intersection indices for ion2 (can be None)

  • tiebreaker – how to deal with one to multiple matches to peaks in spectra1. mz is closest mz value, intensity is closest intensity, None is report multiple matches

Returns:

_description_

clear_computed_properties()

clear properties that are lazy computed

Returns:

returns self

copy(min_mz=-1, max_mz=0, min_intensity=-1, max_intensity=0)

create filtered version of self. This is essentially a copy constructor

Parameters:
  • min_mz – minimum mz value

  • max_mz – maximum mz value. 0 = ignore

  • min_intensity – minimum intensity value

  • max_intensity – maximum intensity value. 0 = ignore

Returns:

copy

copy_annot(ion2, index1, index2)

copy annotations from ion2 to this set of ions using the matched ion indices

Parameters:
  • ion2 – the ions to compare against

  • index1 – matched ions in this set of ions

  • index2 – matched ions in ions2

cosine_score(ion2, index1=None, index2=None, mz_power=0.0, intensity_power=0.5, scale=999, skip_denom=False, tiebreaker=None)

calculate the cosine score between this set of ions and ions2

Parameters:
  • ion2 – the ions to compare against

  • index1 – matched ions in this set of ions

  • index2 – matched ions in ions2

  • mz_power – what power to raise the mz value for each peak

  • intensity_power – what power to raise the intensity for each peak

  • scale – what value to scale the score by

  • skip_denom – skip computing the denominator

  • tiebreaker – how to deal with one to multiple matches to peaks in self. mz is closest mz value, intensity is closest intensity, None is no tiebreaking

Returns:

cosine score

evenly_space(tolerance=None, take_max=True, max_mz=None, include_zeros=False, take_sqrt=False)

convert ions to product ions with evenly spaced m/z bins. The m/z bins are centered on multiples of tolerance * 2. Multiple ions that map to the same bin are either summed or the max taken of the ion intensities.

Parameters:
  • tolerance – the mass tolerance of the evenly spaced m/z bins (bin width is twice this value) in daltons

  • take_max – for each bin take the maximum intensity ion, otherwise sum all ions mapping to the bin

  • max_mz – maximum mz value, 2000 by default

  • include_zeros – fill out array including bins with zero intensity

  • take_sqrt – take the sqrt of the intensities

filter(min_mz=-1, max_mz=0, min_intensity=-1, max_intensity=0, inplace=False)

filter ions by mz and/or intensity.

Parameters:
  • min_mz – minimum mz value, exclusive

  • max_mz – maximum mz value, inclusive. 0 = ignore

  • min_intensity – minimum intensity value, exclusive

  • max_intensity – maximum intensity value, inclusive. 0 = ignore

  • inplace – do operation on current ions, otherwise create copy

Returns:

filtered copy if not inplace, otherwise current ions

half_tolerance(mz)

calculate 1/2 of the tolerance interval

Parameters:

mz – mz value

Returns:

1/2 tolerance interval

property intensity
Returns:

the intensity of the precursor. could be a numpy array

abstract intersect(comparison_ions, tiebreaker=None)
ions2array(array, channel, bin_size=1.0, precursor=0, intensity_norm=1.0, insert_mz=False, mz_norm=2000.0, rand_intensity=0.0, down_shift=0.0, channel_first=True, take_max=True, stddev_channel=None, take_sqrt=False)

fill out an array of fixed size with the ions. note that this func assumes spectra sorted by mz

Parameters:
  • array – the array to fill out

  • channel – which channel to fill out in the array

  • bin_size – the size of each bin in the array

  • precursor – if nonzero, use this value to invert the spectra by subtracting mz from this value

  • intensity_norm – value to norm the intensity

  • insert_mz – instead of putting the normalized intensity in the array, put in the normalized mz

  • mz_norm – the value to use to norm the mz values inserted

  • rand_intensity – if not 0, multiply each intensity by random value 1 +/- rand_intensity

  • down_shift – shift mz down by this value in Da

  • channel_first – channel before spectrum in input array (pytorch style). tensorflow is channel last.

  • take_max – take the maximum intensity in a bin rather the sum of peaks in a bin

  • stddev_channel – which channel contains the std dev. None means no std dev

  • take_sqrt – take the square root of the intensity

property join

returns a dictionary with information on how to join this ion set to another ion set, such as for annotation

mask(indices, inplace=False)

mask out ions that are pointed to by the indices

Parameters:
  • indices – indices of ions to screen out or numpy boolean mask

  • inplace – do operation on current ions

Returns:

masked copy if not inplace, otherwise current ions

static mask_ions(mask, return_ions)

mask out a set of ions

Parameters:
  • mask – boolean mask

  • return_ions – ions to be masked

Returns:

masked ions

property mass_type
merge(merge_ions, inplace=False)

merge another set of ions into this one.

Parameters:
  • merge_ions – the ions to add in

  • inplace – do operation on current ions

Returns:

merged copy if not inplace, otherwise current ions

property mz
Returns:

the mz of the precursor. could be a numpy array. optionally adds a jitter to the m/z values

property neutral_loss
property neutral_loss_charge
norm(max_intensity_in=999, keep_type=True, inplace=False, ord=None)

norm the intensities

Parameters:
  • max_intensity_in – the intensity of the most intense peak

  • keep_type – keep the type of the intensity array

  • inplace – do operation on current ions, otherwise create copy

  • ord – if set, normalize using norm order as in np.linalg.norm. 2 = l2

Returns:

normed copy if not inplace, otherwise current ions

num_ions()

number of ions in spectrum :return: number of ions in spectrum

parent_filter(h2o=True, inplace=False, precursor_mz=0.0, charge=None)

filter parent ions, including water losses.

Parameters:
  • h2o – filter out water losses

  • inplace – do operation on current ions, otherwise create copy

  • precursor_mz – precursor m/z

  • charge – charge of precursor

Returns:

filtered copy if not inplace, otherwise current ions

property rank

return ranks of peaks by intensity 1=most intense, rank is integer over the size of the intensity matrix

Returns:

the rank of the ions by intensity. could be a numpy array.

rank_ions()

rank the ions. intensity rank, 1=most intense, rank is integer over the size of the intensity matrix

shift_mz(shift, inplace=False)

shift the mz values of all ions by the value of shift. Negative ions are masked out

Parameters:
  • shift – value to shift mz

  • inplace – do operation on current ions

Returns:

masked copy if not inplace, otherwise current ions

property stddev
Returns:

the std dev of the intensity per peak

property tolerance
property tolerance_type
total_intensity()

total intensity of ions :return: total intensity

windowed_filter(mz_window=7, num_ions=5, inplace=False)

filter ions by examining peaks in order of intensity and filtering peaks within a window

Parameters:
  • mz_window – half size of mz_window for filtering. 0 = no filtering

  • num_ions – number of ions allowed in full mz_window

  • inplace – do operation on current ions, otherwise create copy

Returns:

filtered copy if not inplace, otherwise current ions

class masskit.spectra.ions.IonsIterator(ions)

Bases: object

iterator over ion mz and intensity

class masskit.spectra.ions.MassInfo(tolerance: float = None, tolerance_type: str = None, mass_type: str = None, neutral_loss: str = None, neutral_loss_charge: int = None, evenly_spaced=False, arrow_struct_accessor=None, arrow_struct_scalar=None)

Bases: object

information about mass measurements of an ion peak

masskit.spectra.ions.cosine_score_calc(spectrum1_mz, spectrum1_intensity, spectrum2_mz, spectrum2_intensity, index1, index2, mz_power=0.0, intensity_power=0.5, scale=999, skip_denom=False)

the Stein and Scott 94 cosine score. By convention, sqrt of score is taken and multiplied by 999. separated out from class and placed here so that can be jit compiled by numba.

Parameters:
  • spectrum1_mz – query spectrum mz

  • spectrum1_intensity – query spectrum intensity

  • spectrum2_mz – the comparison spectrum2 mz

  • spectrum2_intensity – the comparison spectrum2 intensity

  • index1 – matched ions in spectrum1. may include duplicate matches

  • index2 – matched ions in spectrum2. may include duplicate matches

  • mz_power – what power to raise the mz value for each peak

  • intensity_power – what power to raise the intensity for each peak

  • scale – what value to scale the score by

  • skip_denom – skip computing the denominator

Returns:

the cosine score

masskit.spectra.ions.dedup_matches(products1, products2, index1, index2, tiebreaker='mz', skip_nomatch=True)

given a series of indices to matched peaks in two product ion sets, get rid of duplicate matches to peaks in the first product ion set, using a tiebreaker.

Parameters:
  • products1 – first set of product ions

  • products2 – second set of product ions

  • index1 – indices into first set of product ions

  • index2 – indices into the second set of product ions

  • tiebreaker – tiebreak by ‘intensity’ or ‘mz’ of duplicate matches. ‘delete’ means don’t match either. defaults to ‘mz’

  • skip_nomatch – in the return values, skip missing matches to the first set of produc tions, defaults to True

Returns:

matches of the first product ion set, matches of the second product ion set

masskit.spectra.ions.intersect_hires(ions1_starts, ions1_stops, ions2_starts, ions2_stops)

find the intersections between two high resolution ion series

Parameters:
  • ions1_starts – start positions of the first ion series to compare

  • ions1_stops – stop positions of the first ion series to compare

  • ions2_starts – start positions of the second ion series to compare

  • ions2_stops – stop positions of the second ion series to compare

Returns:

matched peak indexes in ions1, matched peak indexes in ion2

masskit.spectra.ions.my_intersect1d(ar1, ar2)

simplified version of numpy intersect1d. Pull outside of class so it can be jit compiled by numba (numba has only experimental class support). Note: this function does not work if there are ions in each spectra with identical mz!

Parameters:
  • ar1 – mz values for one spectra

  • ar2 – mz values for another spectra

Returns:

index of matches into array 1, index of matches into array 2

masskit.spectra.ions.nce2ev(nce, precursor_mz, charge)

convert nce to ev. Equation for QE and taken from http://proteomicsnews.blogspot.com/2014/06/normalized-collision-energy-calculation.html

Parameters:
  • nce – normalized collision energy

  • precursor_mz – precursor m/z

  • charge – charge

Returns:

ev

masskit.spectra.ipython module

masskit.spectra.ipython.is_notebook()

check to see if the code is running in a jupyter notebook

Returns:

true if it is

masskit.spectra.join module

class masskit.spectra.join.Join(*args, **kwargs)

Bases: ABC

abstract do_join(tiebreaker='mz')

do the join

Parameters:

tiebreaker – how to deal with one to multiple matches. mz is closest mz value, intensity is closest intensity, None is report multiple matches

Returns:

self

static join_2_spectra(spectra1, spectra2, tiebreaker='mz')

left join of two spectra. iterate through all peaks for spectra1 and return joined spectra2 peaks. results also include unmatched spectra1 (left) peaks

Parameters:
  • spectra1 – first spectra. all peaks included in result

  • spectra2 – second spectra

  • tiebreaker – how to deal with one to multiple matches to peaks in spectra1. mz is closest mz value, intensity is closest intensity, None is report multiple matches

Returns:

list of peak ids from spectrum1, list of peak ids from spectrum2

static join_3_spectra(experimental_spectrum, predicted_spectrum, theoretical_spectrum, tiebreaker='mz')

Join the peaks in a single experimental spectrum to a predicted spectrum and a theoretical spectrum. The join lists returned include all experimental and predicted peaks, but only the theoretical peaks that match the experimental spectra (and not necessarily the predicted spectrum). Note that it is possible to get a join where the theoretical peak matches the experimental peak but not the predicted peak.

Parameters:
  • experimental_spectrum – the experimental spectrum

  • predicted_spectrum – the predicted spectrum

  • theoretical_spectrum – the annotated theoretical spectrum

  • tiebreaker – how to deal with one to multiple matches. mz is closest mz value, intensity is closest intensity, None is report multiple matches

Returns:

3 lists with the peak ids. first is experimental peaks matching the theoretical peaks. Second are the predicted peaks that match the experimental peaks. Third are the theoretical peaks that match the experimental peaks. A value of None indicates no join.

static list2float32(list_in)
static list2float64(list_in)
static list2int16(list_in)
static list2uint16(list_in)
static list2uint32(list_in)
static list2uint64(list_in)
to_pandas()

output the join results as a pandas dataframe

class masskit.spectra.join.PairwiseJoin(exp_lib_map, theo_lib_map, *args, **kwargs)

Bases: Join

Join 2 sets of spectra

do_join(tiebreaker='mz')

do the join

Parameters:

tiebreaker – how to deal with one to multiple matches. mz is closest mz value, intensity is closest intensity, None is report multiple matches

Returns:

self

class masskit.spectra.join.ThreewayJoin(exp_lib_map, pred_lib_map, theo_lib_map, *args, **kwargs)

Bases: Join

Join 3 sets of spectra. The join lists returned include all experimental and predicted peaks, but only the theoretical peaks that match the experimental spectra (and not the predicted spectrum). Note that it is possible to get a join where the theoretical peak matches the experimental peak but not the predicted peak.

do_join(tiebreaker='mz')

do the join

Parameters:

tiebreaker – how to deal with one to multiple matches. mz is closest mz value, intensity is closest intensity, None is report multiple matches

Returns:

self

masskit.spectra.spectrum_plotting module

class masskit.spectra.spectrum_plotting.AnimateSpectrumPlot

Bases: object

class used to create animated gifs of spectrum plots

add_figure(fig, close_fig=True)

use the provided figure to draw an image for one frame of an animation

Parameters:
  • fig – matplotlib figure

  • close_fig – if True, close the figure when done

create_animated_gif(file, fps=5, pause=7)

create the animated gif from the stored figures

Parameters:
  • file – the file to write the animated gif to

  • fps – how many frames per second the animation runs

  • pause – how many time to duplicate the last frame

masskit.spectra.spectrum_plotting.draw_spectrum(spectrum, fig_format, output, figsize=(4, 2))

spectrum thumbnail plotting code called by the spectrum object writes to a stream

Parameters:
  • spectrum – spectrum to plot

  • fig_format – format of the figure e.g. ‘png’

  • output – output stream

  • figsize – the size of the figure in inches

masskit.spectra.spectrum_plotting.error_bar_plot(mz_in, intensity_in, stddev_in, color, linewidth=1)

plot spectra as colored error bars

Parameters:
  • mz_in – mz values in daltons

  • intensity_in – intensity value

  • stddev_in – standard deviations from intensity_in

  • color – color of spectrum

  • alpha – alpha blending

Returns:

line collection

masskit.spectra.spectrum_plotting.error_bar_plot_lines(mz_in, intensity_in, stddev_in, color, vertical_cutoff=0.01, linewidth=1)

plot spectra as colored error bars using lines

Parameters:
  • mz_in – mz values in daltons

  • intensity_in – intensity value

  • stddev_in – standard deviations from intensity_in

  • color – color of spectrum

  • vertical_cutoff – if the intensity/max_intensity is below this value, don’t plot the vertical line

Returns:

line collection

masskit.spectra.spectrum_plotting.line_plot(mz_in, intensity_in, color, linewidth=1)

create a LineCollection for plotting a spectrum

Parameters:
  • mz_in – mz values

  • intensity_in – intensity value

  • color – color of spectrum

Returns:

line collection

masskit.spectra.spectrum_plotting.multiple_spectrum_plot(intensities, mz=None, mirror_intensities=None, dpi=100, min_mz=0, max_mz=2000, title='', subtitles=None, normalize=None, color=(0, 0, 1, 1), mirror_color=(1, 0, 0, 1))

create a spectrum plot. If subject spectrum is specified, will draw a mirror plot

Parameters:
  • intensities – spectrum to be plotted. array-like

  • mz – mz values for plot. array-like parallel to intensities

  • mirror_intensities – intensities for mirror spectrum. array-like parallel to intensities.

  • dpi – dpi of the plot

  • min_mz – minimum mz of the plot

  • max_mz – maximum mz of the plot

  • title – title of the plot

  • subtitles – an array of strings, one title for each spectrum plot

  • normalize – norm the spectra intensities to this value

  • color – color of spectrum specified as RBGA tuple

  • mirror_color – color of mirrored spectrum specified as RGBA tuple

Returns:

matplotlib figure

masskit.spectra.spectrum_plotting.normalize_intensity(intensity, normalize=999.0)

norm the spectrum to the max peak

Parameters:
  • intensity

  • normalize – value to norm the spectrum to

Returns:

masskit.spectra.spectrum_plotting.spectrum_plot(axis, mz, intensity, stddev=None, mirror_mz=None, mirror_intensity=None, mirror_stddev=None, mirror=True, title=None, xlabel='m/z', ylabel='Intensity', title_size=None, label_size=None, max_mz=None, min_mz=0, color=(0, 0, 1, 1), mirror_color=(1, 0, 0, 1), stddev_color=(0.3, 0.3, 0.3, 0.5), left_label_color=(1, 0, 0, 1), normalize=1000, vertical_cutoff=0.0, vertical_multiplier=1.1, right_label=None, left_label=None, right_label_size=None, left_label_size=None, no_xticks=False, no_yticks=False, linewidth=1)

make a spectrum plot using matplotlib. if mirror_intensity is specified, will do a mirror plot

Parameters:
  • axis – matplotlib axis

  • mz – mz values as array-like

  • intensity – intensity as array-like, parallel to mz array

  • stddev – standard deviation of the intensities

  • title – title of plot

  • xlabel – xlabel of plot

  • ylabel – ylabel of plot

  • title_size – size of title font

  • label_size – size of x and y axis label fonts

  • mirror_mz – mz values of mirror spectrum, corresponding to mirror_intensity. If none, uses mz

  • mirror_intensity – intensity of mirror spectrum as array-like, parallel to mz array. If none, don’t plot

  • mirror_stddev – standard deviation of the intensities

  • mirror – if true, mirror the plot if there are two spectra. Otherwise plot the two spectra together

  • max_mz – maximum mz to plot

  • min_mz – minimum mz to plot

  • normalize – if specified, norm the spectra to this value.

  • color – color of spectrum specified as RBGA tuple

  • mirror_color – color of mirrored spectrum specified as RGBA tuple

  • stddev_color – color of error bars

  • left_label_color – color of the left top label

  • vertical_cutoff – if the intensity/max_intensity is below this value, don’t plot the vertical line

  • vertical_multiplier – multiply times y max values to create white space

  • right_label – label for the top right of the fiture

  • left_label – label for the top left of the figure

  • right_label_size – size of label for the top right of the fiture

  • left_label_size – size of label for the top left of the figure

  • no_xticks – turn off x ticks and labels

  • no_yticks – turn off y ticks and lables

  • linewidth – width of plotted lines

Returns:

peak_collection, mirror_peak_collection sets of peaks for picking

masskit.spectra.spectrum_plotting.unsparsify_spectrum(spectrum, max_mz)

fill out array using spectrum values, placing zeros in missing mz values

Parameters:
  • spectrum – the spectrum to operate on

  • max_mz – maximum mz value

Returns:

the unsparsified array

masskit.spectra.theoretical_spectrum module

class masskit.spectra.theoretical_spectrum.TheoreticalPeptideSpectrum(peptide, ion_types=None, mod_names=None, mod_positions=None, analysis_annotations=False, num_isotopes=2, *args, **kwargs)

Bases: TheoreticalSpectrum

theoretical peptide spectrum

class masskit.spectra.theoretical_spectrum.TheoreticalSpectrum(*args, **kwargs)

Bases: Spectrum

base class to contain a theoretical spectrum

masskit.spectra.theoretical_spectrum.annotate_peptide_spectrum(spectrum, peptide=None, precursor_charge=None, ion_types=None, mod_names=None, mod_positions=None)

annotate a spectrum with theoretically calculated ions

Module contents