masskit.peptide package

Submodules

masskit.peptide.encoding module

masskit.peptide.encoding.calc_ion_series(ion_type, num_isotopes, cumulative_masses, arrays, peptide, mod_names, mod_positions, neutral_loss, charge_in, analysis, positions, start_offset=0, max_internal_size=7)

masskit.peptide.encoding.calc_ions_mz(peptide, ion_types_in, mod_names=None, mod_positions=None, analysis_annotations=False, precursor_charge=2, num_isotopes=2, max_internal_size=7)

Calculate the m/z values of an ion type. Default values are taken from the HCD values in https://pubs.acs.org/doi/full/10.1021/pr3007045

Parameters:

peptide – the peptide sequence
ion_types_in – a tuple, or an array of tuples, of ion type and charge
mod_names – any modifications
mod_positions – the positions of the modifications
analysis_annotations – add additional annotations useful for analyzing spectra
precursor_charge – used to filter out ion types with charge greater than the precursor
num_isotopes – number of carbon-13 isotopes to calculate

Returns:

NumPy arrays of the m/z values for the ion series, ion intensities, annotations as an Arrow list, precursor mass, and fields used for analysing ion peaks
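To make the m/z calculation concrete, here is a minimal, self-contained sketch of singly or multiply charged b and y ions for an unmodified peptide. It does not call masskit and is not its implementation: the function name by_ions_mz, the small residue mass table, and the output layout are assumptions chosen for illustration, and isotopes, neutral losses, and internal ions are left out.

```python
# Monoisotopic residue masses in Da for a few amino acids (illustrative subset).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "L": 113.08406, "K": 128.09496,
    "E": 129.04259, "R": 156.10111,
}
WATER = 18.010565   # monoisotopic mass of H2O
PROTON = 1.007276   # mass of a proton


def by_ions_mz(peptide, charge=1):
    """Hypothetical helper: b and y ion m/z values for each cleavage site.

    Entry k of each list corresponds to cleavage after residue k + 1,
    so b_mz[k] is b(k+1) and y_mz[k] is the complementary y ion.
    """
    masses = [RESIDUE_MASS[aa] for aa in peptide]
    b_mz, y_mz = [], []
    for i in range(1, len(peptide)):
        b_neutral = sum(masses[:i])            # N-terminal fragment
        y_neutral = sum(masses[i:]) + WATER    # C-terminal fragment keeps H2O
        b_mz.append((b_neutral + charge * PROTON) / charge)
        y_mz.append((y_neutral + charge * PROTON) / charge)
    return b_mz, y_mz


b_series, y_series = by_ions_mz("GAK")
```

For the peptide GAK the last y value is the y1 ion of lysine, which makes a convenient sanity check against a published fragment table.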

masskit.peptide.encoding.calc_named_ions(arrays, analysis=None, named_ion=None, precursor_mass=None, precursor_charge=None, charge_in=None, neutral_loss=None, num_isotopes=2)

masskit.peptide.encoding.calc_precursor_mass(peptide, mod_names=None, mod_positions=None)

Calculate the mass of a modified peptide

Parameters:

peptide – the peptide
mod_names – the modification ids
mod_positions – the positions of the modifications

Returns:

the mass
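As an illustration of what such a calculation involves, the sketch below sums monoisotopic residue masses, adds one water for the termini, and adds the delta mass of each modification. It is an assumption-laden stand-in rather than masskit code: peptide_mass, the two lookup tables, and the use of string modification names (masskit documents modification ids) are all hypothetical.

```python
# Monoisotopic residue masses in Da (illustrative subset).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "K": 128.09496,
    "C": 103.00919, "M": 131.04049,
}
# Monoisotopic delta masses of a few common modifications, keyed by name.
MOD_MASS = {
    "Phospho": 79.966331,
    "Oxidation": 15.994915,
    "Carbamidomethyl": 57.021464,
}
WATER = 18.010565  # H2O added for the free N- and C-termini


def peptide_mass(peptide, mod_names=None):
    """Hypothetical helper: neutral monoisotopic mass of a modified peptide."""
    mass = sum(RESIDUE_MASS[aa] for aa in peptide) + WATER
    for name in mod_names or []:
        mass += MOD_MASS[name]
    return mass


unmodified = peptide_mass("GAK")
phosphorylated = peptide_mass("ASK", mod_names=["Phospho"])
```

Modification positions do not change the total mass, which is why this sketch ignores them; they matter only when fragment ions are calculated.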

masskit.peptide.encoding.calc_precursor_mz(peptide, charge, mod_names=None, mod_positions=None)

Calculate the m/z of a modified peptide

Parameters:

peptide – the peptide
charge – the charge of the peptide
mod_names – the modification ids
mod_positions – the positions of the modifications

Returns:

the m/z

masskit.peptide.encoding.expand_mod_string(mod_string)

Decode a modification string into site and position

Parameters:

mod_string – the standard modification string, e.g. “A” or “A0” or “$”

Returns:

tuple of site, position
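The decoding can be sketched as follows, using the expansion rules given for parse_modification_encoding ("0" implies "00", "." implies "..", "^" implies "0^", "$" implies ".$"). This is not the masskit implementation: the name expand_mod_string_sketch is hypothetical, and returning an empty string when no position is given is an assumption.

```python
# Expansion of a position character used on its own, per the documented rules.
BARE_POSITION = {"0": "00", ".": "..", "^": "0^", "$": ".$"}


def expand_mod_string_sketch(mod_string):
    """Hypothetical helper: split a modification string into (site, position).

    Assumption: a site with no position encoding yields an empty position.
    """
    expanded = BARE_POSITION.get(mod_string, mod_string)
    site = expanded[0]
    position = expanded[1] if len(expanded) > 1 else ""
    return site, position


examples = [expand_mod_string_sketch(s) for s in ("A", "A0", "$")]
```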

masskit.peptide.encoding.mod_mass_pos(mod_positions, mod_names, i)

At a given position in the sequence, find any matching modification positions in mod_positions and sum the masses of those modifications

Parameters:

mod_positions – mod positions
mod_names – mod names
i – position in peptide

Returns:

masses of matching modifications
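A minimal sketch of this lookup, assuming parallel lists of positions and names and a hypothetical MOD_MASS table keyed by modification name (masskit itself may key modifications by integer id):

```python
# Hypothetical lookup of monoisotopic delta masses by modification name.
MOD_MASS = {
    "Phospho": 79.966331,
    "Oxidation": 15.994915,
    "Carbamidomethyl": 57.021464,
}


def mod_mass_at(mod_positions, mod_names, i):
    """Sum the masses of all modifications located at position i."""
    return sum(
        MOD_MASS[name]
        for pos, name in zip(mod_positions, mod_names)
        if pos == i
    )


# Two modifications share position 1, one sits at position 3.
total_at_1 = mod_mass_at([1, 3, 1], ["Phospho", "Oxidation", "Carbamidomethyl"], 1)
```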

masskit.peptide.encoding.parse_ion_type_tuple(tuple_in, precursor_charge)

Split an ion type tuple into ion type and neutral loss, if specified

Parameters:

tuple_in – ion type tuple

Raises:

ValueError – more than one neutral loss

Returns:

ion type, neutral loss

masskit.peptide.encoding.parse_modification_encoding(modification_encoding)

Takes a string containing a set of modification strings and creates a list of tuples. Each tuple contains the modification name, the site, and the position of the modification. The string has the following format.

Site encoding of a modification:

A-Y – amino acid

The site encoding can be followed by a modification position encoding:

0 – peptide N-terminus
. – peptide C-terminus
^ – protein N-terminus
$ – protein C-terminus

For example, ‘K.’ means lysine at the C-terminus of the peptide. The position encoding can also be used on its own, e.g. ‘^’ means apply to any protein N-terminus, regardless of amino acid.

A list of modifications is separated by hashes: Phospho{S}#Methyl{0/I}#Carbamidomethyl#Deamidated{F^/Q/N}

An optional list of sites is specified within the {} for each modification. If there are no ‘{}’ then a default set of sites is used. Multiple sites are separated by a ‘/’.

A position encoding used by itself is expanded as follows:

“0” implies “00”
“.” implies “..”
“^” implies “0^”
“$” implies “.$”

Parameters:

modification_encoding – a string containing the above format

Returns:

list of tuples, each tuple has modification name, site, and position
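The format described above can be parsed with a few string operations. The sketch below is a hedged illustration, not the masskit parser: the function name is hypothetical, a modification without ‘{}’ is returned with site and position set to None because the default site sets are not documented here, and a site without a position encoding gets an empty position.

```python
# Expansion of a position character used on its own, per the documented rules.
BARE_POSITION = {"0": "00", ".": "..", "^": "0^", "$": ".$"}


def parse_modification_encoding_sketch(modification_encoding):
    """Hypothetical parser returning a list of (name, site, position) tuples."""
    result = []
    for entry in modification_encoding.split("#"):
        if not entry:
            continue
        name, brace, rest = entry.partition("{")
        if not brace:
            # No explicit sites: the real default site set is unknown here.
            result.append((name, None, None))
            continue
        for site_string in rest.rstrip("}").split("/"):
            if not site_string:
                continue
            expanded = BARE_POSITION.get(site_string, site_string)
            site = expanded[0]
            position = expanded[1] if len(expanded) > 1 else ""
            result.append((name, site, position))
    return result


parsed = parse_modification_encoding_sketch(
    "Phospho{S}#Methyl{0/I}#Carbamidomethyl#Deamidated{F^/Q/N}"
)
```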

masskit.peptide.encoding.protonate_mass(mass, z)

Given a neutral mass and charge of an ion, calculate the m/z of the ion

Parameters:

mass – mass
z – charge

Returns:

m/z
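For a positively charged ion formed by adding z protons, the conversion is m/z = (mass + z × proton mass) / z. A minimal sketch under that assumption follows; the function name and the rejection of non-positive charges are choices made here, not documented masskit behaviour.

```python
PROTON_MASS = 1.007276467  # mass of a proton in Da


def protonate_mass_sketch(mass, z):
    """Hypothetical helper: m/z of a neutral mass carrying z added protons."""
    if z <= 0:
        raise ValueError("this sketch only handles positive charges")
    return (mass + z * PROTON_MASS) / z


mz_2plus = protonate_mass_sketch(1000.0, 2)
```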

masskit.peptide.spectrum_generator module

masskit.peptide.spectrum_generator.add_theoretical_spectra(df, theoretical_spectrum_column=None, ion_types=None, num_isotopes=2)

Add theoretical spectra to a column

Parameters:

df – dataframe containing spectra
theoretical_spectrum_column – name of the column to hold theoretical spectra
ion_types – ion types to generate. None uses the default for TheoreticalPeptideSpectrum
num_isotopes – number of carbon-13 isotopes to calculate

masskit.peptide.spectrum_generator.create_peptide_name(peptide, precursor_charge, mod_names=None, mod_positions=None, ev=None)

Create the name of a peptide spectrum

Parameters:

peptide – the peptide string
precursor_charge – the precursor charge
mod_names – list of modification names (integer)
mod_positions – position of modifications, 0 based
ev – collision energy in eV

masskit.peptide.spectrum_generator.generate_mods(peptide, mod_list, n_peptide=False, c_peptide=False, mod_probability=None)

Given a peptide and a list of modifications expressed as tuples, place the allowable modifications on the peptide.

Parameters:

mod_list – the list of allowed modifications, expressed as a string (see encoding.py)
peptide – the peptide
n_peptide – is the peptide at the N terminus of the protein?
c_peptide – is the peptide at the C terminus of the protein?
mod_probability – the probability of a modification at a particular site. None=1.0

Returns:

list of modification names, list of modification positions
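One way to picture the placement step is sketched below. It takes already parsed (name, site, position) tuples rather than an encoded string, and it reads the position codes as documented for the encoding module ("0" and "." for peptide termini, "^" and "$" for protein termini). Everything else is an assumption made for illustration: the function name, the injectable random generator, and the fact that several modifications may land on the same residue.

```python
import random


def generate_mods_sketch(peptide, mods, n_peptide=False, c_peptide=False,
                         mod_probability=None, rng=None):
    """Hypothetical helper: return (mod_names, mod_positions), 0-based."""
    rng = rng or random.Random()
    probability = 1.0 if mod_probability is None else mod_probability
    last = len(peptide) - 1

    def allowed(i, aa, site, position):
        # Site "0" / "." match any residue at the peptide N- / C-terminus.
        if site == "0":
            site_ok = i == 0
        elif site == ".":
            site_ok = i == last
        else:
            site_ok = site == aa
        if position == "0":
            pos_ok = i == 0
        elif position == ".":
            pos_ok = i == last
        elif position == "^":
            pos_ok = i == 0 and n_peptide      # protein N-terminus only
        elif position == "$":
            pos_ok = i == last and c_peptide   # protein C-terminus only
        else:
            pos_ok = True                      # no position restriction
        return site_ok and pos_ok

    names, positions = [], []
    for name, site, position in mods:
        for i, aa in enumerate(peptide):
            if allowed(i, aa, site, position) and rng.random() < probability:
                names.append(name)
                positions.append(i)
    return names, positions


mods = [("Phospho", "S", ""), ("Acetyl", "0", "^")]
names, positions = generate_mods_sketch("SAKS", mods, n_peptide=True)
```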

masskit.peptide.spectrum_generator.generate_peptide_library(num=100, min_length=5, max_length=30, min_charge=1, max_charge=8, min_ev=10, max_ev=60, mod_list=None, set='train', mod_probability=0.1)

Generate a theoretical peptide library

Parameters:

set – which set to create, e.g. train, valid, test
num – the number of peptides
min_length – minimum length of the peptides
max_length – maximum length of the peptides
min_charge – the minimum charge of the peptides
max_charge – the maximum charge of the peptides
min_ev – the minimum eV (also used for nce)
max_ev – the maximum eV (also used for nce)
mod_list – the list of allowed modifications, expressed as a string (see encoding.py)
mod_probability – the probability of a modification at a particular site

Returns:

the dataframe
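The sampling behind such a library can be sketched with the standard library alone. This is a rough illustration, not the masskit generator: it returns a list of plain dictionaries instead of a dataframe, treats all bounds as inclusive, samples residues uniformly from the 20 standard amino acids, and omits modifications and the set label. The function name, the record keys, and the seed parameter are hypothetical.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def generate_peptide_records(num=100, min_length=5, max_length=30,
                             min_charge=1, max_charge=8,
                             min_ev=10, max_ev=60, seed=None):
    """Hypothetical helper: random peptide records with charge and eV."""
    rng = random.Random(seed)
    records = []
    for _ in range(num):
        length = rng.randint(min_length, max_length)  # inclusive bounds
        records.append({
            "peptide": "".join(rng.choice(AMINO_ACIDS) for _ in range(length)),
            "charge": rng.randint(min_charge, max_charge),
            "ev": rng.uniform(min_ev, max_ev),
        })
    return records


library = generate_peptide_records(num=10, seed=0)
```

A list of records like this can be passed to pandas.DataFrame to obtain a table with one row per peptide.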
