{
"cells": [
{
"cell_type": "code",
"execution_count": 1,
"id": "polish-inquiry",
"metadata": {},
"outputs": [],
"source": [
"%matplotlib inline\n",
"import matplotlib\n",
"import matplotlib.pyplot as plt\n",
"from masskit.utils.tablemap import ArrowLibraryMap\n",
"from masskit.test_fixtures.demo_fixtures import cho_uniq_short_parquet"
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "dutch-vehicle",
"metadata": {},
"source": [
"# Use of the spectrum object\n",
"Masskit is a software library for computations associated with mass spectrometry. A good place to start is the [`Spectrum`](../masskit.spectrum.html#masskit.spectrum.spectrum.Spectrum) class, which creates a spectrum object. Let's begin by loading two spectrum objects from a table in a file and then looking at a thumbnail of one of them:"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "economic-attitude",
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"INFO:root:created chunk 1 with 100 records\n",
"INFO:root:processing batch 0 with size 100\n",
"INFO:root:created chunk 1 with 0 records\n"
]
},
{
"data": {
"image/png": "",
"image/svg+xml": [
"\n",
"\n",
"\n"
],
"text/plain": [
""
]
},
"execution_count": 2,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"table = ArrowLibraryMap.from_parquet(cho_uniq_short_parquet())\n",
"spectrum1 = table[0]['spectrum']\n",
"spectrum2 = table[1]['spectrum']\n",
"spectrum1"
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "6a538c84",
"metadata": {},
"source": [
"The spectrum object contains the precursor and product ions for a spectrum, a dictionary of properties, and a variety of functions to operate on the ions.\n",
"## Accessing the spectrum ions "
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "d700c22b",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"The precursor m/z of spectrum 1 is 855.4538\n",
"The first five product ion m/z's of spectrum 1 are [143.0823 153.2575 159.0917 162.5977 169.0972]\n"
]
}
],
"source": [
"print('The precursor m/z of spectrum 1 is', spectrum1.precursor.mz)\n",
"print(\"The first five product ion m/z's of spectrum 1 are\", spectrum1.products.mz[0:5])\n",
"print(\"The first five product ion intensities of spectrum 1 are\", spectrum1.products.intensity[0:5])"
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "f8b6eed4",
"metadata": {},
"source": [
"Note that the product ion m/z values and intensities are stored in arrays: each peak in the spectrum occupies one position in `spectrum1.products.mz`, and its intensity sits at the same position in `spectrum1.products.intensity`. The precursor m/z is stored in `spectrum1.precursor.mz`. We use arrays of numbers because modern processors work quickly on arrays of a single type. Behind the scenes, numpy or arrow arrays are used to speed up computations."
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "exclusive-bosnia",
"metadata": {},
"source": [
"## Spectrum properties\n",
"Each spectrum has properties associated with it, such as a name or collision energy. These are accessed as attributes of the spectrum object:"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "intelligent-europe",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"The name of the spectrum is AAAACALTPGPLADLAAR/2_1(4,C,CAM) and the collision energy is 46.0\n"
]
}
],
"source": [
"print(\"The name of the spectrum is\", spectrum1.name, \"and the collision energy is\", spectrum1.ev)"
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "d52ebd46",
"metadata": {},
"source": [
"## Spectrum mass tolerance information\n",
"The spectrum object supports mass tolerances measured in either ppm or Daltons. Mass tolerance information is kept in objects of class [`MassInfo`](../masskit.spectrum.html#masskit.spectrum.spectrum.MassInfo) and is applied to ions through the `tolerance` array of `Spectrum`, which holds the m/z tolerance for each ion. Optionally, the `tolerance` array values can be set individually for each ion to allow for nonstandard mass tolerances. The array is also used to perform quick interval arithmetic when matching spectra."
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "acf0eb6b",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"20.0 ppm\n",
"minimum mz values [143.07943835 153.25443485 159.08851817 162.59444805 169.09381806]\n",
"maximum mz values [143.08516165 153.26056515 159.09488183 162.60095195 169.10058194]\n"
]
}
],
"source": [
"print(spectrum1.products.mass_info.tolerance, spectrum1.products.mass_info.tolerance_type)\n",
"print(\"minimum mz values\", spectrum1.products.tolerance[0:5])"
]
},
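{
"attachments": {},
"cell_type": "markdown",
"id": "ppm-interval-sketch",
"metadata": {},
"source": [
"The interval bounds printed above can be checked by hand. The following is a minimal sketch, assuming the ppm tolerance is applied symmetrically around each m/z value; `ppm_interval` is a hypothetical helper written for this illustration and is not part of Masskit.\n",
"\n",
"```python\n",
"def ppm_interval(mz, tolerance_ppm):\n",
"    '''Return the (low, high) m/z bounds for a symmetric ppm tolerance.'''\n",
"    # Convert parts per million into an absolute half-width at this m/z.\n",
"    half_width = mz * tolerance_ppm / 1e6\n",
"    return mz - half_width, mz + half_width\n",
"\n",
"\n",
"# First product ion of spectrum 1 with the 20 ppm tolerance shown above.\n",
"low, high = ppm_interval(143.0823, 20.0)\n",
"print(f'{low:.8f} {high:.8f}')\n",
"```\n",
"\n",
"Under that assumption, the result agrees with the first minimum and maximum m/z values in the output above (143.07943835 and 143.08516165)."
]
},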
{
"attachments": {},
"cell_type": "markdown",
"id": "brave-height",
"metadata": {},
"source": [
"## Operations on spectra\n",
"A variety of operations can be applied to spectra, including normalization, noise filtering, merging, masking, shifting, plotting, and comparison. These operations can be conveniently chained and, by default, create new spectrum objects. A list of methods can be found in [`Spectrum`](../masskit.spectrum.html#masskit.spectrum.spectrum.Spectrum). For example, to normalize the intensity to 1.0 and then noise filter a spectrum, one could do:"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "d7dd0b4c",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"(,\n",
" )"
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"image/png": "",
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"spectrum1b = spectrum1.norm(1.0).filter(min_intensity=0.2)\n",
"spectrum1.norm(1.0).plot(axes=plt.gca(), mirror_spectrum=spectrum1b)"
]
},
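{
"attachments": {},
"cell_type": "markdown",
"id": "toy-chaining-sketch",
"metadata": {},
"source": [
"The chaining shown above works because each operation returns a new object instead of modifying the original. The sketch below illustrates that pattern with a hypothetical `ToySpectrum` class; it is not Masskit's implementation, and its `norm` and `filter` methods only mimic the calls used above.\n",
"\n",
"```python\n",
"from dataclasses import dataclass, replace\n",
"from typing import Tuple\n",
"\n",
"\n",
"@dataclass(frozen=True)\n",
"class ToySpectrum:\n",
"    '''Hypothetical immutable spectrum holding parallel m/z and intensity tuples.'''\n",
"    mz: Tuple[float, ...]\n",
"    intensity: Tuple[float, ...]\n",
"\n",
"    def norm(self, max_intensity=1.0):\n",
"        # Scale so the most intense peak equals max_intensity; return a new object.\n",
"        top = max(self.intensity)\n",
"        scaled = tuple(i * max_intensity / top for i in self.intensity)\n",
"        return replace(self, intensity=scaled)\n",
"\n",
"    def filter(self, min_intensity=0.0):\n",
"        # Keep only peaks at or above min_intensity; return a new object.\n",
"        kept = [(m, i) for m, i in zip(self.mz, self.intensity) if i >= min_intensity]\n",
"        return ToySpectrum(tuple(m for m, _ in kept), tuple(i for _, i in kept))\n",
"\n",
"\n",
"original = ToySpectrum((100.0, 200.0, 300.0), (20.0, 50.0, 5.0))\n",
"cleaned = original.norm(1.0).filter(min_intensity=0.2)\n",
"print(cleaned.mz)          # m/z values of the peaks that survive the filter\n",
"print(original.intensity)  # the original object is left untouched\n",
"```\n",
"\n",
"Because every step returns a fresh object, the unfiltered spectrum stays available for comparison, which is what makes a mirror plot of the original against the filtered spectrum possible."
]
},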
{
"attachments": {},
"cell_type": "markdown",
"id": "cutting-mention",
"metadata": {},
"source": [
"## Operations on pairs of spectra"
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "geological-renewal",
"metadata": {},
"source": [
"### Cosine score between two spectra"
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "loaded-accessory",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"586.6037181010105"
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"spectrum1.cosine_score(spectrum2)"
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "amber-enhancement",
"metadata": {},
"source": [
"### Cosine score between m/z subranges"
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "signed-halloween",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"599.529099567551"
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"spectrum1.filter(min_mz=500, max_mz=1000).cosine_score(spectrum2.filter(min_mz=500, max_mz=1000))"
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "tutorial-oliver",
"metadata": {},
"source": [
"### Merge two spectra, first by creating a filtered spectrum"
]
},
{
"cell_type": "code",
"execution_count": 9,
"id": "continent-modem",
"metadata": {},
"outputs": [
{
"data": {
"image/png": "",
"image/svg+xml": [
"\n",
"\n",
"\n"
],
"text/plain": [
""
]
},
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"spectrum3 = spectrum1.filter(max_mz=500)\n",
"spectrum3"
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "eligible-process",
"metadata": {},
"source": [
"#### Then merge it with the second spectrum"
]
},
{
"cell_type": "code",
"execution_count": 10,
"id": "worth-interpretation",
"metadata": {},
"outputs": [
{
"data": {
"image/png": "",
"image/svg+xml": [
"\n",
"\n",
"\n"
],
"text/plain": [
""
]
},
"execution_count": 10,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"spectrum3.merge(spectrum2)"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3.9.10 ('base')",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.15"
},
"vscode": {
"interpreter": {
"hash": "11d150ef1a59d6ee6bd3538ad9ed751649d8a614c736b8deec7e36a34a38bbb5"
}
}
},
"nbformat": 4,
"nbformat_minor": 5
}