Lecture
Big Data–Driven Metabolomics MS/MS Spectra Annotation
- ICM Saal 2
- Type: Lecture
Lecture description
Y. Li, China, O. Fiehn, United States
In a typical untargeted metabolomics mass spectrometry experiment, several thousand MS/MS spectra can be collected. More than 1 billion spectra are available in public repositories such as MassIVE/GNPS, Metabolomics Workbench, and MetaboLights.
However, only limited biological and chemical metadata are associated with the MS/MS spectra in these repositories. By comparing experimental MS/MS spectra against public repositories, we can determine in which species, conditions, and organs molecules have been detected and observe how they vary across studies. Such meta-analysis is critical for understanding the biological relevance of molecules and assists in structural identification.
We developed, implemented, and evaluated Flash Entropy Search to compute similarity matches across millions of accurate-mass MS/MS spectra in <10 ms. One billion spectra can be searched in <2 s, not only in classic identity searches but also in neutral-loss, open, and hybrid searches. Flash Entropy Search enables ultrafast computing on a big-data scale for every laboratory. Extending MS/MS similarity beyond classic identity searches yields structure classes and chemical substructures for most unidentified compounds. [1]
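To make the similarity measure concrete, the following is a minimal sketch of an unweighted spectral entropy similarity between two centroided spectra, together with a neutral-loss variant. It is not the Flash Entropy Search index itself, which adds the indexing that makes billion-spectrum searches fast. The function names, the 0.01 Da tolerance, and the simple one-to-one peak matching are assumptions chosen for illustration, and the low-entropy intensity weighting used in practice is omitted.

```python
import math


def normalize(peaks):
    """Drop non-positive peaks, scale intensities to sum to 1, sort by m/z."""
    kept = [(mz, inten) for mz, inten in peaks if inten > 0]
    total = sum(inten for _, inten in kept)
    return sorted((mz, inten / total) for mz, inten in kept)


def entropy_similarity(query, reference, tol=0.01):
    """Unweighted entropy similarity in [0, 1] between two peak lists.

    Each peak list is a sequence of (m/z, intensity) pairs. Assumes
    centroided spectra whose peaks are more than `tol` apart (hypothetical
    default of 0.01 Da), so a simple two-pointer match is sufficient.
    """
    q, r = normalize(query), normalize(reference)
    score = 0.0
    i = j = 0
    while i < len(q) and j < len(r):
        delta = q[i][0] - r[j][0]
        if delta < -tol:
            i += 1
        elif delta > tol:
            j += 1
        else:
            a, b = q[i][1], r[j][1]
            # Contribution of one matched peak pair; unmatched peaks add 0.
            score += (a + b) * math.log2(a + b) - a * math.log2(a) - b * math.log2(b)
            i += 1
            j += 1
    return score / 2.0


def to_neutral_loss(peaks, precursor_mz):
    """Re-express fragments as mass differences from the precursor m/z."""
    return [(precursor_mz - mz, inten) for mz, inten in peaks]


# Illustrative spectra (made-up values): the second is the first shifted by 14 Da.
spec_a = [(100.0, 50.0), (150.0, 30.0), (200.0, 20.0)]
spec_b = [(114.0, 50.0), (164.0, 30.0), (214.0, 20.0)]

identity_score = entropy_similarity(spec_a, spec_b)
neutral_loss_score = entropy_similarity(
    to_neutral_loss(spec_a, 300.0), to_neutral_loss(spec_b, 314.0)
)
```

In this toy case the identity comparison finds no shared fragment m/z values, whereas the neutral-loss comparison aligns every peak. This illustrates why searches beyond identity matching can relate spectra of structurally similar compounds.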
Next, we collected publicly available MS/MS spectra from repositories such as MassIVE/GNPS, Metabolomics Workbench, and MetaboLights, along with MS/MS data generated at the UC Davis West Coast Metabolomics Center. We curated over one billion MS/MS spectra and indexed them using Flash Entropy Search algorithms. Subsequently, we developed a public website, Mass.Wiki, where users can annotate standardized spectra or upload their own spectra to search against our extensive database of public MS/MS spectra. Through Mass.Wiki, users also obtain information on biological relevance, including the species and organs in which these spectra were detected. We present results and discuss examples of how these resources can be fully utilized.
Literature:
[1] Yuanyue Li & Oliver Fiehn. Flash entropy search to query all mass spectral libraries in real time. Nature Methods 20, 1475–1478 (2023).