TY - JOUR
T1 - Systems-Level Annotation of a Metabolomics Data Set Reduces 25 000 Features to Fewer than 1000 Unique Metabolites
AU - Mahieu, Nathaniel G.
AU - Patti, Gary J.
N1 - Publisher Copyright: © 2017 American Chemical Society.
PY - 2017/10/3
Y1 - 2017/10/3
N2 - When using liquid chromatography/mass spectrometry (LC/MS) to perform untargeted metabolomics, it is now routine to detect tens of thousands of features from biological samples. Poor understanding of the data, however, has complicated interpretation and masked the number of unique metabolites actually being measured in an experiment. Here we place an upper bound on the number of unique metabolites detected in Escherichia coli samples analyzed with one untargeted metabolomics method. We first group multiple features arising from the same analyte, which we call "degenerate features", using a context-driven annotation approach. Surprisingly, this analysis revealed thousands of previously unreported degeneracies that reduced the number of unique analytes to ∼2961. We then applied an orthogonal approach to remove nonbiological features from the data using the 13C-based credentialing technology. This further reduced the number of unique analytes to less than 1000. Our 90% reduction in data is 5-fold greater than previously published studies. On the basis of the results, we propose an alternative approach to untargeted metabolomics that relies on thoroughly annotated reference data sets. To this end, we introduce the creDBle database (http://creDBle.wustl.edu), which contains accurate mass, retention time, and MS/MS fragmentation data as well as annotations of all credentialed features.
UR - http://www.scopus.com/inward/record.url?scp=85030685312&partnerID=8YFLogxK
U2 - 10.1021/acs.analchem.7b02380
DO - 10.1021/acs.analchem.7b02380
M3 - Article
C2 - 28914531
AN - SCOPUS:85030685312
SN - 0003-2700
VL - 89
SP - 10397
EP - 10406
JO - Analytical Chemistry
JF - Analytical Chemistry
IS - 19
ER -