TY - JOUR
T1 - Warpgroup
T2 - Increased precision of metabolomic data processing by consensus integration bound analysis
AU - Mahieu, Nathaniel G.
AU - Spalding, Jonathan L.
AU - Patti, Gary J.
N1 - Publisher Copyright:
© 2015 The Author 2015. Published by Oxford University Press. All rights reserved.
PY - 2016/1/15
Y1 - 2016/1/15
N2 - Motivation: Current informatic techniques for processing raw chromatography/mass spectrometry data break down under several common, non-ideal conditions. Importantly, hydrophilic liquid interaction chromatography (a key separation technology for metabolomics) produces data which are especially challenging to process. We identify three critical points of failure in current informatic workflows: compound specific drift, integration region variance, and naive missing value imputation. We implement the Warpgroup algorithm to address these challenges. Results: Warpgroup adds peak subregion detection, consensus integration bound detection, and intelligent missing value imputation steps to the conventional informatic workflow. When compared with the conventional workflow, Warpgroup made major improvements to the processed data. The coefficient of variation for peaks detected in replicate injections of a complex Escherichia Coli extract were halved (a reduction of 19%). Integration regions across samples were much more robust. Additionally, many signals lost by the conventional workflow were 'rescued' by the Warpgroup refinement, thereby resulting in greater analyte coverage in the processed data. Availability and implementation: Warpgroup is an open source R package available on GitHub at github.com/nathaniel-mahieu/warpgroup. The package includes example data and XCMS compatibility wrappers for ease of use. Supplementary information: Supplementary data are available at Bioinformatics online. Contact: or [email protected].
AB - Motivation: Current informatic techniques for processing raw chromatography/mass spectrometry data break down under several common, non-ideal conditions. Importantly, hydrophilic liquid interaction chromatography (a key separation technology for metabolomics) produces data which are especially challenging to process. We identify three critical points of failure in current informatic workflows: compound specific drift, integration region variance, and naive missing value imputation. We implement the Warpgroup algorithm to address these challenges. Results: Warpgroup adds peak subregion detection, consensus integration bound detection, and intelligent missing value imputation steps to the conventional informatic workflow. When compared with the conventional workflow, Warpgroup made major improvements to the processed data. The coefficient of variation for peaks detected in replicate injections of a complex Escherichia Coli extract were halved (a reduction of 19%). Integration regions across samples were much more robust. Additionally, many signals lost by the conventional workflow were 'rescued' by the Warpgroup refinement, thereby resulting in greater analyte coverage in the processed data. Availability and implementation: Warpgroup is an open source R package available on GitHub at github.com/nathaniel-mahieu/warpgroup. The package includes example data and XCMS compatibility wrappers for ease of use. Supplementary information: Supplementary data are available at Bioinformatics online. Contact: or [email protected].
UR - http://www.scopus.com/inward/record.url?scp=84949454389&partnerID=8YFLogxK
U2 - 10.1093/bioinformatics/btv564
DO - 10.1093/bioinformatics/btv564
M3 - Article
C2 - 26424859
AN - SCOPUS:84949454389
SN - 1367-4803
VL - 32
SP - 268
EP - 275
JO - Bioinformatics
JF - Bioinformatics
IS - 2
ER -