Warpgroup: Increased precision of metabolomic data processing by consensus integration bound analysis

Nathaniel G. Mahieu, Jonathan L. Spalding, Gary J. Patti

Research output: Contribution to journal › Article › peer-review

22 Scopus citations

Abstract

Motivation: Current informatic techniques for processing raw chromatography/mass spectrometry data break down under several common, non-ideal conditions. Importantly, hydrophilic interaction liquid chromatography (a key separation technology for metabolomics) produces data that are especially challenging to process. We identify three critical points of failure in current informatic workflows: compound-specific drift, integration region variance, and naive missing value imputation. We implement the Warpgroup algorithm to address these challenges.

Results: Warpgroup adds peak subregion detection, consensus integration bound detection, and intelligent missing value imputation steps to the conventional informatic workflow. When compared with the conventional workflow, Warpgroup made major improvements to the processed data. The coefficient of variation for peaks detected in replicate injections of a complex Escherichia coli extract was halved (a reduction of 19%). Integration regions across samples were much more robust. Additionally, many signals lost by the conventional workflow were 'rescued' by the Warpgroup refinement, resulting in greater analyte coverage in the processed data.

Availability and implementation: Warpgroup is an open source R package available on GitHub at github.com/nathaniel-mahieu/warpgroup. The package includes example data and XCMS compatibility wrappers for ease of use.

Supplementary information: Supplementary data are available at Bioinformatics online.
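The coefficient of variation cited in the Results is the standard deviation of a peak's integrated area across replicate injections, divided by its mean. The sketch below illustrates only that metric; the function name and the peak areas are hypothetical and are not taken from the paper or from the Warpgroup package.

```python
from statistics import mean, stdev


def coefficient_of_variation(areas):
    """Return sample standard deviation divided by mean for one peak's areas."""
    m = mean(areas)
    if m == 0:
        raise ValueError("mean peak area is zero; CV is undefined")
    return stdev(areas) / m


# Hypothetical integrated areas of one peak in three replicate injections.
replicate_areas = [100.0, 110.0, 90.0]
cv = coefficient_of_variation(replicate_areas)
print(f"CV = {cv:.1%}")
```

A lower value means the same peak was integrated more consistently across replicates, which is the sense in which the abstract reports improved precision.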

Original language: English
Pages (from-to): 268-275
Number of pages: 8
Journal: Bioinformatics
Volume: 32
Issue number: 2
State: Published - Jan 15 2016
