TY - JOUR
T1 - A marginalized two-part Beta regression model for microbiome compositional data
AU - Chai, Haitao
AU - Jiang, Hongmei
AU - Lin, Lu
AU - Liu, Lei
N1 - Publisher Copyright: © 2018 Chai et al. http://creativecommons.org/licenses/by/4.0/.
PY - 2018/7
Y1 - 2018/7
N2 - In microbiome studies, an important goal is to detect differential abundance of microbes across clinical conditions and treatment options. However, the microbiome compositional data (quantified by relative abundance) are highly skewed, bounded in [0, 1), and often have many zeros. A two-part model is commonly used to separate zeros and positive values explicitly by two submodels: a logistic model for the probability of a species being present in Part I, and a Beta regression model for the relative abundance conditional on the presence of the species in Part II. However, the regression coefficients in Part II cannot provide a marginal (unconditional) interpretation of covariate effects on the microbial abundance, which is of great interest in many applications. In this paper, we propose a marginalized two-part Beta regression model which captures the zero-inflation and skewness of microbiome data and also allows investigators to examine covariate effects on the marginal (unconditional) mean. We demonstrate its practical performance using simulation studies and apply the model to a real metagenomic dataset on mouse skin microbiota. We find that under the proposed marginalized model, without loss in power, the likelihood ratio test performs better in controlling the type I error than those under conventional methods.
AB - In microbiome studies, an important goal is to detect differential abundance of microbes across clinical conditions and treatment options. However, the microbiome compositional data (quantified by relative abundance) are highly skewed, bounded in [0, 1), and often have many zeros. A two-part model is commonly used to separate zeros and positive values explicitly by two submodels: a logistic model for the probability of a species being present in Part I, and a Beta regression model for the relative abundance conditional on the presence of the species in Part II. However, the regression coefficients in Part II cannot provide a marginal (unconditional) interpretation of covariate effects on the microbial abundance, which is of great interest in many applications. In this paper, we propose a marginalized two-part Beta regression model which captures the zero-inflation and skewness of microbiome data and also allows investigators to examine covariate effects on the marginal (unconditional) mean. We demonstrate its practical performance using simulation studies and apply the model to a real metagenomic dataset on mouse skin microbiota. We find that under the proposed marginalized model, without loss in power, the likelihood ratio test performs better in controlling the type I error than those under conventional methods.
UR - http://www.scopus.com/inward/record.url?scp=85050995517&partnerID=8YFLogxK
U2 - 10.1371/journal.pcbi.1006329
DO - 10.1371/journal.pcbi.1006329
M3 - Article
C2 - 30036363
AN - SCOPUS:85050995517
SN - 1553-734X
VL - 14
JO - PLoS Computational Biology
JF - PLoS Computational Biology
IS - 7
M1 - e1006329
ER -