A marginalized two-part Beta regression model for microbiome compositional data

Haitao Chai, Hongmei Jiang, Lu Lin, Lei Liu

Research output: Contribution to journal › Article › peer-review

25 Scopus citations


In microbiome studies, an important goal is to detect differential abundance of microbes across clinical conditions and treatment options. However, microbiome compositional data (quantified by relative abundance) are highly skewed, bounded in [0, 1), and often contain many zeros. A two-part model is commonly used to separate zeros and positive values explicitly with two submodels: in Part I, a logistic model for the probability that a species is present, and in Part II, a Beta regression model for the relative abundance conditional on the presence of the species. However, the regression coefficients in Part II cannot provide a marginal (unconditional) interpretation of covariate effects on the microbial abundance, which is of great interest in many applications. In this paper, we propose a marginalized two-part Beta regression model which captures the zero-inflation and skewness of microbiome data and also allows investigators to examine covariate effects on the marginal (unconditional) mean. We demonstrate its practical performance using simulation studies and apply the model to a real metagenomic dataset on mouse skin microbiota. We find that under the proposed marginalized model, without loss in power, the likelihood ratio test performs better in controlling the type I error than those under conventional methods.
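The contrast between the conventional and the marginalized parameterization can be illustrated with a small sketch. It is not the authors' implementation: the abstract does not state the link functions, so logit links for both parts are an assumption here, and the names `alpha`, `beta`, `gamma`, `conventional_marginal_mean`, and `marginalized_means` are hypothetical. The only fact relied on is that the marginal mean of a two-part model equals the presence probability times the mean of the positive part.

```python
import math


def expit(z):
    """Inverse logit: maps a real number to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def linear_predictor(x, coef):
    """Dot product of a covariate vector and a coefficient vector."""
    return sum(xi * ci for xi, ci in zip(x, coef))


def conventional_marginal_mean(x, alpha, beta):
    """Conventional two-part model (assumed logit links).

    Part I:  p  = P(species present)        = expit(x'alpha)
    Part II: mu = E[abundance | present]    = expit(x'beta)
    The marginal mean p * mu depends on both alpha and beta, so beta
    alone has no marginal interpretation.
    """
    p = expit(linear_predictor(x, alpha))
    mu = expit(linear_predictor(x, beta))
    return p * mu


def marginalized_means(x, alpha, gamma):
    """Marginalized parameterization (assumed logit links).

    The marginal mean nu = E[abundance] is modeled directly as
    expit(x'gamma); the conditional mean is then implied by mu = nu / p.
    Returns (p, nu, mu) and raises ValueError if the implied mu is not
    a valid Beta mean in (0, 1).
    """
    p = expit(linear_predictor(x, alpha))
    nu = expit(linear_predictor(x, gamma))
    mu = nu / p
    if not 0.0 < mu < 1.0:
        raise ValueError("implied conditional mean is outside (0, 1)")
    return p, nu, mu
```

Under these assumptions, `gamma` acts on the unconditional mean directly, which is the interpretation the abstract says Part II coefficients lack. The check on `mu` reflects a constraint of this sketch: the marginal mean must stay below the presence probability for the implied Beta mean to be valid.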

Original language: English
Article number: e1006329
Journal: PLoS computational biology
Issue number: 7
State: Published - Jul 2018

