TY - JOUR
T1 - Noise reduction in genome-wide perturbation screens using linear mixed-effect models
AU - Yu, Danni
AU - Danku, John
AU - Baxter, Ivan
AU - Kim, Sungjin
AU - Vatamaniuk, Olena K.
AU - Salt, David E.
AU - Vitek, Olga
N1 - Funding Information:
Funding: NSF BIO/DBI 1054826 award to Dr O.V.; NSF (DBI-0606193) and NIH (4R33DK070290-02) awards to Dr D.E.S.; NSF (MCB-0923731) award to Dr O.K.V.
PY - 2011/8
Y1 - 2011/8
AB - Motivation: High-throughput perturbation screens measure the phenotypes of thousands of biological samples under various conditions. The phenotypes measured in the screens are subject to substantial biological and technical variation. At the same time, in order to enable high throughput, it is often impossible to include a large number of replicates, and to randomize their order throughout the screens. Distinguishing true changes in the phenotype from stochastic variation in such experimental designs is extremely challenging, and requires adequate statistical methodology. Results: We propose a statistical modeling framework that is based on experimental designs with at least two controls profiled throughout the experiment, and a normalization and variance estimation procedure with linear mixed-effects models. We evaluate the framework using three comprehensive screens of Saccharomyces cerevisiae, which involve 4940 single-gene knockout haploid mutants, 1127 single-gene knockout diploid mutants and 5798 single-gene overexpression haploid strains. We show that the proposed approach (i) can be used in conjunction with practical experimental designs; (ii) allows extensions to alternative experimental workflows; (iii) enables a sensitive discovery of biologically meaningful changes; and (iv) strongly outperforms the existing noise reduction procedures.
UR - http://www.scopus.com/inward/record.url?scp=79961180273&partnerID=8YFLogxK
U2 - 10.1093/bioinformatics/btr359
DO - 10.1093/bioinformatics/btr359
M3 - Article
C2 - 21685046
AN - SCOPUS:79961180273
SN - 1367-4803
VL - 27
SP - 2173
EP - 2180
JO - Bioinformatics
JF - Bioinformatics
IS - 16
M1 - btr359
ER -