Inflation of type I error occurs when conducting a large number of statistical tests in genome-wide linkage scans. Stringent α-levels protect against the high numbers of expected false positives but at the cost of more false negatives. A more balanced tradeoff is provided by the theory of sequential analysis, which can be used in a genome scan even when the data are collected using a fixed-sample design. Sequential tests allow complete, simultaneous control of both the type I and type II errors of each individual test while using the smallest possible sample size for analysis. For fixed samples, the excess observations 'saved' can be used in a confirmatory replication phase of the original findings. Using the theory of sequential multiple decision procedures [Bechhofer et al., 1968], we can replace the series of individual marker tests with a single, simultaneous genome-wide test that has multiple possible outcomes and partitions all markers into two subsets: the 'signal' versus the 'noise,' with an a priori specifiable genome-wide error rate. These tests are demonstrated for the Haseman-Elston approach, are applied to real data, and are contrasted with traditional fixed-sampling tests in Monte Carlo simulations of repeated genome-wide scans. The method allows efficient identification of the true signals in a genome scan, uses the smallest possible sample sizes, saves the excess to confirm those findings, controls both types of error, and provides one elegant solution to the debate over the best way to balance false positives and false negatives in genome scans. © 2000 Wiley-Liss, Inc.
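As a rough illustration of how a sequential test can fix both error rates in advance, the sketch below implements Wald's classical sequential probability ratio test for the mean of normally distributed observations with known variance. It is not the Haseman-Elston procedure or the multiple decision procedure described in the abstract; the function name `sprt_normal_mean` and all of its parameters are hypothetical and chosen only for this example. The stopping boundaries use Wald's standard approximations, log((1 − β)/α) and log(β/(1 − α)).

```python
import math


def sprt_normal_mean(observations, mu0, mu1, sigma, alpha, beta):
    """Wald's SPRT for H0: mean = mu0 versus H1: mean = mu1 (known sigma).

    Illustrative only: a textbook SPRT, not the genome-scan procedure
    from the abstract. Returns (decision, number of observations used).
    """
    # Wald's approximate stopping boundaries on the log-likelihood-ratio scale.
    upper = math.log((1 - beta) / alpha)   # reaching this favours H1
    lower = math.log(beta / (1 - alpha))   # reaching this favours H0

    llr = 0.0  # cumulative log-likelihood ratio
    for n, x in enumerate(observations, start=1):
        # Log-likelihood-ratio contribution of one normal observation.
        llr += (mu1 - mu0) * (x - (mu0 + mu1) / 2) / sigma ** 2
        if llr >= upper:
            return "accept H1", n
        if llr <= lower:
            return "accept H0", n

    # Neither boundary was reached with the data supplied.
    return "continue sampling", len(observations)
```

With a fixed sample, the function can be run over the observations in order; any observations left unused when a boundary is crossed correspond to the 'saved' portion of the sample that the abstract proposes to reserve for replication.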
Published: 2000
- Genome-wide scans
- Sequential multiple decision procedures
- Sequential probability ratio tests