TY - JOUR
T1 - Designing an optimum genetic association study using dense SNP markers and family-based sample
AU - Gu, Chi C.
AU - Rao, D. C.
PY - 2003
Y1 - 2003
N2 - Genetic association analysis using thousands of single nucleotide polymorphism (SNP) markers has become a promising alternative to genome-wide linkage scan. Analysis based on linkage disequilibrium (LD) is more efficient because meiotic information of past generations is utilized. However, in addition to the physical distance between the disease locus and a marker locus, numerous other factors such as admixture, genetic drift, and multiple mutations can affect the observed value of LD. The effect of these factors in a genomic LD association study must be carefully analyzed to obtain an efficient study design. In the following review, we consider studies using family-based data and carefully study the effects of some of these important design factors, including the sample size, frequency of SNP markers, and marker density. For example, we conclude that (1) for reasonably frequent SNP markers, a moderately large sample of 500 families is appropriate for a moderately stringent significance level (α = 0.00009); (2) to maintain a power of 80%, maximal difference in allele frequencies between the disease gene and a SNP marker varies between 0.1 (under additive model) and 0.5 (multiplicative); (3) a map density of 10 cM is appropriate only under ideal scenario (moderately large sample size, equal trait/marker allele frequencies, maximum LD strength, etc.). Results shown here should have practical implications for designing efficient LD association studies using dense SNP markers.
AB - Genetic association analysis using thousands of single nucleotide polymorphism (SNP) markers has become a promising alternative to genome-wide linkage scan. Analysis based on linkage disequilibrium (LD) is more efficient because meiotic information of past generations is utilized. However, in addition to the physical distance between the disease locus and a marker locus, numerous other factors such as admixture, genetic drift, and multiple mutations can affect the observed value of LD. The effect of these factors in a genomic LD association study must be carefully analyzed to obtain an efficient study design. In the following review, we consider studies using family-based data and carefully study the effects of some of these important design factors, including the sample size, frequency of SNP markers, and marker density. For example, we conclude that (1) for reasonably frequent SNP markers, a moderately large sample of 500 families is appropriate for a moderately stringent significance level (α = 0.00009); (2) to maintain a power of 80%, maximal difference in allele frequencies between the disease gene and a SNP marker varies between 0.1 (under additive model) and 0.5 (multiplicative); (3) a map density of 10 cM is appropriate only under ideal scenario (moderately large sample size, equal trait/marker allele frequencies, maximum LD strength, etc.). Results shown here should have practical implications for designing efficient LD association studies using dense SNP markers.
KW - Complex Disease
KW - Family-Based Sample
KW - Genetic Association
KW - Linkage Disequilibrium
KW - Power Analysis
KW - Review
KW - SNP
UR - http://www.scopus.com/inward/record.url?scp=0345705295&partnerID=8YFLogxK
U2 - 10.2741/882
DO - 10.2741/882
M3 - Review article
C2 - 12456306
AN - SCOPUS:0345705295
SN - 2768-6701
VL - 8
SP - s68-s80
JO - Frontiers in Bioscience
JF - Frontiers in Bioscience
IS - SUPPL.
ER -