TY - JOUR
T1 - Measuring marker information content by the ambiguity of block boundaries observed in dense SNP data
AU - Gu, C. Charles
AU - Yu, Kai
AU - Boerwinkle, Eric
PY - 2007/1
Y1 - 2007/1
N2 - Recent studies have noted that the boundaries of common haplotype blocks in hapmap constructions involve a certain degree of ambiguity, and so do the resulting "tagSNPs". Here, we report how to address this issue at the level of individual SNP markers. We introduce a measure called the marker ambiguity score (MAS), and evaluate its utility by simulation studies based on a real dataset of 2949 SNPs spanning a region of 56.1 Mbp. We show that the MAS method can be used to assess the level of boundary ambiguity caused by varying ethnic background, sample sizes for hapmap construction, and disease aggregation. We find a striking difference in overall patterns of block boundary distributions in two ethnic groups (blacks and whites), and subtle changes in block structures that agree with the evolutionary history of the two populations. Our analyses suggest that a sample size of 200 or more subjects is probably needed for "stable" hapmap constructions. In addition, we demonstrate that there are subtle changes in block boundaries in hapmaps constructed in disease populations versus normal controls. This approach can quantify the information content of individual markers in the context of highly dense SNP data, which may have important implications in designing efficient genome-wide association mapping projects.
AB - Recent studies have noted that the boundaries of common haplotype blocks in hapmap constructions involve a certain degree of ambiguity, and so do the resulting "tagSNPs". Here, we report how to address this issue at the level of individual SNP markers. We introduce a measure called the marker ambiguity score (MAS), and evaluate its utility by simulation studies based on a real dataset of 2949 SNPs spanning a region of 56.1 Mbp. We show that the MAS method can be used to assess the level of boundary ambiguity caused by varying ethnic background, sample sizes for hapmap construction, and disease aggregation. We find a striking difference in overall patterns of block boundary distributions in two ethnic groups (blacks and whites), and subtle changes in block structures that agree with the evolutionary history of the two populations. Our analyses suggest that a sample size of 200 or more subjects is probably needed for "stable" hapmap constructions. In addition, we demonstrate that there are subtle changes in block boundaries in hapmaps constructed in disease populations versus normal controls. This approach can quantify the information content of individual markers in the context of highly dense SNP data, which may have important implications in designing efficient genome-wide association mapping projects.
KW - Ethnic origin
KW - Genome-wide association
KW - Haplotype block structure
KW - Hapmap
KW - Sample sizes
KW - Tagging SNPs
UR - http://www.scopus.com/inward/record.url?scp=33845260536&partnerID=8YFLogxK
U2 - 10.1111/j.1469-1809.2006.00315.x
DO - 10.1111/j.1469-1809.2006.00315.x
M3 - Article
C2 - 16984487
AN - SCOPUS:33845260536
SN - 0003-4800
VL - 71
SP - 127
EP - 140
JO - Annals of Human Genetics
JF - Annals of Human Genetics
IS - 1
ER -