Recent studies have noted that the boundaries of common haplotype blocks in hapmap constructions involve a certain degree of ambiguity, as do the resulting "tagSNPs". Here, we report how to address this issue at the level of individual SNP markers. We introduce a measure called the marker ambiguity score (MAS) and evaluate its utility by simulation studies based on a real dataset of 2949 SNPs spanning a region of 56.1 Mbp. We show that the MAS method can be used to assess the level of boundary ambiguity caused by varying ethnic background, sample size for hapmap construction, and disease aggregation. We find a striking difference in the overall patterns of block boundary distributions between two ethnic groups (blacks and whites), as well as subtle changes in block structures that agree with the evolutionary history of the two populations. Our analyses suggest that a sample size of 200 or more subjects is probably needed for "stable" hapmap constructions. In addition, we demonstrate that there are subtle changes in block boundaries between hapmaps constructed in disease populations and those constructed in normal controls. This approach can quantify the information content of individual markers in the context of highly dense SNP data, which may have important implications for designing efficient genome-wide association mapping projects.

Original language: English
Pages (from-to): 127-140
Number of pages: 14
Journal: Annals of Human Genetics
Issue number: 1
State: Published - Jan 2007


  • Ethnic origin
  • Genome-wide association
  • Haplotype block structure
  • Hapmap
  • Sample sizes
  • Tagging SNPs


The topics above are the research topics of 'Measuring marker information content by the ambiguity of block boundaries observed in dense SNP data'.
