A set of Centre d'Étude du Polymorphisme Humain (CEPH) cell lines serves as a large reference collection that has been widely used as a benchmark for allele frequencies in the analysis of genetic variants, to create linkage maps of the human genome, to study the genetics of gene expression, to provide samples to the HapMap and 1000 Genomes projects, and for a variety of other applications. An explicit feature of the CEPH collection is that these multigenerational families represent reference panels of known relatedness, consisting mostly of three-generation pedigrees with large sibships, two parents, and grandparents. We applied identity-by-state (IBS) and identity-by-descent (IBD) methods to high-density genotype data from 186 CEPH individuals in 13 families. We identified unexpected relatedness between nominally unrelated grandparents, both within and between pedigrees. For one pair, the estimated Cotterman coefficient of relatedness k1 exceeded 0.2, consistent with one-eighth sharing (e.g., first cousins). Unexpectedly, significant IBD2 values were discovered in both second-degree and parent-child relationships. These were accompanied by regions of homozygosity in the offspring, which corresponded to blocks lacking IBS0 in purportedly unrelated parents, consistent with inbreeding. Our findings support and extend a 1999 report, based on short tandem-repeat polymorphisms, that several CEPH families had regions of homozygosity consistent with autozygosity. We benchmarked our IBD approach (called kcoeff) against both the RELPAIR and PREST software packages. Our findings may affect the interpretation of previous studies and the design of future studies that rely on the CEPH resource.
- single-nucleotide polymorphisms
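
To illustrate the IBS quantities mentioned above, the following minimal sketch counts IBS0, IBS1, and IBS2 sharing between two individuals at biallelic SNPs. It is not the kcoeff method, which is not described here; the function names, the 0/1/2 genotype coding (copies of one allele), and the toy genotypes are assumptions made for illustration only. IBD coefficients such as k1 additionally require allele frequencies and are not estimated in this sketch.

```python
from collections import Counter


def ibs_state(g1, g2):
    """Number of alleles shared identical by state (0, 1, or 2) at one SNP.

    Genotypes are assumed to be coded as the count (0, 1, or 2) of one
    chosen allele at a biallelic SNP.
    """
    return 2 - abs(g1 - g2)


def ibs_proportions(a, b):
    """Fraction of SNPs in each IBS state for two genotype vectors.

    Positions where either genotype is None (missing) are skipped.
    """
    counts = Counter()
    for g1, g2 in zip(a, b):
        if g1 is None or g2 is None:
            continue
        counts[ibs_state(g1, g2)] += 1
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no SNPs with both genotypes present")
    return {state: counts[state] / total for state in (0, 1, 2)}


# Toy genotypes (hypothetical, not CEPH data).
parent = [0, 1, 2, 1, 0, 2]
child = [0, 0, 1, 1, 1, 2]      # Mendelian-consistent with `parent`
stranger = [2, 1, 0, 1, 2, 0]   # opposite homozygote at several SNPs

parent_child = ibs_proportions(parent, child)
parent_stranger = ibs_proportions(parent, stranger)
print(parent_child)
print(parent_stranger)
```

Barring genotyping error, a parent and child always share at least one allele at every SNP, so their IBS0 fraction is zero. A long block lacking IBS0 between two purportedly unrelated individuals is the kind of signal the abstract describes as consistent with shared ancestry.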