Estimation of allele frequencies from color-multiplexed electropherograms

David G. Politte, David R. Maffitt, David J. States

Research output: Contribution to conference › Paper › peer-review

1 Scopus citation


Parametric model fitting of unprocessed sequencing-gel trace data and a least-squares optimization algorithm provide a method for accurately determining allele frequencies of single nucleotide substitutions in a population. The method uses trace data from two homozygous individuals and from either a heterozygous individual or a mixed population of templates. A parametric model is fit to each of the traces to estimate the amount of each of the four fluorescent dyes that is present at each site. The parameters estimated from each trace are then normalized to account for scalar variations due to differences in the amount of sample loaded. The parameters estimated from the trace of the heterozygous individual or from the mixture are viewed as a weighted sum of the parameters estimated from the traces of the homozygous individuals. The weights, or allele frequencies, are estimated by minimizing the sum of squared errors between the linear combination of homozygous traces and the mixed trace. Comparison of allele frequencies estimated by our method to known frequencies at polymorphic sites in three pools of CEPH individuals shows that our method is accurate. Our method is automatic and much less labor-intensive than previous approaches.
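The final estimation step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the per-site dye amounts have already been estimated by the parametric fit, that normalization means scaling each vector to unit sum, and that the two weights sum to one so a single frequency p is estimated. The function names and the sample numbers are hypothetical.

```python
def normalize(params):
    """Scale estimated dye amounts to unit sum (assumed normalization scheme)."""
    total = sum(params)
    if total <= 0:
        raise ValueError("dye amounts must have a positive sum")
    return [x / total for x in params]


def estimate_allele_frequency(hom_a, hom_b, mixed):
    """Least-squares weight p of allele A in the mixed trace.

    Minimizes sum((p*a + (1 - p)*b - m)**2) over p in closed form,
    then clips the result to the interval [0, 1].
    """
    a, b, m = normalize(hom_a), normalize(hom_b), normalize(mixed)
    d = [ai - bi for ai, bi in zip(a, b)]
    denom = sum(di * di for di in d)
    if denom == 0:
        raise ValueError("homozygous traces are indistinguishable")
    p = sum((mi - bi) * di for mi, bi, di in zip(m, b, d)) / denom
    return min(1.0, max(0.0, p))


# Hypothetical dye amounts (four dyes) at one polymorphic site.
hom_a = [10.0, 0.5, 0.3, 0.2]
hom_b = [0.4, 0.6, 12.0, 0.2]

# Synthetic mixture: 30% allele A, loaded at 5x the sample amount.
mixed = [5.0 * (0.3 * ai + 0.7 * bi)
         for ai, bi in zip(normalize(hom_a), normalize(hom_b))]

p = estimate_allele_frequency(hom_a, hom_b, mixed)
print(round(p, 6))
```

Because each vector is normalized before fitting, the fivefold difference in loaded sample does not affect the estimate. A real pipeline would likely combine parameters from several neighboring sites rather than a single four-dye vector.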

Original language: English
Number of pages: 5
State: Published - 1998
Event: Proceedings of the 1998 2nd Annual International Conference on Computational Molecular Biology - New York, NY, USA
Duration: Mar 22 1998 → Mar 25 1998




