Automated sequence preprocessing in a large-scale sequencing environment

Michael C. Wendl, Simon Dear, Dave Hodgson, La Deana Hillier

Research output: Contribution to journal, Article, peer-review

27 Scopus citations


A software system for transforming fragments from four-color fluorescence-based gel electrophoresis experiments into assembled sequence is described. It has been developed for large-scale processing of all trace data, including shotgun and finishing reads, regardless of clone origin. Design considerations are discussed in detail, as are programming implementation and graphic tools. The importance of input validation, record tracking, and use of base quality values is emphasized. Several quality analysis metrics are proposed and applied to sample results from recently sequenced clones. Such quantities prove to be a valuable aid in evaluating modifications of sequencing protocol. The system is in full production use at both the Genome Sequencing Center and the Sanger Centre, whose combined production is ~100,000 sequencing reads per week.

Original language: English
Pages (from-to): 975-984
Number of pages: 10
Journal: Genome Research
Issue number: 9
State: Published - Sep 1998

