REDHORSE-REcombination and Double crossover detection in Haploid Organisms using next-geneRation SEquencing data

Jahangheer S. Shaik, Asis Khan, Stephen M. Beverley, L. David Sibley

Research output: Contribution to journal › Article › peer-review


Abstract

Background: Next-generation sequencing (NGS) technology provides a means to study genetic exchange at a higher resolution than was possible using earlier technologies. However, this improvement presents a challenge: alignments of NGS data to a reference genome cannot be used directly as input to existing recombination detection algorithms, which typically require multiple sequence alignments instead. We therefore designed a software suite called REDHORSE that takes genomic alignments, extracts genetic markers, and generates multiple sequence alignments that can be used as input to existing recombination detection algorithms. In addition, REDHORSE implements a custom recombination detection algorithm that uses sequence information and genomic positions to accurately detect crossovers. REDHORSE is a portable, platform-independent suite that provides efficient analysis of genetic crosses based on NGS data.

Results: We demonstrated the utility of REDHORSE using simulated data and real NGS data. The simulated dataset mimicked recombination between two known haploid parental strains and allowed detected break points to be compared against known true break points, so that the performance of recombination detection algorithms could be assessed. A newly generated NGS dataset from a genetic cross of Toxoplasma gondii allowed us to demonstrate our pipeline. REDHORSE successfully extracted the relevant genetic markers and transformed the alignments of NGS reads to the genome into multiple sequence alignments. The recombination detection algorithm in REDHORSE detected conventional crossovers as well as double crossovers, which are typically associated with gene conversions, whilst filtering out artifacts that might have been introduced during sequencing or alignment. REDHORSE outperformed other commonly used recombination detection algorithms in finding conventional crossovers. In addition, REDHORSE was the only algorithm that was able to detect double crossovers.

Conclusion: REDHORSE is an efficient analytical pipeline that serves as a bridge between genomic alignments and existing recombination detection algorithms. Moreover, REDHORSE is equipped with a recombination detection algorithm specifically designed for NGS data. REDHORSE is a portable, platform-independent, Java-based utility that provides efficient analysis of genetic crosses based on NGS data. REDHORSE is available at http://redhorse.sourceforge.net/.
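
The general idea of detecting crossovers from positioned markers can be illustrated with a minimal sketch. This is not the REDHORSE algorithm, whose details the abstract does not give; it is a simplified illustration under stated assumptions. It assumes each marker in a haploid progeny has already been assigned to one of two parents (labelled "A" and "B" here), and the names `Block`, `to_blocks`, `merge_adjacent`, `detect_events`, `min_support`, and `max_dco_span` are all hypothetical. Runs of same-parent markers are collapsed into blocks, blocks supported by too few markers are discarded as likely artifacts, and each remaining switch is reported either as a conventional crossover or, when a short block is flanked on both sides by the other parent, as a double crossover.

```python
from typing import List, NamedTuple, Tuple


class Block(NamedTuple):
    parent: str  # parental origin label, "A" or "B" (assumed labels)
    start: int   # position of the first marker in the block
    end: int     # position of the last marker in the block
    n: int       # number of markers supporting the block


def to_blocks(markers: List[Tuple[int, str]]) -> List[Block]:
    """Collapse position-sorted (position, parent) markers into same-parent runs."""
    blocks: List[Block] = []
    for pos, parent in markers:
        if blocks and blocks[-1].parent == parent:
            last = blocks[-1]
            blocks[-1] = Block(parent, last.start, pos, last.n + 1)
        else:
            blocks.append(Block(parent, pos, pos, 1))
    return blocks


def merge_adjacent(blocks: List[Block]) -> List[Block]:
    """Re-join neighbouring blocks that share a parent after filtering."""
    merged: List[Block] = []
    for b in blocks:
        if merged and merged[-1].parent == b.parent:
            last = merged[-1]
            merged[-1] = Block(b.parent, last.start, b.end, last.n + b.n)
        else:
            merged.append(b)
    return merged


def detect_events(
    markers: List[Tuple[int, str]],
    min_support: int = 2,       # hypothetical threshold: fewer markers = artifact
    max_dco_span: int = 5000,   # hypothetical threshold: longest double-crossover block
) -> List[Tuple[str, int, int]]:
    """Return (kind, left_marker_pos, right_marker_pos) for each detected event.

    The break point(s) of an event lie between the two reported marker positions.
    """
    supported = [b for b in to_blocks(markers) if b.n >= min_support]
    blocks = merge_adjacent(supported)
    events: List[Tuple[str, int, int]] = []
    i = 1
    while i < len(blocks):
        prev, cur = blocks[i - 1], blocks[i]
        has_right_flank = i + 1 < len(blocks)
        if has_right_flank and cur.end - cur.start <= max_dco_span:
            # Short block flanked by the other parent on both sides.
            events.append(("double", prev.end, blocks[i + 1].start))
            i += 2
        else:
            events.append(("conventional", prev.end, cur.start))
            i += 1
    return events


if __name__ == "__main__":
    example = [
        (100, "A"), (200, "A"), (300, "A"),
        (400, "B"),                      # single discordant marker, treated as artifact
        (500, "A"), (600, "A"),
        (10000, "B"), (10500, "B"), (11000, "B"),
        (12000, "A"), (20000, "A"), (30000, "A"),
        (50000, "B"), (60000, "B"), (70000, "B"),
    ]
    for event in detect_events(example):
        print(event)
```

On the example markers, the sketch discards the lone "B" marker at position 400, reports one double crossover bounded by the markers at 600 and 12000, and reports one conventional crossover between the markers at 30000 and 50000. The two thresholds are arbitrary illustrative values; a real analysis would need to choose them from marker density and expected conversion tract lengths.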

Original language: English
Article number: 133
Journal: BMC Genomics
Volume: 16
Issue number: 1
State: Published - Dec 12 2015

Keywords

  • Conventional crossovers
  • Double crossovers
  • Haploid genome
  • Merged allele file and allele extraction
  • Multiple sequence alignments
  • Next-generation sequencing
  • Recombination detection
  • Single nucleotide variations
  • Toxoplasma gondii
