TY - JOUR
T1 - YAHA: Fast and flexible long-read alignment with optimal breakpoint detection
AU - Faust, Gregory G.
AU - Hall, Ira M.
N1 - Funding Information:
Funding: This research was supported by a National Institutes of Health (NIH) New Innovator Award DP2OD006493-01 (I.H.), a Burroughs Wellcome Fund Career Award (I.H.) and an NIH Biotechnology Training Grant T32 GM08715 (G.F.).
PY - 2012/10
Y1 - 2012/10
N2 - Motivation: With improved short-read assembly algorithms and the recent development of long-read sequencers, split mapping will soon be the preferred method for structural variant (SV) detection. Yet, current alignment tools are not well suited for this. Results: We present YAHA, a fast and flexible hash-based aligner. YAHA is as fast and accurate as BWA-SW at finding the single best alignment per query and is dramatically faster and more sensitive than both SSAHA2 and MegaBLAST at finding all possible alignments. Unlike other aligners that report all, or one, alignment per query, or that use simple heuristics to select alignments, YAHA uses a directed acyclic graph to find the optimal set of alignments that cover a query using a biologically relevant breakpoint penalty. YAHA can also report multiple mappings per defined segment of the query. We show that YAHA detects more breakpoints in less time than BWA-SW across all SV classes, and especially excels at complex SVs comprising multiple breakpoints.
AB - Motivation: With improved short-read assembly algorithms and the recent development of long-read sequencers, split mapping will soon be the preferred method for structural variant (SV) detection. Yet, current alignment tools are not well suited for this. Results: We present YAHA, a fast and flexible hash-based aligner. YAHA is as fast and accurate as BWA-SW at finding the single best alignment per query and is dramatically faster and more sensitive than both SSAHA2 and MegaBLAST at finding all possible alignments. Unlike other aligners that report all, or one, alignment per query, or that use simple heuristics to select alignments, YAHA uses a directed acyclic graph to find the optimal set of alignments that cover a query using a biologically relevant breakpoint penalty. YAHA can also report multiple mappings per defined segment of the query. We show that YAHA detects more breakpoints in less time than BWA-SW across all SV classes, and especially excels at complex SVs comprising multiple breakpoints.
UR - http://www.scopus.com/inward/record.url?scp=84867316247&partnerID=8YFLogxK
U2 - 10.1093/bioinformatics/bts456
DO - 10.1093/bioinformatics/bts456
M3 - Article
C2 - 22829624
AN - SCOPUS:84867316247
SN - 1367-4803
VL - 28
SP - 2417
EP - 2424
JO - Bioinformatics
JF - Bioinformatics
IS - 19
M1 - bts456
ER -