TY - JOUR
T1 - Comparison of differential accessibility analysis strategies for ATAC-seq data
AU - Gontarz, Paul
AU - Fu, Shuhua
AU - Xing, Xiaoyun
AU - Liu, Shaopeng
AU - Miao, Benpeng
AU - Bazylianska, Viktoriia
AU - Sharma, Akhil
AU - Madden, Pamela
AU - Cates, Kitra
AU - Yoo, Andrew
AU - Moszczynska, Anna
AU - Wang, Ting
AU - Zhang, Bo
N1 - Publisher Copyright: © 2020, The Author(s).
PY - 2020/12/1
Y1 - 2020/12/1
AB - ATAC-seq is widely used to measure chromatin accessibility and identify open chromatin regions (OCRs). OCRs usually indicate active regulatory elements in the genome and are directly associated with the gene regulatory network. The identification of differential accessibility regions (DARs) between different biological conditions is critical in determining the differential activity of regulatory elements. Differential analysis of ATAC-seq shares many similarities with differential expression analysis of RNA-seq data. However, the distribution of ATAC-seq signal intensity is different from that of RNA-seq data, and higher sensitivity is required for DAR identification. Many different tools can be used to perform differential analysis of ATAC-seq data, but a comprehensive comparison and benchmarking of these methods is still lacking. Here, we used simulated datasets to systematically measure the sensitivity and specificity of six different methods. We further discussed the statistical and signal density cut-offs in the differential analysis of ATAC-seq by applying them to real data. Batch effects are very common in high-throughput sequencing experiments. We illustrated that batch-effect correction can dramatically improve sensitivity in the differential analysis of ATAC-seq data. Finally, we developed a user-friendly package, BeCorrect, to perform batch effect correction and visualization of corrected ATAC-seq signals in a genome browser.
UR - http://www.scopus.com/inward/record.url?scp=85086777968&partnerID=8YFLogxK
U2 - 10.1038/s41598-020-66998-4
DO - 10.1038/s41598-020-66998-4
M3 - Article
C2 - 32576878
AN - SCOPUS:85086777968
SN - 2045-2322
VL - 10
JO - Scientific Reports
JF - Scientific Reports
IS - 1
M1 - 10150
ER -