TY - JOUR
T1 - AIAP: A Quality Control and Integrative Analysis Package to Improve ATAC-seq Data Analysis
AU - Liu, Shaopeng
AU - Li, Daofeng
AU - Lyu, Cheng
AU - Gontarz, Paul M.
AU - Miao, Benpeng
AU - Madden, Pamela A.F.
AU - Wang, Ting
AU - Zhang, Bo
N1 - Funding Information:
This work was supported by the National Institutes of Health (Grant Nos. U24ES026699, U01HG009391, and R25DA027995), the Goldman Sachs Philanthropy Fund (Emerson Collective), and Chan Zuckerberg Initiative, United States.
Publisher Copyright:
© 2021 The Authors
PY - 2021/8
Y1 - 2021/8
AB - Assay for transposase-accessible chromatin with high-throughput sequencing (ATAC-seq) is a technique widely used to investigate genome-wide chromatin accessibility. The recently published Omni-ATAC-seq protocol substantially improves the signal/noise ratio and reduces the input cell number. High-quality data are critical to ensure accurate analysis. Several tools have been developed for assessing sequencing quality and insertion size distribution for ATAC-seq data; however, key quality control (QC) metrics have not yet been established to accurately determine the quality of ATAC-seq data. Here, we optimized the analysis strategy for ATAC-seq and defined a series of QC metrics for ATAC-seq data, including reads under peak ratio (RUPr), background (BG), promoter enrichment (ProEn), subsampling enrichment (SubEn), and other measurements. We incorporated these QC tests into our recently developed ATAC-seq Integrative Analysis Package (AIAP) to provide a complete ATAC-seq analysis system, including quality assurance, improved peak calling, and downstream differential analysis. We demonstrated a significant improvement of sensitivity (20%–60%) in both peak calling and differential analysis by processing paired-end ATAC-seq datasets using AIAP. AIAP is compiled into Docker/Singularity, and it can be executed by one command line to generate a comprehensive QC report. We used ENCODE ATAC-seq data to benchmark and generate QC recommendations, and developed qATACViewer for the user-friendly interaction with the QC report. The software, source code, and documentation of AIAP are freely available at https://github.com/Zhang-lab/ATAC-seq_QC_analysis.
KW - ATAC-seq
KW - Chromatin accessibility
KW - Data visualization
KW - Differential analysis
KW - Quality control
UR - http://www.scopus.com/inward/record.url?scp=85125627209&partnerID=8YFLogxK
U2 - 10.1016/j.gpb.2020.06.025
DO - 10.1016/j.gpb.2020.06.025
M3 - Article
C2 - 34273560
AN - SCOPUS:85125627209
VL - 19
SP - 641
EP - 651
JO - Genomics, Proteomics and Bioinformatics
JF - Genomics, Proteomics and Bioinformatics
SN - 1672-0229
IS - 4
ER -