TY - JOUR
T1 - Inverting the model of genomics data sharing with the NHGRI Genomic Data Science Analysis, Visualization, and Informatics Lab-space
AU - AnVIL Team
AU - Schatz, Michael C.
AU - Philippakis, Anthony A.
AU - Afgan, Enis
AU - Banks, Eric
AU - Carey, Vincent J.
AU - Carroll, Robert J.
AU - Culotti, Alessandro
AU - Ellrott, Kyle
AU - Goecks, Jeremy
AU - Grossman, Robert L.
AU - Hall, Ira M.
AU - Hansen, Kasper D.
AU - Lawson, Jonathan
AU - Leek, Jeffrey T.
AU - O'Donnell-Luria, Anne
AU - Mosher, Stephen
AU - Morgan, Martin
AU - Nekrutenko, Anton
AU - O'Connor, Brian D.
AU - Osborn, Kevin
AU - Paten, Benedict
AU - Patterson, Candace
AU - Tan, Frederick J.
AU - Taylor, Casey Overby
AU - Vessio, Jennifer
AU - Waldron, Levi
AU - Wang, Ting
AU - Wuichet, Kristin
AU - Baumann, Alexander
AU - Rula, Andrew
AU - Kovalsky, Anton
AU - Bernard, Clare
AU - Caetano-Anollés, Derek
AU - Van der Auwera, Geraldine A.
AU - Canas, Justin
AU - Yuksel, Kaan
AU - Herman, Kate
AU - Taylor, M. Morgan
AU - Simeon, Marianie
AU - Baumann, Michael
AU - Wang, Qi
AU - Title, Robert
AU - Munshi, Ruchi
AU - Chaluvadi, Sushma
AU - Reeves, Valerie
AU - Disman, William
AU - Thomas, Salin
AU - Hajian, Allie
AU - Kiernan, Elizabeth
AU - Gupta, Namrata
AU - Vosburg, Trish
AU - Geistlinger, Ludwig
AU - Ramos, Marcel
AU - Oh, Sehyun
AU - Rogers, Dave
AU - McDade, Frances
AU - Hastie, Mim
AU - Turaga, Nitesh
AU - Ostrovsky, Alexander
AU - Mahmoud, Alexandru
AU - Baker, Dannon
AU - Clements, Dave
AU - Cox, Katherine E.L.
AU - Suderman, Keith
AU - Kucher, Nataliya
AU - Golitsynskiy, Sergey
AU - Zarate, Samantha
AU - Wheelan, Sarah J.
AU - Kammers, Kai
AU - Stevens, Ana
AU - Hutter, Carolyn
AU - Wellington, Christopher
AU - Ghanaim, Elena M.
AU - Wiley, Ken L.
AU - Sen, Shurjo K.
AU - Di Francesco, Valentina
AU - Yuen, Denis
AU - Walsh, Brian
AU - Sargent, Luke
AU - Jalili, Vahid
AU - Chilton, John
AU - Shepherd, Lori
AU - Stubbs, B. J.
AU - O'Farrell, Ash
AU - Vizzier, Benton A.
AU - Overbeck, Charles
AU - Reid, Charles
AU - Steinberg, David Charles
AU - Sheets, Elizabeth A.
AU - Lucas, Julian
AU - Blauvelt, Lon
AU - Cabansay, Louise
AU - Warren, Noah
AU - Hannafious, Brian
AU - Harris, Tim
AU - Reddy, Radhika
AU - Torstenson, Eric
AU - Banasiewicz, M. Katie
AU - Abel, Haley J.
AU - Walker, Jason
N1 - Publisher Copyright: © 2021 The Author(s)
PY - 2022/1/12
Y1 - 2022/1/12
N2 - The NHGRI Genomic Data Science Analysis, Visualization, and Informatics Lab-space (AnVIL; https://anvilproject.org) was developed to address a widespread community need for a unified computing environment for genomics data storage, management, and analysis. In this perspective, we present AnVIL, describe its ecosystem and interoperability with other platforms, and highlight how this platform and associated initiatives contribute to improved genomic data sharing efforts. The AnVIL is a federated cloud platform designed to manage and store genomics and related data, enable population-scale analysis, and facilitate collaboration through the sharing of data, code, and analysis results. By inverting the traditional model of data sharing, the AnVIL eliminates the need for data movement while also adding security measures for active threat detection and monitoring and provides scalable, shared computing resources for any researcher. We describe the core data management and analysis components of the AnVIL, which currently consists of Terra, Gen3, Galaxy, RStudio/Bioconductor, Dockstore, and Jupyter, and describe several flagship genomics datasets available within the AnVIL. We continue to extend and innovate the AnVIL ecosystem by implementing new capabilities, including mechanisms for interoperability and responsible data sharing, while streamlining access management. The AnVIL opens many new opportunities for analysis, collaboration, and data sharing that are needed to drive research and to make discoveries through the joint analysis of hundreds of thousands to millions of genomes along with associated clinical and molecular data types.
UR - http://www.scopus.com/inward/record.url?scp=85127480763&partnerID=8YFLogxK
U2 - 10.1016/j.xgen.2021.100085
DO - 10.1016/j.xgen.2021.100085
M3 - Review article
C2 - 35199087
AN - SCOPUS:85127480763
SN - 2666-979X
VL - 2
JO - Cell Genomics
JF - Cell Genomics
IS - 1
M1 - 100085
ER -