TY - JOUR
T1 - GAiN
T2 - An integrative tool utilizing generative adversarial neural networks for augmented gene expression analysis
AU - Waters, Michael R.
AU - Inkman, Matthew
AU - Jayachandran, Kay
AU - Kowalchuk, Roman M.
AU - Robinson, Clifford
AU - Schwarz, Julie K.
AU - Swamidass, S. Joshua
AU - Griffith, Obi L.
AU - Szymanski, Jeffrey J.
AU - Zhang, Jin
N1 - Publisher Copyright: © 2023 The Author(s)
PY - 2024/2/9
Y1 - 2024/2/9
N2 - Big genomic data and artificial intelligence (AI) are ushering in an era of precision medicine, providing opportunities to study previously under-represented subtypes and rare diseases rather than categorize them as variances. However, clinical researchers face challenges in accessing such novel technologies as well as reliable methods to study small datasets or subcohorts with unique phenotypes. To address this need, we developed an integrative approach, GAiN, to capture patterns of gene expression from small datasets on the basis of an ensemble of generative adversarial networks (GANs) while leveraging big population data. Where conventional biostatistical methods fail, GAiN reliably discovers differentially expressed genes (DEGs) and enriched pathways between two cohorts with limited numbers of samples (n = 10) when benchmarked against a gold standard. GAiN is freely available at GitHub. Thus, GAiN may serve as a crucial tool for gene expression analysis in scenarios with limited samples, as in the context of rare diseases, under-represented populations, or limited investigator resources.
KW - DSML3: Development/Pre-production: Data science output has been rolled out/validated across multiple domains/problems
KW - deep learning GANs
KW - differential gene expression
KW - gene expression analysis
KW - generative modeling
KW - high-throughput sequencing data
KW - pathway enrichment
KW - small sample sizes
KW - structural gene expression patterns
KW - synthetic RNA expression datasets
UR - http://www.scopus.com/inward/record.url?scp=85184516567&partnerID=8YFLogxK
U2 - 10.1016/j.patter.2023.100910
DO - 10.1016/j.patter.2023.100910
M3 - Article
C2 - 38370125
AN - SCOPUS:85184516567
SN - 2666-3899
VL - 5
JO - Patterns
JF - Patterns
IS - 2
M1 - 100910
ER -