TY - JOUR
T1 - CORESH: a gene signature-based search engine for public gene expression datasets
AU - Sukhov, Vladimir
AU - Nugmanova, Aigul
AU - Vorontsov, Yury
AU - Mehrotra, Parul
AU - Kleverov, Maksim
AU - Ravichandran, Kodi
AU - Artyomov, Maxim
AU - Sergushichev, Alexey
N1 - Publisher Copyright: © 2025 The Author(s).
PY - 2025/7/7
Y1 - 2025/7/7
N2 - Public data repositories like Gene Expression Omnibus (GEO) contain an extensive amount of data from hundreds of thousands of experiments, making them a valuable resource for researchers. A common scenario for utilizing this resource is to show transcriptional similarity of one's own data to a public dataset as evidence of potentially similar biology. However, when searching for such datasets, researchers are usually limited to keyword-based search, which requires having a specific hypothesis and relies on the presence of high-quality metadata in public datasets. Here, we introduce CORESH, a web server designed to systematically find GEO datasets that match a user-provided gene signature - such as a list of top upregulated genes in response to a treatment - in a data-driven manner. CORESH operates on a compendium of >40 000 human and 40 000 mouse datasets and outputs a ranked list of datasets where the input genes exhibit similar expression patterns. The discovered datasets can then be used to identify experimental conditions associated with the activation of the query signature, offering insights into underlying biological mechanisms and guiding experimental validation. CORESH is freely accessible at https://alserglab.wustl.edu/coresh/, requires no login, and is regularly updated with the latest GEO data.
UR - https://www.scopus.com/pages/publications/105010731708
U2 - 10.1093/nar/gkaf372
DO - 10.1093/nar/gkaf372
M3 - Article
C2 - 40322919
AN - SCOPUS:105010731708
SN - 0305-1048
VL - 53
SP - W187-W192
JO - Nucleic Acids Research
JF - Nucleic Acids Research
IS - W1
ER -