Abstract

Public data repositories such as the Gene Expression Omnibus (GEO) contain an extensive amount of data from hundreds of thousands of experiments, making them a valuable resource for researchers. A common scenario for utilizing this resource is to show transcriptional similarity of one's own data to a public dataset as evidence of potentially similar biology. However, when searching for such datasets, researchers are usually limited to keyword-based search, which requires having a specific hypothesis and relies on the presence of high-quality metadata in public datasets. Here, we introduce CORESH, a web server designed to systematically find GEO datasets that match a user-provided gene signature (such as a list of top upregulated genes in response to a treatment) in a data-driven manner. CORESH operates on a compendium of >40 000 human and 40 000 mouse datasets and outputs a ranked list of datasets where the input genes exhibit similar expression patterns. The discovered datasets can then be used to identify experimental conditions associated with the activation of the query signature, offering insights into underlying biological mechanisms and guiding experimental validation. CORESH is freely accessible at https://alserglab.wustl.edu/coresh/, requires no login, and is regularly updated with the latest GEO data.
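
To make the idea of ranking datasets by a gene signature concrete, the following toy sketch scores each dataset by how coherently the signature genes vary across its samples. It is an illustration only: the abstract does not describe CORESH's scoring method, so the mean pairwise Pearson correlation used here is a stand-in, and the function names (`coexpression_score`, `rank_datasets`), gene symbols, and dataset names are all hypothetical.

```python
import numpy as np


def coexpression_score(genes, expr, signature):
    """Mean pairwise Pearson correlation of signature genes across samples.

    genes: list of gene symbols, one per row of expr
    expr:  2-D array, rows = genes, columns = samples
    NOTE: an assumed stand-in score, not CORESH's actual method.
    """
    rows = [i for i, g in enumerate(genes) if g in signature]
    if len(rows) < 2:
        return float("nan")  # too few signature genes to compare
    corr = np.corrcoef(expr[rows])
    # Keep each gene pair once (upper triangle, excluding the diagonal).
    upper = corr[np.triu_indices(len(rows), k=1)]
    return float(np.mean(upper))


def rank_datasets(datasets, signature):
    """Return (name, score) pairs sorted from most to least coherent."""
    signature = set(signature)
    scored = []
    for name, (genes, expr) in datasets.items():
        score = coexpression_score(genes, expr, signature)
        if not np.isnan(score):
            scored.append((name, score))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy compendium: two datasets, four genes, four samples each.
genes = ["G1", "G2", "G3", "G4"]
datasets = {
    # Signature genes G1-G3 rise together across samples.
    "toy_A": (genes, np.array([[1, 2, 3, 4],
                               [2, 4, 6, 8],
                               [1, 3, 5, 7],
                               [4, 1, 3, 2]], dtype=float)),
    # Signature genes move in unrelated or opposite directions.
    "toy_B": (genes, np.array([[1, 2, 3, 4],
                               [4, 3, 2, 1],
                               [1, 4, 2, 3],
                               [2, 2, 3, 1]], dtype=float)),
}

ranking = rank_datasets(datasets, signature=["G1", "G2", "G3"])
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

In this sketch the dataset in which the signature genes move together ranks first. A real search over a compendium of this size would also need gene identifier harmonization and a more robust statistic, neither of which is attempted here.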

Original language: English
Pages (from-to): W187-W192
Journal: Nucleic Acids Research
Volume: 53
Issue number: W1
State: Published - Jul 7 2025

Title: CORESH: A gene signature-based search engine for public gene expression datasets
