TY - JOUR
T1 - Meta-analysis of crowdsourced data compendia suggests pan-disease transcriptional signatures of autoimmunity
AU - The OMiCC Jamboree Working Group
AU - Lau, William W.
AU - Sparks, Rachel
AU - Tsang, John S.
AU - Austin, James
AU - Bansal, Neha
AU - Candia, Julián
AU - Dancy, Ehren
AU - Elkins, Karen L.
AU - Faghihi-Kashani, Sara
AU - Gomez-Rodriguez, Julio
AU - Guedez, Liliana
AU - Guo, Yongjian
AU - Gutierrez, Maria J.
AU - Ho, Trung
AU - Horai, Reiko
AU - Huh, Sunmee
AU - Iwamura, Chie
AU - Joy, Jaimy
AU - Kang, Ju Gyeong
AU - Kaul, Sunil
AU - Lewandowski, Laura B.
AU - Liu, Candace
AU - Lu, Yong
AU - Manes, Nathan P.
AU - Mattapallil, Mary J.
AU - Memon, Sarfraz
AU - Rahman, M. Jubayer
AU - Rodrigues, Kameron B.
AU - Silva, Bruno
AU - Singh, Amit
AU - St. Leger, Anthony J.
AU - Tang, Jessica
AU - Thorpe, Abigail
AU - Xie, Hang
AU - Zhao, Yongge
AU - Zimmerman, Ofer
N1 - Publisher Copyright: © 2016 Lau WW et al.
PY - 2016
Y1 - 2016
N2 - Background: The proliferation of publicly accessible large-scale biological data together with increasing availability of bioinformatics tools have the potential to transform biomedical research. Here we report a crowdsourcing Jamboree that explored whether a team of volunteer biologists without formal bioinformatics training could use OMiCC, a crowdsourcing web platform that facilitates the reuse and (meta-) analysis of public gene expression data, to compile and annotate gene expression data, and design comparisons between disease and control sample groups. Methods: The Jamboree focused on several common human autoimmune diseases, including systemic lupus erythematosus (SLE), multiple sclerosis (MS), type I diabetes (DM1), and rheumatoid arthritis (RA), and the corresponding mouse models. Meta-analyses were performed in OMiCC using comparisons constructed by the participants to identify 1) gene expression signatures for each disease (disease versus healthy controls at the gene expression and biological pathway levels), 2) conserved signatures across all diseases within each species (pan-disease signatures), and 3) conserved signatures between species for each disease and across all diseases (cross-species signatures). Results: A large number of differentially expressed genes were identified for each disease based on meta-analysis, with observed overlap among diseases both within and across species. Gene set/pathway enrichment of upregulated genes suggested conserved signatures (e.g., interferon) across all human and mouse conditions. Conclusions: Our Jamboree exercise provides evidence that when enabled by appropriate tools, a "crowd" of biologists can work together to accelerate the pace by which the increasingly large amounts of public data can be reused and meta-analyzed for generating and testing hypotheses. Our encouraging experience suggests that a similar crowdsourcing approach can be used to explore other biological questions.
KW - Autoimmunity
KW - Crowdsourcing
KW - Gene expression
KW - Human and mouse comparison
KW - Meta-analysis
KW - Mouse models of disease
KW - Public data
UR - http://www.scopus.com/inward/record.url?scp=85006816750&partnerID=8YFLogxK
U2 - 10.12688/f1000research.10465.1
DO - 10.12688/f1000research.10465.1
M3 - Comment/debate
AN - SCOPUS:85006816750
SN - 2046-1402
VL - 5
JO - F1000Research
JF - F1000Research
M1 - 2884
ER -