Leaf: An open-source, model-agnostic, data-driven web application for cohort discovery and translational biomedical research

Nicholas J. Dobbins, Clifford H. Spital, Robert A. Black, Jason M. Morrison, Bas De Veer, Elizabeth Zampino, Robert D. Harrington, Bethene D. Britt, Kari A. Stephens, Adam B. Wilcox, Peter Tarczy-Hornoch, Sean D. Mooney

Research output: Contribution to journal, Article, peer-review

31 Scopus citations


Objective: Academic medical centers and health systems are increasingly challenged with supporting appropriate secondary use of clinical data. Enterprise data warehouses have emerged as central resources for these data, but often require an informatician to extract meaningful information, limiting direct access by end users. To overcome this challenge, we have developed Leaf, a lightweight self-service web application for querying clinical data from heterogeneous data models and sources.

Materials and Methods: Leaf utilizes a flexible biomedical concept system to define hierarchical concepts and ontologies. Each Leaf concept contains both textual representations and SQL query building blocks, exposed by a simple drag-and-drop user interface. Leaf generates abstract syntax trees which are compiled into dynamic SQL queries.

Results: Leaf is a successful production-supported tool at the University of Washington, which hosts a central Leaf instance querying an enterprise data warehouse with over 300 active users. Through the support of UW Medicine (https://uwmedicine.org), the Institute of Translational Health Sciences (https://www.iths.org), and the National Center for Data to Health (https://ctsa.ncats.nih.gov/cd2h/), Leaf source code has been released into the public domain at https://github.com/uwrit/leaf.

Discussion: Leaf allows the querying of single or multiple clinical databases simultaneously, even those of different data models. This enables fast installation without costly extraction or duplication.

Conclusions: Leaf differs from existing cohort discovery tools because it does not specify a required data model and is designed to seamlessly leverage existing user authentication systems and clinical databases in situ. We believe Leaf to be useful for health system analytics, clinical research data warehouses, precision medicine biobanks, and clinical studies involving large patient cohorts.
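The approach outlined under Materials and Methods, in which concepts carry SQL building blocks and a tree of concepts is compiled into a dynamic SQL query, can be illustrated with a minimal Python sketch. This is not Leaf's actual implementation: the class names (`Concept`, `And`, `Or`, `Not`), the functions (`compile_node`, `compile_query`, `demo_count`), the `person_id` key column, and the toy tables are all hypothetical, chosen only to show the general idea on an in-memory SQLite database.

```python
import sqlite3
from dataclasses import dataclass
from typing import Tuple, Union

PATIENT_ID = "person_id"  # assumed patient key column (hypothetical)


@dataclass(frozen=True)
class Concept:
    """Hypothetical concept: display text plus SQL building blocks."""
    label: str      # text shown to the user
    sql_from: str   # table or view holding the data
    sql_where: str  # predicate over alias "c" selecting matching rows


@dataclass(frozen=True)
class And:
    children: Tuple["Node", ...]


@dataclass(frozen=True)
class Or:
    children: Tuple["Node", ...]


@dataclass(frozen=True)
class Not:
    child: "Node"


Node = Union[Concept, And, Or, Not]


def compile_node(node: Node) -> str:
    """Compile one tree node into a SQL boolean expression over alias "p"."""
    if isinstance(node, Concept):
        # Concept fragments are assumed to be curated by an administrator,
        # never typed by end users, so they are interpolated directly.
        return (f"EXISTS (SELECT 1 FROM {node.sql_from} AS c "
                f"WHERE c.{PATIENT_ID} = p.{PATIENT_ID} AND {node.sql_where})")
    if isinstance(node, (And, Or)):
        if not node.children:
            raise ValueError("boolean node needs at least one child")
        joiner = " AND " if isinstance(node, And) else " OR "
        return "(" + joiner.join(compile_node(c) for c in node.children) + ")"
    if isinstance(node, Not):
        return f"NOT {compile_node(node.child)}"
    raise TypeError(f"unsupported node: {node!r}")


def compile_query(root: Node, patient_table: str = "person") -> str:
    """Wrap the compiled expression in a patient-count query."""
    return (f"SELECT COUNT(DISTINCT p.{PATIENT_ID}) "
            f"FROM {patient_table} AS p WHERE {compile_node(root)}")


def demo_count() -> int:
    """Run a compiled query against a toy in-memory database."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE person (person_id INTEGER PRIMARY KEY);
        CREATE TABLE diagnosis (person_id INTEGER, code TEXT);
        CREATE TABLE medication (person_id INTEGER, drug TEXT);
        INSERT INTO person VALUES (1), (2), (3);
        INSERT INTO diagnosis VALUES (1, 'E11'), (2, 'E11'), (3, 'I10');
        INSERT INTO medication VALUES (2, 'insulin');
    """)
    diabetes = Concept("Type 2 diabetes", "diagnosis", "c.code = 'E11'")
    insulin = Concept("Insulin", "medication", "c.drug = 'insulin'")
    tree = And((diabetes, Not(insulin)))  # diabetes AND NOT insulin
    count = db.execute(compile_query(tree)).fetchone()[0]
    db.close()
    return count
```

In this sketch, because each concept names its own table and predicate, the same tree structure could in principle be pointed at a different schema by changing only the concept definitions, which is one way to read the abstract's "model-agnostic" claim. Calling `demo_count()` on the toy data counts the single patient who has the diagnosis code but no insulin record.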

Original language: English
Pages (from-to): 109-118
Number of pages: 10
Journal: Journal of the American Medical Informatics Association
Issue number: 1
State: Published - Jan 1 2020


  • biomedical informatics
  • cloud computing
  • cohort discovery
  • data integration
  • leaf
  • observational health data sciences and informatics

