Abstract

Objective: The diversity of nomenclature and naming strategies makes therapeutic terminology difficult to manage and harmonize. As the number and complexity of available therapeutic ontologies continue to increase, the need for harmonized cross-resource mappings is becoming increasingly apparent. This study creates harmonized concept mappings that link like concepts despite source-dependent differences in data structure or semantic representation.

Materials and Methods: We created Thera-Py, a Python package and web API that constructs searchable concepts for drugs and therapeutic terminologies using 9 public resources and thesauri. By using a directed graph approach, Thera-Py captures commonly used aliases, trade names, annotations, and associations for any given therapeutic and combines them under a single concept record.

Results: We highlight the creation of 16 069 unique merged therapeutic concepts from 9 distinct sources using Thera-Py and observe an increase in the overlap of therapeutic concepts found in 2 or more knowledge bases after harmonization using Thera-Py (9.8% to 41.8%).

Conclusion: We observe that Thera-Py tends to normalize therapeutic concepts to their underlying active ingredients (excluding nondrug therapeutics, eg, radiation therapy, biologics), and unifies all available descriptors regardless of ontological origin.

Lay Summary

Working with therapeutic terminology in medicine is challenging due to the ambiguity associated with different naming strategies. A therapeutic can have many different types of identifiers across many vocabularies: natural product names, chemical structures, development codes, generic names, brand names, product formulations, or treatment regimens. This diversity of nomenclature makes therapeutic terminology uniquely difficult to manage, and the need for harmonized cross-resource mappings is becoming increasingly apparent.
To support these mappings, we introduce Thera-Py, a Python package and web API that constructs stable, searchable therapeutic concepts for drugs and therapeutic terminology. By using a directed graph approach, Thera-Py captures commonly used aliases, trade names, annotations, and associations for any given therapeutic and harmonizes them under a single merged concept record. Using this approach, we found that Thera-Py tends to normalize therapeutic concepts to their underlying active ingredients (excluding nondrug therapeutics, eg, radiation therapy, biologics) and unifies all available descriptors regardless of ontological origin. In this report, we highlight the creation of 16 069 unique merged therapeutic concepts from 9 distinct sources and observe an increased overlap of therapeutic concepts in commonly used knowledge bases after harmonization using Thera-Py.
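
The directed graph idea described above can be illustrated with a minimal Python sketch. This is not Thera-Py's actual implementation or API: the record layout, the source prefixes (`srcA`, `srcB`, `srcC`), the drug names, and the function names are all hypothetical, and the merge rule (grouping every record reachable through cross-references in either direction) is an assumption made for illustration. The last helper shows one plausible way to measure how many concepts span 2 or more sources before and after merging.

```python
from collections import defaultdict

# Hypothetical toy records: identifiers, labels, and cross-references are
# invented for illustration and do not come from any real source.
RECORDS = {
    "srcA:1": {"label": "exampledrug", "aliases": ["EX-100"], "xrefs": ["srcB:7"]},
    "srcB:7": {"label": "Exampledrug", "aliases": ["Brandex"], "xrefs": ["srcC:42"]},
    "srcC:42": {"label": "exampledrug hydrochloride", "aliases": [], "xrefs": []},
    "srcA:2": {"label": "otherdrug", "aliases": ["OT-9"], "xrefs": []},
}


def build_graph(records):
    """Directed graph: an edge u -> v means record u cross-references v."""
    graph = {rid: set() for rid in records}
    for rid, rec in records.items():
        for target in rec["xrefs"]:
            if target in records:  # ignore references to unknown records
                graph[rid].add(target)
    return graph


def merge_concepts(records):
    """Group records linked in either direction and pool their descriptors."""
    graph = build_graph(records)
    undirected = defaultdict(set)
    for u, targets in graph.items():
        undirected[u]  # make sure isolated records still get an entry
        for v in targets:
            undirected[u].add(v)
            undirected[v].add(u)

    seen, merged = set(), []
    for start in sorted(undirected):
        if start in seen:
            continue
        # Depth-first walk collects one connected group of records.
        stack, members = [start], []
        seen.add(start)
        while stack:
            node = stack.pop()
            members.append(node)
            for nxt in undirected[node] - seen:
                seen.add(nxt)
                stack.append(nxt)
        # Pool labels and aliases, lowercased, into one merged record.
        names = set()
        for m in members:
            names.add(records[m]["label"].lower())
            names.update(a.lower() for a in records[m]["aliases"])
        merged.append({"members": sorted(members), "names": sorted(names)})
    return merged


def share_in_multiple_sources(groups):
    """Fraction of groups whose members come from 2 or more sources."""
    multi = sum(
        1 for g in groups if len({m.split(":")[0] for m in g["members"]}) >= 2
    )
    return multi / len(groups)
```

In this toy data, `merge_concepts(RECORDS)` yields two merged records, and passing the result to `share_in_multiple_sources` compares against the unmerged case, where every record stands alone in a single source.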

Original language: English
Article number: ooad093
Journal: JAMIA Open
Volume: 6
Issue number: 4
State: Published - Dec 1 2023

Keywords

  • biological ontologies
  • health information interoperability
  • knowledge bases
  • medical informatics
  • therapeutics

Title: Normalization of drug and therapeutic concepts with Thera-Py