Metabolic forest: Predicting the diverse structures of drug metabolites

Tyler B. Hughes, Na Le Dang, Ayush Kumar, Noah R. Flynn, S. Joshua Swamidass

Research output: Contribution to journal, Article, peer-review

6 Scopus citations


Adverse drug metabolism often severely impacts patient morbidity and mortality. Unfortunately, experimental drug metabolism assays are costly, inefficient, and slow. Computational modeling could instead rapidly flag potentially toxic molecules across thousands of candidates in the early stages of drug development. Most metabolism models focus on predicting sites of metabolism (SOMs): the specific substrate atoms targeted by metabolic enzymes. However, SOMs are merely a proxy for metabolic structures: knowledge of an SOM does not explicitly provide the actual metabolite structure. Without an explicit metabolite structure, computational systems cannot evaluate the new molecule’s properties. For example, the metabolite’s reactivity cannot be automatically predicted, a crucial limitation because reactive drug metabolites are a key driver of adverse drug reactions (ADRs). Additionally, further metabolic events cannot be forecast, even though the metabolic path of the majority of substrates includes two or more sequential steps. To overcome the myopia of the SOM paradigm, this study constructs a well-defined system, termed the metabolic forest, for generating exact metabolite structures. We validate the metabolic forest with the substrate and product structures from a large, chemically diverse, literature-derived dataset of 20,736 records. The metabolic forest finds a pathway linking each substrate and product for 79.42% of these records. By performing a breadth-first search of depth two or three, we improve performance to 88.43% and 88.77%, respectively. The metabolic forest includes a specialized algorithm for producing accurate quinone structures, the most common type of reactive metabolite. To our knowledge, this quinone structure algorithm is the first of its kind, as the diverse mechanisms of quinone formation are difficult to systematically reproduce. We validate the metabolic forest on a previously published dataset of 576 quinone reactions, predicting their structures with a depth-three performance of 91.84%. The metabolic forest accurately enumerates metabolite structures, enabling promising new directions such as joint metabolism and reactivity modeling.
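
The depth-limited breadth-first search mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function `find_pathway`, the `Rule` type, the string labels standing in for molecular structures, and the toy transformation table are all hypothetical and serve only to show how a search of depth two or three can link a substrate to a product that a single step cannot reach.

```python
from collections import deque
from typing import Callable, Iterable, List, Optional

# Hypothetical rule type: maps one structure (here just a string label)
# to the structures it can produce in a single metabolic step.
Rule = Callable[[str], Iterable[str]]


def find_pathway(
    substrate: str, product: str, rules: List[Rule], max_depth: int
) -> Optional[List[str]]:
    """Breadth-first search for the shortest chain of at most `max_depth`
    steps from `substrate` to `product`. Returns the chain, or None."""
    if substrate == product:
        return [substrate]
    queue = deque([(substrate, [substrate])])
    seen = {substrate}
    while queue:
        current, path = queue.popleft()
        # A path of k structures has used k - 1 steps; stop expanding at the limit.
        if len(path) - 1 >= max_depth:
            continue
        for rule in rules:
            for candidate in rule(current):
                if candidate in seen:
                    continue
                new_path = path + [candidate]
                if candidate == product:
                    return new_path
                seen.add(candidate)
                queue.append((candidate, new_path))
    return None


# Toy, made-up transformation table (assumption, for illustration only).
TOY_STEPS = {
    "parent": ["hydroxylated"],
    "hydroxylated": ["quinone", "glucuronide"],
}


def toy_rule(structure: str) -> Iterable[str]:
    return TOY_STEPS.get(structure, [])


one_step = find_pathway("parent", "quinone", [toy_rule], max_depth=1)
two_step = find_pathway("parent", "quinone", [toy_rule], max_depth=2)
```

In this toy setup the product is unreachable at depth one but found at depth two, which mirrors why allowing deeper searches can raise the fraction of substrate and product pairs that are linked.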

Original language: English
Pages (from-to): 4702-4716
Number of pages: 15
Journal: Journal of Chemical Information and Modeling
Issue number: 10
State: Published - Oct 26 2020

