Abstract
The CLC family of membrane proteins consists of chloride channels and anion/proton antiporters. How the same fold accommodates two distinct mechanisms remains poorly understood, and the small set of experimental structures provides limited insight. Here we show that it is possible to scale up CLC structural information using AlphaFold2 predictions and to combine this with an ensemble-based machine learning approach to identify subtle structural differences associated with each mechanistic class. We first carried out a phylogenetic analysis of CLC sequences to infer 569 channel and 1051 transporter homologs of eukaryotic origin that were previously unidentified. We then used distance difference matrices to examine AlphaFold2's ability to detect subtle differences among experimentally solved CLC structures, which validated the use of these models in our study. Next, we trained a random forest classifier on distance data grouped by channel versus transporter, generating a structure-based predictor of CLC subtype. Shapley value analysis was then carried out to calculate importance values, allowing identification of the changes in helix-pair distances most strongly associated with the classifier's ability to predict CLC subtype. These differences are summarized by three concerted changes found in channels: reduced distance between the dimerization interface helices and the C-terminal half of the protein, expansion of the anion transport pathway along the membrane, and insertion of an interfacial helix between αJ and αK. These changes overlap with observations from experimental structures, showing that this approach expands structural information across sequence space. This establishes a framework for large-scale structural analysis when experimental data are limited and may be useful for the study of other protein families.
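The distance difference matrix comparison mentioned in the abstract can be illustrated with a minimal sketch. It assumes each structure is reduced to one representative 3D point per element (for example a helix centroid or a Cα atom) and that the two structures are already matched element by element; the function names and toy coordinates are hypothetical and are not taken from the study.

```python
from math import dist


def distance_matrix(coords):
    """Pairwise Euclidean distances between representative points."""
    return [[dist(a, b) for b in coords] for a in coords]


def distance_difference_matrix(coords_a, coords_b):
    """Element-wise difference D_b - D_a of two intra-structure distance matrices.

    Positive entries mark pairs that are farther apart in structure B than in A.
    Because only internal distances are used, no superposition is needed.
    """
    if len(coords_a) != len(coords_b):
        raise ValueError("structures must contain the same number of matched points")
    d_a = distance_matrix(coords_a)
    d_b = distance_matrix(coords_b)
    n = len(d_a)
    return [[d_b[i][j] - d_a[i][j] for j in range(n)] for i in range(n)]


# Toy data (hypothetical): three collinear points, with the third point
# shifted 2 units farther along x in structure B.
structure_a = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (6.0, 0.0, 0.0)]
structure_b = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (8.0, 0.0, 0.0)]

ddm = distance_difference_matrix(structure_a, structure_b)
for row in ddm:
    print(row)
```

In this toy case only the pairs involving the third point have nonzero entries, which is the kind of localized signal that a downstream classifier could use as a feature.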
| Original language | English |
|---|---|
| Article number | e70389 |
| Journal | Protein Science |
| Volume | 34 |
| Issue number | 12 |
| State | Published - Dec 2025 |
Keywords
- AlphaFold
- channel
- CLC
- distance difference matrix
- evolution
- machine learning
- phylogenetic
- random forest classifier
- structure
- transporter
Article title: A large-scale evolutionary and structural analysis of CLC channels and transporters