TCR-BERT: learning the grammar of T-cell receptors for flexible antigen-binding analyses

Kevin Wu, Kathryn E. Yost, Bence Daniel, Julia A. Belk, Yu Xia, Takeshi Egawa, Ansuman Satpathy, Howard Y. Chang, James Zou

Research output: Contribution to journal › Conference article › peer-review


Abstract

The T-cell receptor (TCR) allows T-cells to recognize and respond to antigens presented by infected and diseased cells. However, due to TCRs’ staggering diversity and the complex binding dynamics underlying TCR antigen recognition, it is challenging to predict which antigens a given TCR may bind to. Here, we present TCR-BERT, a deep learning model that applies self-supervised transfer learning to this problem. TCR-BERT leverages unlabeled TCR sequences to learn a general, versatile representation of TCR sequences, enabling numerous downstream applications. TCR-BERT can be used to build state-of-the-art TCR-antigen binding predictors with improved generalizability compared to prior methods. Simultaneously, TCR-BERT’s embeddings yield clusters of TCRs likely to share antigen specificities. It also enables computational approaches to challenging, unsolved problems such as designing novel TCR sequences with engineered binding affinities. Importantly, TCR-BERT enables all these advances by focusing on residues with known biological significance.
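The abstract notes that TCR-BERT's embeddings yield clusters of TCRs likely to share antigen specificity. As an illustrative sketch only (not the paper's actual method), the snippet below uses a toy k-mer count vector as a stand-in for a learned TCR embedding and groups CDR3 sequences by cosine similarity; the function names, the 2-mer featurization, and the similarity threshold are all assumptions for demonstration.

```python
import numpy as np
from itertools import product

# Standard 20 amino acids used to build the k-mer vocabulary.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def kmer_embed(seq, k=2):
    """Toy stand-in for a learned TCR embedding: L2-normalized k-mer counts.

    A real TCR-BERT embedding would come from a transformer's hidden
    states; this is only a simple surrogate for illustration.
    """
    vocab = {"".join(p): i for i, p in enumerate(product(AMINO_ACIDS, repeat=k))}
    v = np.zeros(len(vocab))
    for i in range(len(seq) - k + 1):
        v[vocab[seq[i:i + k]]] += 1.0
    return v / max(np.linalg.norm(v), 1e-9)

def cosine_cluster(seqs, threshold=0.5):
    """Greedy single-link grouping: sequences whose embeddings have
    cosine similarity >= threshold to a cluster seed share a label."""
    embs = [kmer_embed(s) for s in seqs]
    labels = [-1] * len(seqs)
    next_label = 0
    for i in range(len(seqs)):
        if labels[i] != -1:
            continue
        labels[i] = next_label
        for j in range(i + 1, len(seqs)):
            if labels[j] == -1 and float(embs[i] @ embs[j]) >= threshold:
                labels[j] = next_label
        next_label += 1
    return labels

# Hypothetical CDR3 sequences: the first two differ by one residue,
# the third is more distant, so it lands in its own cluster.
tcrs = ["CASSLGQAYEQYF", "CASSLGQGYEQYF", "CASRRGDTQYF"]
print(cosine_cluster(tcrs))  # → [0, 0, 1]
```

In practice one would replace `kmer_embed` with embeddings extracted from the trained TCR-BERT model and use a proper clustering algorithm; the greedy threshold pass here is just the simplest way to show similar sequences collapsing into shared groups.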

Original language: English
Pages (from-to): 194-229
Number of pages: 36
Journal: Proceedings of Machine Learning Research
Volume: 240
State: Published - 2023
Event: 18th Machine Learning in Computational Biology Meeting, MLCB 2023 - Seattle, United States
Duration: Nov 30, 2023 - Dec 1, 2023
