In silico prediction of C57BL/6 mouse T-cell epitopes: enhancing immunogenicity assessment with PREDBL6

Abstract
The MHC class I antigen processing pathway involves multiple steps: 1) the proteasome selectively cleaves intracellular proteins into short peptides, typically 8–11 amino acids long; 2) the transporter associated with antigen processing (TAP) selectively transports some of these peptides into the endoplasmic reticulum (ER); 3) aminopeptidases may further trim the peptides in the ER; 4) some of the peptides bind MHC molecules; and finally, 5) peptide–MHC complexes are transported to the cell surface for recognition by CD8+ T cells. Peptides presented by MHC and recognized by T cells are called T-cell epitopes. MHC class I–restricted T-cell epitopes play a crucial role in the immune surveillance of intracellular pathogens.
Because MHC binding is considered the most selective step in T-cell recognition, many existing bioinformatics systems focus on modeling this step to predict MHC binders. However, modeling MHC binding alone is insufficient for accurate immunogenicity prediction and often produces false positives. With recent technological advances, large numbers of mass spectrometry (MS)–identified MHC class I ligands have become publicly available, making it possible to incorporate information from the antigen processing steps that precede MHC binding. We collected >5,000 binding peptides and >4,000 eluted ligands for H2-Dᵇ, and >5,000 binding peptides and >5,000 eluted ligands for H2-Kᵇ.
Thermostability assessment of MHC–peptide binding evaluates the strength and duration of the interaction between the peptide and the MHC molecule at varying temperatures. Studies have shown a positive correlation between thermostability and immunogenicity, because the stability of peptide–MHC complexes affects the efficiency of antigen presentation and the downstream activation of T cells. More thermostable peptide–MHC complexes can interact with T-cell receptors for longer, which increases the likelihood of T-cell activation.
Our collaborators at the Dana-Farber Cancer Institute performed temperature gradient experiments to investigate the stability of peptide–MHC complexes for the H2-Dᵇ and H2-Kᵇ alleles at three temperatures (37 °C, 50 °C, and 70 °C) using MS. Bound peptides were isolated by immunoprecipitation (IP). Peptides with lower binding stability tended to dissociate from the MHC molecules as the temperature increased. These experiments yielded over 3,000 H2-Dᵇ binding peptides and over 5,000 H2-Kᵇ binding peptides, enabling a comprehensive thermostability analysis of MHC binding.
In this thesis project, we developed a computational system for identifying T-cell epitopes in C57BL/6 mice by integrating relevant contributing factors, namely the antigen processing steps before MHC binding and thermostability, with MHC binding predictions. Using deep learning methods, we first trained and rigorously validated binding prediction models on naturally eluted H2-Kᵇ and H2-Dᵇ ligands collected from public resources. We then built thermostability models using proprietary data generated by our collaborators. We compared the performance of our models with that of NetMHCpan-4.1, an online prediction tool that many benchmark studies have found to be among the most accurate predictors. Our integrated model, which combines the binding and thermostability models, outperformed NetMHCpan-4.1 overall on an external validation dataset. We consolidated the models into a user-friendly web application named PREDBL6 to facilitate accurate prediction of immunogenic peptides that stably bind H2ᵇ molecules and stimulate immune responses in C57BL/6 mice.
To our knowledge, this is the first online T-cell epitope prediction system for a model organism that models MHC binding together with other antigen processing steps and thermostability. PREDBL6 is available at http://met-hilab.org:3001/tool.
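The 8–11 residue length range given in the abstract for MHC class I peptides determines which candidate peptides an epitope predictor has to score for a given protein. The following Python sketch only illustrates that enumeration step. It is not PREDBL6's implementation: the function name `enumerate_peptides` is hypothetical, and the input is a short illustrative fragment containing SIINFEKL, the well-known H2-Kᵇ-restricted ovalbumin epitope.

```python
from typing import Iterator, Tuple


def enumerate_peptides(
    sequence: str, min_len: int = 8, max_len: int = 11
) -> Iterator[Tuple[int, str]]:
    """Yield (start_index, peptide) for every window of min_len..max_len residues.

    Hypothetical helper for illustration; start_index is 0-based.
    """
    seq = sequence.strip().upper()
    for length in range(min_len, max_len + 1):
        # A sequence of n residues contains n - length + 1 windows of this length
        # (none if the sequence is shorter than the window).
        for start in range(len(seq) - length + 1):
            yield start, seq[start:start + length]


# Illustrative 16-residue fragment (an assumption, not data from the thesis).
fragment = "QLESIINFEKLTEWTS"
candidates = list(enumerate_peptides(fragment))

for start, peptide in candidates:
    if peptide == "SIINFEKL":
        print(f"Known epitope found at 0-based position {start}")
```

In a real workflow each candidate would then be passed to a binding or stability predictor. This sketch stops at enumeration because the abstract does not specify how PREDBL6 scores or combines its models.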
2024