A large peptidome dataset improves HLA class I epitope prediction across most of the human population

Date Issued
2020-02
Publisher Version
10.1038/s41587-019-0322-9
Author(s)
Sarkizova, Siranush
Klaeger, Susan
Le, Phuong M.
Li, Letitia W.
Oliveira, Giacomo
Keshishian, Hasmik
Hartigan, Christina R.
Zhang, Wandi
Braun, David A.
Ligon, Keith L.
Bachireddy, Pavan
Zervantonakis, Ioannis K.
Rosenbluth, Jennifer M.
Ouspenskaia, Tamara
Law, Travis
Justesen, Sune
Stevens, Jonathan
Lane, William J.
Eisenhaure, Thomas
Zhang, Guang Lan
Clauser, Karl R.
Hacohen, Nir
Carr, Steven A.
Wu, Catherine J.
Keskin, Derin B.
Permanent Link
https://hdl.handle.net/2144/41361
Version
Accepted manuscript
Citation (published version)
Siranush Sarkizova, Susan Klaeger, Phuong M Le, Letitia W Li, Giacomo Oliveira, Hasmik Keshishian, Christina R Hartigan, Wandi Zhang, David A Braun, Keith L Ligon, Pavan Bachireddy, Ioannis K Zervantonakis, Jennifer M Rosenbluth, Tamara Ouspenskaia, Travis Law, Sune Justesen, Jonathan Stevens, William J Lane, Thomas Eisenhaure, Guang Lan Zhang, Karl R Clauser, Nir Hacohen, Steven A Carr, Catherine J Wu, Derin B Keskin. 2020. "A large peptidome dataset improves HLA class I epitope prediction across most of the human population." Nat Biotechnol, Volume 38, Issue 2, pp. 199-209. https://doi.org/10.1038/s41587-019-0322-9
Abstract
Prediction of HLA epitopes is important for the development of cancer immunotherapies and vaccines. However, current prediction algorithms have limited predictive power, in part because they were not trained on high-quality epitope datasets covering a broad range of HLA alleles. To enable prediction of endogenous HLA class I-associated peptides across a large fraction of the human population, we used mass spectrometry to profile >185,000 peptides eluted from 95 HLA-A, -B, -C and -G mono-allelic cell lines. We identified canonical peptide motifs per HLA allele, unique and shared binding submotifs across alleles and distinct motifs associated with different peptide lengths. By integrating these data with transcript abundance and peptide processing, we developed HLAthena, providing allele-and-length-specific and pan-allele-pan-length prediction models for endogenous peptide presentation. These models predicted endogenous HLA class I-associated ligands with 1.5-fold improvement in positive predictive value compared with existing tools and correctly identified >75% of HLA-bound peptides that were observed experimentally in 11 patient-derived tumor cell lines.
Description
Published in final edited form as: Nat Biotechnol. 2020 February ; 38(2): 199–209. doi:10.1038/s41587-019-0322-9.