Algorithm design for confident assignment of glycopeptides using Target-Decoy Analysis and data-independent acquisition

Nalehua, Mary Rachel

Files

Nalehua_bu_0017E_20918.pdf (2.39 MB)

Date

2026

Authors

Nalehua, Mary Rachel

URI

https://hdl.handle.net/2144/52828

Abstract

Glycosylation is one of the most common post-translational modifications (PTMs) in eukaryotes. Glycans play a key role in many biological processes, including disease, cell-to-cell communication, and structural integrity. Glycans themselves are highly complex, consisting of branching monosaccharide structures that may attach at multiple glycosylation sites across a protein. Methods for studying the glycoproteome rely on liquid chromatography tandem mass spectrometry (LC-MS/MS). The recent proliferation of Data-Independent Acquisition (DIA) mass spectrometry has expanded our ability to identify glycopeptides by fragmenting an entire sample, but it drastically increases complexity in what was already a complex assignment space. The branching structure of glycans and the overlaps in glycopeptide structures require specialized solutions beyond what currently exists for proteomics or small PTMs. Additionally, most algorithms rely on Target-Decoy Analysis (TDA) for error estimation, but while scoring methods have evolved, TDA has not evolved with them. There is a need for glycoproteomics solutions that address the specific challenges of glycopeptide assignment, error calculation, and DIA. In this work, I design an algorithm to assign glycopeptides from DIA glycoproteomics data in a manner that is suitable for a variety of initial settings, including DIA window width. This algorithm builds on existing scoring methods and uses a glycan-permutation decoy method that improves the accuracy of error estimation. I additionally analyze the effect of window width on the ability to assign glycopeptides from DIA data and recommend settings for DIA glycopeptide acquisitions. Finally, I evaluate three major assignment algorithms for their compliance with TDA assumptions and demonstrate that TDA struggles to correctly estimate error rates for glycoproteomics data.

Description

2026

License

Attribution-ShareAlike 4.0 International

Collections

Boston University Theses & Dissertations