The fundamentals of genome-scale metabolic models and their application to the study of evolution and cancer

Abstract
Hundreds to thousands of distinct metabolic reactions occur in all cells, forming a densely interconnected metabolic network that interconverts a similarly large number of metabolites. Genome-Scale Metabolic Models (GSMMs) encode all existing knowledge about the structures of these metabolic networks, the enzymes responsible for catalyzing their reactions, the genes that encode those enzymes, and the metabolites that they interact with. Integration of different forms of high-throughput data within a single GSMM has facilitated numerous biological insights, ranging from strategies for engineering the metabolisms of microbes to produce commercially and/or medically valuable compounds to the identification of novel drug targets for cancer, diabetes, inborn errors of metabolism, and infectious diseases, among others. Due to the complexity of cellular metabolic networks and the limited availability of relevant experimental data, the predictive utility of GSMMs is often limited by missing or inaccurate reactions. Furthermore, common approaches to predicting metabolic fluxes from GSMMs often focus on identifying a single optimal flux state, which frequently leads to inaccurate predictions for specific cell types or disease states where biologically plausible metabolic optima are unknown or challenging to formally define. This dissertation addresses several limitations of existing approaches to creating and using GSMMs, with particular emphasis on the following challenges: (i) testing for the presence of reactions that can sustain unrealistically high fluxes, duplicate reactions, and missing or misannotated reactions; and (ii) predicting biologically and statistically sound distributions of steady-state fluxes through GSMMs, including methods that incorporate transcriptomics and/or proteomics data from particular conditions, which are especially relevant for the development of tissue-, disease-, and patient-specific GSMMs.
In addition, I extend the techniques for predicting fluxes through GSMMs to artificial chemistry networks, which are abstract models of simplified chemical reaction networks that have been used to study the general behavior of such networks while avoiding the incompleteness of our understanding of real biochemistry. Specifically, I use these artificial chemistry networks to study general principles governing the evolution of the structures of metabolic networks, and demonstrate the importance of the biomass composition in determining intracellular network architecture. Throughout the dissertation, I present multiple tools and recommendations for improving the predictive quality of GSMMs and demonstrate their utility by correcting several hundred errors in the most recent GSMM of generic human cells, with potentially broad implications for the field of metabolic modeling and its applications.
Description
2025
License
Attribution-NonCommercial-ShareAlike 4.0 International