Development of advanced methods for large-scale transcriptomic profiling and application to screening of metabolism disrupting compounds
Reed, Eric R.
High-throughput transcriptomic profiling has become a ubiquitous tool to assay an organism's transcriptome and to characterize gene expression patterns in different cellular states or disease conditions, as well as in response to molecular and pharmacologic perturbations. Refinements to data preparation techniques have enabled integration of transcriptomic profiling into large-scale biomedical studies, generally devised to elucidate phenotypic factors contributing to transcriptional differences across a cohort of interest. Understanding these factors and the mechanisms through which they contribute to disease is a principal objective of numerous projects, such as The Cancer Genome Atlas and the Cancer Cell Line Encyclopedia. Additionally, transcriptomic profiling has been applied in toxicogenomic screening studies, which profile molecular responses to chemical perturbations in order to identify environmental toxicants and characterize their mechanisms of action. Further adoption of high-throughput transcriptomic profiling requires continued effort to improve implementation and lower its costs. Accordingly, my dissertation work encompasses the development and assessment of both cost-effective RNA sequencing platforms and novel machine learning techniques applicable to the analysis of large-scale transcriptomic data sets. The utility of these techniques is evaluated through their application to a toxicogenomic screen in which our lab profiled exposures of adipocytes to metabolism-disrupting chemicals. Such exposures have been implicated in metabolic dyshomeostasis, the predominant cause of obesity pathogenesis. Considering that an estimated 10% of the global population is obese, understanding the role these exposures play in disrupting metabolic balance has the potential to help combat this pervasive health threat. This dissertation consists of three sections.
In the first section, I assess data generated by a highly multiplexed RNA sequencing platform developed by our section, and report that its quality is significantly better than that of similar platforms and comparable to that of more expensive platforms. Next, I present the analysis of a toxicogenomic screen of metabolism-disrupting compounds. This analysis relied crucially on novel supervised and unsupervised machine learning techniques, which I developed specifically to take advantage of the experimental design we adopted for data generation. Lastly, I describe the further development, evaluation, and optimization of one of these methods, K2Taxonomer, into a computational tool for unsupervised molecular subgrouping of bulk and single-cell gene expression data, and for the comprehensive in-silico annotation of the discovered subgroups.
Rights: Attribution-NonCommercial-NoDerivatives 4.0 International