Sequence- and structure-based approaches to deciphering enzyme evolution in the Haloalkanoate Dehalogenase superfamily
Understanding how changes in the functional requirements of the cell select for changes in protein sequence and structure is a fundamental challenge in molecular evolution. This dissertation delineates some of the underlying evolutionary forces using the Haloalkanoate Dehalogenase Superfamily (HADSF) as a model system. HADSF members have a unique cap-core architecture in which the Rossmann-fold core domain is accessorized by variable cap domain insertions (distinguished by length, topology, and point of insertion). To identify the boundaries of variable domain insertions in protein sequences, I have developed a comprehensive computational strategy (CapPredictor, or CP) that uses a novel sequence alignment algorithm in conjunction with a structure-guided sequence profile. Analysis of more than 40,000 HADSF sequences led to the following observations: (i) cap-type classes exhibit similar distributions across different phyla, indicating the existence of all cap types in the last universal common ancestor, and (ii) cap type does not dictate the divergence of substrate recognition and chemical pathway, and hence biological function, as indicated by comparative analysis of predicted cap type and functional diversity.
By analyzing a unique dataset of core- and cap-domain-only protein structures, I investigated the consequences of the accessory cap domain on the sequence-structure relationship of the core domain. The relationship between sequence and structure divergence in the core fold was shown to be monotonic and independent of the corresponding cap type. However, core domains with the same cap type were more similar to one another than core domains with different cap types, suggesting coevolution of the cap and core domains. Remarkably, only a few degrees of freedom are needed to describe the structural diversity in the Rossmann fold, and these account for the majority of the observed structural variance.
Finally, I examined the location and role of conserved residue positions and co-evolving residue pairs in the core domain in the context of the cap domain. Positions critical for function were conserved, while non-conserved positions mapped to highly mobile regions. Notably, I found an exponential dependence of covariance on inter-residue distance. Collectively, these novel algorithms and analyses contribute to an improved understanding of enzyme evolution, especially in the context of the use of domain insertions to expand substrate specificity and chemical mechanism.
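As a minimal illustration of what an exponential dependence of covariance on inter-residue distance means in practice, such a relationship can be checked by fitting a straight line to ln(covariance) against distance. The sketch below is not the method used in the dissertation: the functional form C(d) = A * exp(k * d), the function name `fit_exponential`, and the synthetic data are all assumptions made purely for illustration.

```python
import math


def fit_exponential(distances, covariances):
    """Fit C(d) = A * exp(k * d) by least squares on ln(C) versus d.

    Hypothetical helper for illustration; returns (A, k).
    Only strictly positive covariance values are used, since ln is
    undefined otherwise.
    """
    pairs = [(d, math.log(c)) for d, c in zip(distances, covariances) if c > 0]
    n = len(pairs)
    if n < 2:
        raise ValueError("need at least two positive covariance values")

    mean_d = sum(d for d, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxx = sum((d - mean_d) ** 2 for d, _ in pairs)
    if sxx == 0:
        raise ValueError("distances must not all be identical")
    sxy = sum((d - mean_d) * (y - mean_y) for d, y in pairs)

    k = sxy / sxx                         # slope of the log-linear fit
    a = math.exp(mean_y - k * mean_d)     # intercept, mapped back from log space
    return a, k


# Synthetic example data (NOT results from the dissertation):
# covariance assumed to decay with distance (in angstroms).
distances = [4.0, 6.0, 8.0, 10.0, 12.0, 16.0, 20.0]
covariances = [0.9 * math.exp(-d / 5.0) for d in distances]

amplitude, rate = fit_exponential(distances, covariances)
print(f"A = {amplitude:.3f}, k = {rate:.3f} per angstrom")
```

A negative fitted `k` corresponds to covariance that decays with distance, and `-1 / k` then gives a characteristic length scale. With real data, a log-linear fit weights small covariance values heavily, so a nonlinear fit may be preferable.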