Optimization methods for side-chain positioning and macromolecular docking
MetadataShow full item record
This dissertation proposes new optimization algorithms targeting protein-protein docking which is an important class of problems in computational structural biology. The ultimate goal of docking methods is to predict the 3-dimensional structure of a stable protein-protein complex. We study two specific problems encountered in predictive docking of proteins. The first problem is Side-Chain Positioning (SCP), a central component of homology modeling and computational protein docking methods. We formulate SCP as a Maximum Weighted Independent Set (MWIS) problem on an appropriately constructed graph. Our formulation also considers the significant special structure of proteins that SCP exhibits for docking. We develop an approximate algorithm that solves a relaxation of MWIS and employ randomized estimation heuristics to obtain high-quality feasible solutions to the problem. The algorithm is fully distributed and can be implemented on multi-processor architectures. Our computational results on a benchmark set of protein complexes show that the accuracy of our approximate MWIS-based algorithm predictions is comparable with the results achieved by a state-of-the-art method that finds an exact solution to SCP. The second problem we target in this work is protein docking refinement. We propose two different methods to solve the refinement problem. The first approach is based on a Monte Carlo-Minimization (MCM) search to optimize rigid-body and side-chain conformations for binding. In particular, we study the impact of optimally positioning the side-chains in the interface region between two proteins in the process of binding. We report computational results showing that incorporating side-chain flexibility in docking provides substantial improvement in the quality of docked predictions compared to the rigid-body approaches. Further, we demonstrate that the inclusion of unbound side-chain conformers in the side-chain search introduces significant improvement in the performance of the docking refinement protocols. In the second approach, we propose a novel stochastic optimization algorithm based on Subspace Semi-Definite programming-based Underestimation (SSDU), which aims to solve protein docking and protein structure prediction. SSDU is based on underestimating the binding energy function in a permissive subspace of the space of rigid-body motions. We apply Principal Component Analysis (PCA) to determine the permissive subspace and reduce the dimensionality of the conformational search space. We consider the general class of convex polynomial underestimators, and formulate the problem of finding such underestimators as a Semi-Definite Programming (SDP) problem. Using these underestimators, we perform a biased sampling in the vicinity of the conformational regions where the energy function is at its global minimum. Moreover, we develop an exploration procedure based on density-based clustering to detect the near-native regions even when there are many local minima residing far from each other. We also incorporate a Model Selection procedure into SSDU to pick a predictive conformation. Testing our algorithm over a benchmark of protein complexes indicates that SSDU substantially improves the quality of docking refinement compared with existing methods.