School of Pharmacy, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599
Methods of computational geometry provide a robust and effective approach to studying the topology and architecture of proteins. In this approach, a protein is represented by a set of points in three-dimensional space, where each point designates an amino acid residue. The Delaunay tessellation of this set of points generates an aggregate of space-filling irregular tetrahedra, or Delaunay simplices. The vertices of each simplex objectively define four nearest-neighbor residues, and the collection of all simplices describes the topology of a protein structure. Statistical analysis of the geometrical and compositional properties of the Delaunay simplices provides a quantitative description of nonlocal contacts in three-dimensional protein structures. Analysis of the patterns of spatial proximity of residues in known protein structures, based on the Delaunay tessellation, reveals highly nonrandom clustering of amino acids. The relative abundance or deficiency of residue quadruplets with certain compositions reflects the propensities of different residue types to be associated or disassociated in folded proteins. The likelihood of occurrence of four residues in one simplex also displays a strong nonrandom signal when reduced amino acid alphabets are used. We used several reduced alphabets, based either on the chemical and structural properties of the residues or on the complementarity of the corresponding codons. In both cases, the clustering of residues correlates with their properties or genetic origin. The results of this analysis are being implemented in algorithms for protein structure classification and prediction.
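The relative abundance or deficiency of residue quadruplets can be illustrated with a short sketch. It is not the exact statistic used in this work, which the abstract does not specify. The function name `quadruplet_log_odds` is hypothetical, and the null model (four residues drawn independently from the overall residue-type frequencies, with a multinomial factor for the number of distinct orderings) is an assumption made for illustration. The simplices are taken as given, as quadruples of residue indices; they could come from any Delaunay routine, for example the `simplices` attribute of `scipy.spatial.Delaunay` applied to the residue coordinates.

```python
from collections import Counter
from math import factorial, log


def quadruplet_log_odds(simplices, residues):
    """Log ratio of observed to expected frequency for each quadruplet composition.

    simplices: iterable of 4-tuples of residue indices (one tuple per Delaunay simplex).
    residues:  sequence of residue-type labels, indexed by residue number.
    The expected frequency uses an assumed null model of independent draws.
    """
    # Composition of a simplex is order-independent, so sort the four labels.
    compositions = Counter(
        tuple(sorted(residues[i] for i in simplex)) for simplex in simplices
    )
    total = sum(compositions.values())

    # Background frequency of each residue type in the structure.
    n = len(residues)
    freq = {t: c / n for t, c in Counter(residues).items()}

    scores = {}
    for comp, count in compositions.items():
        observed = count / total
        # Number of distinct orderings of the four labels: 4! / prod(m_t!).
        orderings = factorial(4)
        for multiplicity in Counter(comp).values():
            orderings //= factorial(multiplicity)
        expected = float(orderings)
        for t in comp:
            expected *= freq[t]
        # Positive score: over-represented; negative: under-represented.
        scores[comp] = log(observed / expected)
    return scores
```

Only compositions that occur in at least one simplex receive a score. A reduced alphabet can be applied by mapping each label in `residues` to its class before calling the function.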