
Protein Shapes

The tertiary structure of a protein specifies the location of each carbon atom along its backbone.

The secondary structure (helices and sheets) captures some notion of shape, but it does not suffice to accurately predict whether two proteins bind or dock together.

The problem of predicting protein interactions arises critically when searching databases for potential drugs (rational drug design).

In protein docking, we seek to (1) predict the binding between two different proteins, or a protein and a flexible ligand, and (2) identify the orientation maximizing the interaction.

Protein Representations

A variety of different representations can be used for geometric protein structures:

All are somewhat of a fiction since molecules vibrate, move, and bend.

This flexibility limits our ability to use standard geometric algorithms and concepts.

Geometric Notions of Shape

The notion of the shape of a set of points is inherently difficult to define.

The convex hull of a set of points is the smallest convex polygon which contains all of them.



[Figure: figures/hulla-1.eps]


The convex hull fails to pick up the cavities and protrusions which inherently make shapes interesting.
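To make this limitation concrete, here is a minimal sketch of a planar convex hull computation. It uses Andrew's monotone chain method, which is one standard choice and not necessarily the one intended by these notes. The function names and the U-shaped test outline are illustrative assumptions.

```python
def cross(o, a, b):
    """Twice the signed area of triangle o-a-b; positive for a left turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Return the hull vertices in counterclockwise order (monotone chain)."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        # Drop points that would make a right turn or lie on a straight line.
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other.
    return lower[:-1] + upper[:-1]


# Hypothetical U-shaped outline with a notch cut into its top side.
u_shape = [(0, 0), (3, 0), (3, 3), (2, 3), (2, 1), (1, 1), (1, 3), (0, 3)]
hull = convex_hull(u_shape)
print(hull)
```

The hull of this outline keeps only the four outer corners, so the notch (the cavity) disappears entirely from the description of the shape.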

Structures based on connecting points to their nearest neighbors can recover the shape if the points have been sampled densely enough.

The alpha-hull is a generalization of the convex hull in which the shape is defined by spheres of radius $\alpha$, for some given value of $\alpha$.

An edge (face) between two (three) points is alpha-exposed if there is a sphere of radius alpha which contacts these points and contains no other points of the set in its interior.



[Figures: figures/hulla-2.eps, figures/hulla-3.eps]


As alpha decreases, concavities get cut out from the convex hull.
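The alpha-exposed test can be sketched directly from its definition in the plane, where the spheres become circles. The code below is a brute-force illustration (roughly cubic time), not the efficient construction; the function name, the tolerance `eps`, and the sample points are assumptions made for this example.

```python
import math
from itertools import combinations


def alpha_exposed_edges(points, alpha, eps=1e-9):
    """Return the pairs (p, q) that are alpha-exposed among 2D points."""
    edges = []
    for p, q in combinations(points, 2):
        d = math.dist(p, q)
        if d == 0 or d > 2 * alpha:
            continue  # no circle of radius alpha touches both points
        mx, my = (p[0] + q[0]) / 2, (p[1] + q[1]) / 2
        # Distance from the midpoint of pq to either candidate circle center.
        h = math.sqrt(max(alpha * alpha - (d / 2) ** 2, 0.0))
        nx, ny = -(q[1] - p[1]) / d, (q[0] - p[0]) / d  # unit normal to pq
        for sign in (1, -1):
            center = (mx + sign * h * nx, my + sign * h * ny)
            others = (r for r in points if r != p and r != q)
            # Exposed if some circle through p and q has an empty interior.
            if all(math.dist(center, r) >= alpha - eps for r in others):
                edges.append((p, q))
                break
    return edges


# Hypothetical data: a square with one extra point just below its top side.
pts = [(0, 0), (4, 0), (4, 4), (0, 4), (2, 3)]
print(len(alpha_exposed_edges(pts, 100.0)))
print(len(alpha_exposed_edges(pts, 2.2)))
```

With a very large radius the exposed edges are the four sides of the square, matching the convex hull. At radius 2.2 the top side is no longer exposed and is replaced by the two edges through (2, 3), so the dent is cut out of the hull.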

The theory gives you little insight into which value of $\alpha$ defines your shape, except by trial and error.

Alpha Shape Examples

Two different alpha-shapes of Gramicidin A, the latter highlighting the tunnel through the molecule:



[Figures: figures/gramacidin1.eps, figures/gramacidin2.eps]


Myoglobin molecule with heme binding pocket:



[Figures: figures/myolin.eps, figures/myopocket.eps]


HIV protease with inhibitor binding site:



[Figures: figures/hivmolecule.eps, figures/hivpocket.eps]


The entire spectrum of alpha-hulls can be constructed in $O(n \log n)$ time in the plane, the same as for convex hulls.

Protein-Protein Docking

Typically, both proteins are modeled as rigid bodies, with the geometry used to constrain the possible sites of interaction.

Energy computations are performed at geometrically possible binding sites.

The "right" way to solve such problems is to construct the six-dimensional configuration space of allowable positions of the second protein, and perform energy calculations at vertices/edges of the allowable region.

Protein-Ligand Docking

Modeling the interactions between a rigid protein and a small but flexible ligand is more complicated, since every hinge in the ligand increases the dimensionality of the problem.

Rough geometric interactions with parts of a ligand can be used to predict possible binding sites, but detailed energy calculations are needed to make precise predictions.

Docking Criteria

Preliminary screenings of possible docking sites can be based on maximizing the number of contact pairs or minimizing RMS distance.
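Both screening scores are simple to compute once atoms are treated as points. The sketch below is illustrative only: the 4.0 distance cutoff for a contact, the function names, and the assumption that the two coordinate lists passed to `rmsd` are already in one-to-one correspondence are all choices made for this example, not values taken from these notes.

```python
import math


def count_contact_pairs(atoms_a, atoms_b, cutoff=4.0):
    """Count pairs (a, b) whose centers lie within `cutoff` of each other."""
    return sum(
        1 for a in atoms_a for b in atoms_b if math.dist(a, b) <= cutoff
    )


def rmsd(coords_a, coords_b):
    """Root-mean-square distance between corresponding points."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty lists of equal length")
    total = sum(math.dist(a, b) ** 2 for a, b in zip(coords_a, coords_b))
    return math.sqrt(total / len(coords_a))


# Hypothetical coordinates, in arbitrary length units.
receptor = [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0)]
ligand = [(0.0, 0.0, 3.0), (5.0, 0.0, 3.5), (20.0, 0.0, 0.0)]
print(count_contact_pairs(receptor, ligand))
print(rmsd([(0, 0, 0), (1, 0, 0)], [(0, 0, 2), (1, 0, 2)]))
```

A candidate placement with more contact pairs, or with a smaller RMS distance to a reference placement, would be ranked higher in such a screening.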

The docking problem is not purely geometric, since attractive/repulsive forces have strong effects.

The best docking seeks to maximize the contact surface area and attractive forces while minimizing the energy loss due to solvent interaction.

Small ligands tend to bind in big pockets.

Motion Planning

Finding the best docking is a difficult algorithmic problem because it involves six degrees of freedom: the three possible translations (x, y, and z) and the three possible rotations.
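A rigid placement can therefore be written as six numbers. The sketch below applies such a pose to a list of atom coordinates; the pose layout `(tx, ty, tz, rx, ry, rz)` and the choice to rotate about x, then y, then z are assumptions made for illustration, since many other rotation conventions are in use.

```python
import math


def rotation_matrix(rx, ry, rz):
    """Rotation about x, then y, then z (angles in radians): R = Rz Ry Rx."""
    cx, sx = math.cos(rx), math.sin(rx)
    cy, sy = math.cos(ry), math.sin(ry)
    cz, sz = math.cos(rz), math.sin(rz)
    return [
        [cz * cy, cz * sy * sx - sz * cx, cz * sy * cx + sz * sx],
        [sz * cy, sz * sy * sx + cz * cx, sz * sy * cx - cz * sx],
        [-sy, cy * sx, cy * cx],
    ]


def apply_pose(points, pose):
    """Rotate each point about the origin, then translate it."""
    tx, ty, tz, rx, ry, rz = pose
    r = rotation_matrix(rx, ry, rz)
    return [
        (
            r[0][0] * x + r[0][1] * y + r[0][2] * z + tx,
            r[1][0] * x + r[1][1] * y + r[1][2] * z + ty,
            r[2][0] * x + r[2][1] * y + r[2][2] * z + tz,
        )
        for x, y, z in points
    ]


# Quarter turn about z followed by a shift of (1, 2, 3).
moved = apply_pose([(1.0, 0.0, 0.0)], (1.0, 2.0, 3.0, 0.0, 0.0, math.pi / 2))
print(moved)
```

A docking search then amounts to scoring the moved coordinates over many candidate poses, which is why even a coarse sampling of each of the six parameters produces a very large number of placements to evaluate.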

Binding flexible ligands is analogous to motion planning for articulated robot arms.



[Figures: figures/motion-planning-L.eps, figures/motion-planning-R.eps]


Motion planning with many degrees of freedom becomes difficult as the complexity of the surfaces defining the configuration space grows.

A good general approach is to randomly sample points in the configuration space, and add edges between nearby collision-free points with collision-free straight line paths.
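This sampling strategy is commonly known as a probabilistic roadmap. The sketch below shows it on a toy problem: a two-dimensional unit square with circular obstacles standing in for the real, much higher-dimensional configuration space. The function names, the fixed random seed, and the straight-line check by sampling 20 intermediate points are simplifying assumptions for illustration.

```python
import math
import random


def collision_free(p, obstacles):
    """True if point p lies outside every (center, radius) obstacle."""
    return all(math.dist(p, center) > radius for center, radius in obstacles)


def segment_free(p, q, obstacles, steps=20):
    """Approximate check: test evenly spaced points along segment pq."""
    return all(
        collision_free(
            (p[0] + (q[0] - p[0]) * t / steps, p[1] + (q[1] - p[1]) * t / steps),
            obstacles,
        )
        for t in range(steps + 1)
    )


def build_roadmap(n_samples, radius, obstacles, seed=0):
    """Sample free configurations and link nearby ones by free segments."""
    rng = random.Random(seed)
    nodes = []
    while len(nodes) < n_samples:
        p = (rng.random(), rng.random())
        if collision_free(p, obstacles):
            nodes.append(p)
    edges = [
        (i, j)
        for i in range(len(nodes))
        for j in range(i + 1, len(nodes))
        if math.dist(nodes[i], nodes[j]) <= radius
        and segment_free(nodes[i], nodes[j], obstacles)
    ]
    return nodes, edges


obstacles = [((0.5, 0.5), 0.2)]  # one hypothetical circular obstacle
nodes, edges = build_roadmap(50, 0.25, obstacles)
print(len(nodes), len(edges))
```

A path between two configurations can then be found by connecting each to the roadmap and running an ordinary graph search over the edges.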

Heuristic Approaches

One approach to simplifying continuous geometric problems is to insist that all sites lie on a 3D grid.

The finer the grid, the more accurate the predictions, though at greater computational cost.
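As a small illustration of this trade-off, the sketch below snaps atom centers to the cells of a cubic grid. The function name, the sample coordinates, and the decision to record only the cell containing each atom center (rather than every cell an atom's volume overlaps) are assumptions made to keep the example short.

```python
import math


def occupied_cells(atoms, spacing):
    """Map each atom center to the integer index of its grid cell."""
    return {tuple(math.floor(c / spacing) for c in atom) for atom in atoms}


# Hypothetical atom centers, in arbitrary length units.
atoms = [(0.2, 0.2, 0.2), (0.7, 0.2, 0.2), (3.1, 0.0, 0.0)]
coarse = occupied_cells(atoms, 1.0)
fine = occupied_cells(atoms, 0.5)
print(len(coarse), len(fine))
```

On the coarse grid the first two atoms fall into the same cell and become indistinguishable; halving the spacing separates them, but a grid covering the same volume then has eight times as many cells to store and search.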

Another approach to discretization is to analyze the possible positions of isolated spheres which contact the surface, with pockets identified where there are many intersecting spheres.

Geometric hashing stores all possible point triplets (triangles) in both ligand and receptor.

The sets of triangles which match define molecular orientations of interest.

Note that conventional hashing techniques do not really apply, since we are looking for approximate matches.
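One common way to obtain approximate matches is to quantize each triangle's side lengths into bins and use the binned lengths as the table key, so that triangles of nearly equal shape collide on purpose. The sketch below follows that idea; the bin width of 0.5, the function names, and the sample coordinates are assumptions for illustration. It also ignores the case of side lengths that straddle a bin boundary, which a fuller version would handle by probing neighboring bins.

```python
import math
from collections import defaultdict
from itertools import combinations


def triangle_key(a, b, c, bin_width=0.5):
    """Sorted side lengths, each quantized to a bin index."""
    sides = sorted((math.dist(a, b), math.dist(b, c), math.dist(a, c)))
    return tuple(int(s // bin_width) for s in sides)


def build_table(points, bin_width=0.5):
    """Store every point triplet (as an index triple) under its shape key."""
    table = defaultdict(list)
    for tri in combinations(range(len(points)), 3):
        pts = [points[i] for i in tri]
        table[triangle_key(*pts, bin_width=bin_width)].append(tri)
    return table


def matching_triangles(receptor, ligand, bin_width=0.5):
    """Return (receptor triple, ligand triple) pairs with equal keys."""
    table = build_table(receptor, bin_width)
    matches = []
    for tri in combinations(range(len(ligand)), 3):
        pts = [ligand[i] for i in tri]
        key = triangle_key(*pts, bin_width=bin_width)
        matches.extend((rec_tri, tri) for rec_tri in table.get(key, []))
    return matches


# Hypothetical points: the ligand is a shifted, slightly distorted copy of
# the 3-4-5 triangle formed by the first three receptor points.
receptor = [(0, 0, 0), (3, 0, 0), (0, 4, 0), (10, 10, 10)]
ligand = [(1, 1, 1), (4.1, 1, 1), (1, 5.1, 1)]
print(matching_triangles(receptor, ligand))
```

Because the key depends only on side lengths, it is unchanged by rotation and translation; each matched pair of triangles then suggests a rigid motion aligning the two molecules, which can be checked in more detail.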



 
Steve Skiena
2000-11-21