From the Delaunay triangulation, the α-shape is computed by removing all edges, triangles, and tetrahedra whose circumscribing spheres have a radius greater than α. Formally, the α-complex is the part of the Delaunay triangulation that remains after this removal, and the α-shape is the boundary of the α-complex.
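The filtering step can be sketched in a few lines of Python. This is only an illustration of the simplified radius criterion stated above, not a full α-shape implementation: the function names `circumradius` and `alpha_complex` are hypothetical, the tetrahedra are assumed to come from some Delaunay routine that is not shown, and degenerate (flat) simplices are not handled.

```python
import math
from itertools import combinations

def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def circumradius(pts):
    """Radius of the smallest sphere through 2, 3, or 4 points in 3D."""
    p0 = pts[0]
    if len(pts) == 2:                      # edge: half its length
        d = sub(pts[1], p0)
        return math.sqrt(dot(d, d)) / 2.0
    u, v = sub(pts[1], p0), sub(pts[2], p0)
    if len(pts) == 3:                      # triangle: R = abc / (4 * area)
        w = sub(pts[2], pts[1])
        n = cross(u, v)                    # |n| is twice the triangle area
        abc = math.sqrt(dot(u, u) * dot(v, v) * dot(w, w))
        return abc / (2.0 * math.sqrt(dot(n, n)))
    w = sub(pts[3], p0)                    # tetrahedron: circumcenter relative to p0
    det = dot(u, cross(v, w))              # zero for a flat tetrahedron (not handled)
    terms = (cross(v, w), cross(w, u), cross(u, v))
    scales = (dot(u, u), dot(v, v), dot(w, w))
    center = tuple(sum(s * t[i] for s, t in zip(scales, terms)) / (2.0 * det)
                   for i in range(3))
    return math.sqrt(dot(center, center))

def alpha_complex(points, tetrahedra, alpha):
    """Keep edges (2), triangles (3), and tetrahedra (4) with radius <= alpha."""
    kept = {2: set(), 3: set(), 4: set()}
    for tet in tetrahedra:
        for k in (2, 3, 4):
            for simplex in combinations(sorted(tet), k):
                if circumradius([points[i] for i in simplex]) <= alpha:
                    kept[k].add(simplex)
    return kept

# One tetrahedron (assumed to be a Delaunay cell) at two values of alpha.
points = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1)]
small = alpha_complex(points, [(0, 1, 2, 3)], alpha=0.6)
large = alpha_complex(points, [(0, 1, 2, 3)], alpha=1.0)
print(len(small[2]), len(small[3]), len(small[4]))
print(len(large[2]), len(large[3]), len(large[4]))
```

With the small α only the three unit-length edges survive; with the larger α the whole tetrahedron and all of its faces are kept.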
Pockets can be detected by comparing the α-shape to the whole Delaunay triangulation. Missing tetrahedra represent indentations, concavity, and generally negative space in the overall volume occupied by the protein. Particularly large or deep pockets may indicate a substrate binding site.
Regular α-shapes can be extended to deal with varying weights (i.e., spheres with different radii, such as different types of atoms). The formal definitions become complicated, but the key idea is to use a pseudo distance measure that incorporates the weights. Suppose we have two atoms at positions p1 and p2 with weights w1 and w2. Then the pseudo distance is defined as the square of the Euclidean distance between p1 and p2 minus the two weights: |p1 − p2|² − w1 − w2. The pseudo distance is zero if and only if two spheres centered at p1 and p2 with radii equal to sqrt(w1) and sqrt(w2) intersect at right angles, that is, when the squared distance between the centers equals the sum of the squared radii. It is positive when the spheres are farther apart than that and negative when they overlap more deeply.
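As a minimal sketch, the pseudo distance can be computed directly from the definition in the text (squared Euclidean distance minus both weights). The function name `pseudo_distance` and the sample coordinates are illustrative assumptions.

```python
def pseudo_distance(p1, w1, p2, w2):
    """Squared Euclidean distance between p1 and p2 minus the two weights."""
    squared = sum((a - b) ** 2 for a, b in zip(p1, p2))
    return squared - w1 - w2

# Two atoms with radii 3 and 4 (weights 9 and 16) whose centers are 5 apart:
# 25 - 9 - 16 = 0.
print(pseudo_distance((0, 0, 0), 9, (5, 0, 0), 16))

# With zero weights the measure reduces to the squared Euclidean distance.
print(pseudo_distance((0, 0, 0), 0, (3, 4, 0), 0))
```

Using weights equal to squared radii keeps the whole computation free of square roots.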
The volume of a molecule can be approximated using the space-filling model, in which each atom is modeled as a ball whose radius is α, where α is selected depending on the model being used: Van der Waals surface, molecular surface, solvent accessible surface, etc. Unfortunately, calculating the volume is not as simple as taking the sum of the ball volumes, because the balls may overlap. If two balls overlap, the volume is the sum of their volumes minus the volume of the overlap, which was counted twice. If three overlap, the volume is the sum of the ball volumes, minus the volume of each pairwise overlap, plus the volume of the three-way overlap, which was subtracted one too many times in accounting for the pairwise overlaps. In the general case, all pairwise, three-way, four-way, and so on up to n-way intersections (assuming there are n atoms) must be considered, and the number of candidate intersections (2^n − 1 non-empty subsets of n balls) grows exponentially with n. Proteins generally have thousands or tens of thousands of atoms, so the general n-way case is computationally expensive and may introduce numerical error.
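The two-ball case can be worked through concretely. The sketch below uses the standard closed-form volume of the lens-shaped intersection of two spheres; the function names are illustrative, and the radii and distance in the example are arbitrary.

```python
import math

def ball_volume(r):
    return 4.0 / 3.0 * math.pi * r ** 3

def overlap_volume(r1, r2, d):
    """Volume shared by two balls of radii r1 and r2 whose centers are d apart."""
    if d >= r1 + r2:                 # disjoint or touching at a single point
        return 0.0
    if d <= abs(r1 - r2):            # the smaller ball lies inside the larger one
        return ball_volume(min(r1, r2))
    return (math.pi * (r1 + r2 - d) ** 2
            * (d * d + 2.0 * d * (r1 + r2) - 3.0 * (r1 - r2) ** 2)
            / (12.0 * d))

def union_volume_two(r1, r2, d):
    """Inclusion-exclusion for two balls: V1 + V2 - V12."""
    return ball_volume(r1) + ball_volume(r2) - overlap_volume(r1, r2, d)

# Two unit balls with centers one unit apart.
naive = 2 * ball_volume(1.0)
exact = union_volume_two(1.0, 1.0, 1.0)
print(naive, exact)
```

The naive sum overestimates the volume by exactly the lens volume, which for this example is 5π/12.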
α-shapes provide a way around this undesirable combinatorial complexity, and this issue has been one of the motivating factors for introducing α-shapes. To calculate the volume of a protein, we take the sum of all ball volumes, then subtract only those pairwise intersections for which a corresponding edge exists in the α-complex. Only those three-way intersections for which the corresponding triangle is in the α-complex must then be added back. Finally, only four-way intersections corresponding to tetrahedra in the α-complex need to be subtracted. No higher-order intersections are necessary, and the number of volume calculations necessary corresponds directly to the complexity of the α-complex, which is O(n log n) in the number of atoms.
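The alternating sum described above can be sketched as follows. This is a skeleton under stated assumptions rather than a complete implementation: the edges, triangles, and tetrahedra of the α-complex are assumed to be supplied by another routine, `restricted_volume` and `overlap` are hypothetical names, and the example `overlap` callback only knows how to intersect one or two balls, which is enough for a complex without triangles.

```python
import math

def restricted_volume(balls, edges, triangles, tetrahedra, overlap):
    """Sum of ball volumes, minus overlaps for edges, plus overlaps for
    triangles, minus overlaps for tetrahedra of the alpha-complex.
    `overlap(group)` returns the volume common to all balls in `group`."""
    pick = lambda simplex: [balls[i] for i in simplex]
    total = sum(overlap([b]) for b in balls)
    total -= sum(overlap(pick(e)) for e in edges)
    total += sum(overlap(pick(t)) for t in triangles)
    total -= sum(overlap(pick(t)) for t in tetrahedra)
    return total

def overlap(group):
    """Common volume of one or two balls, each given as (center, radius)."""
    if len(group) == 1:
        return 4.0 / 3.0 * math.pi * group[0][1] ** 3
    if len(group) == 2:
        (c1, r1), (c2, r2) = group
        d = math.dist(c1, c2)
        if d >= r1 + r2:
            return 0.0
        if d <= abs(r1 - r2):
            return 4.0 / 3.0 * math.pi * min(r1, r2) ** 3
        return (math.pi * (r1 + r2 - d) ** 2
                * (d * d + 2.0 * d * (r1 + r2) - 3.0 * (r1 - r2) ** 2)
                / (12.0 * d))
    raise NotImplementedError("three- and four-ball overlaps are not sketched here")

# Three unit balls in a row: neighbours overlap, the two outer balls do not,
# so the complex has two edges and no triangles or tetrahedra.
balls = [((0.0, 0.0, 0.0), 1.0), ((1.5, 0.0, 0.0), 1.0), ((3.0, 0.0, 0.0), 1.0)]
volume = restricted_volume(balls, edges=[(0, 1), (1, 2)],
                           triangles=[], tetrahedra=[], overlap=overlap)
print(volume)
```

Only five volume terms are evaluated here (three balls and two edges), whereas unrestricted inclusion-exclusion would consider all seven non-empty subsets of the three balls.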
An example of how this approach works is given on page 4 of the Liang et al. article in the Recommended Reading section below. A proof of correctness and derivation is also provided in the article. Surface area calculations, such as solvent-accessible surface area, which is often used to estimate the strength of interactions between a protein and the solvent molecules surrounding it, are made by a similar use of the α-complex.