Alternative conformations for HIV-1 protease. Tube representation of HIV-1 protease (PDB codes 4HVP and 1AID) bound to different inhibitors represented by spheres. The plasticity of the binding site of the protein allows the protease to change its shape in order to accommodate ligands with widely different shapes and volumes.

Applying the PCA procedure outlined earlier to a set of HIV-1 protease conformations sampled from simulation produces 1,782-dimensional principal components (PCs). The PCs have an intuitive physical interpretation in this case: the coordinates of each PC can be split into groups of 3 to obtain the (x,y,z) components for each of the 594 atoms (594 x 3 = 1,782). Each group is a 3-dimensional vector that points in the direction the corresponding atom would follow along that PC. In Figure 6 a), the per-atom components of the first PC have been superimposed on the structure in purple.
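The grouping of a PC into per-atom vectors can be sketched in a few lines of Python. This is only an illustration: the function names `split_per_atom` and `magnitudes` and the toy values are hypothetical, and the code assumes the PC stores coordinates in the order x1, y1, z1, x2, y2, z2, and so on.

```python
import math


def split_per_atom(pc):
    """Group a flat 3N-dimensional PC into N (x, y, z) displacement vectors.

    Assumes the coordinate layout x1, y1, z1, x2, y2, z2, ...
    """
    if len(pc) % 3 != 0:
        raise ValueError("PC length must be a multiple of 3")
    return [tuple(pc[i:i + 3]) for i in range(0, len(pc), 3)]


def magnitudes(vectors):
    """Length of each per-atom vector; larger values mean more motion."""
    return [math.sqrt(x * x + y * y + z * z) for x, y, z in vectors]


# Toy PC for 3 atoms (hypothetical values, not taken from the protease data).
toy_pc = [0.0, 0.6, 0.0,
          0.0, 0.0, 0.0,
          0.8, 0.0, 0.0]

per_atom = split_per_atom(toy_pc)
norms = magnitudes(per_atom)

# A 1,782-dimensional PC yields one vector for each of the 594 atoms.
n_atoms = len(split_per_atom([0.0] * 1782))
```

Drawing each vector as an arrow anchored at its atom gives a picture of the kind described for Figure 6 a), and the magnitudes indicate which regions move most along the PC.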

First mode of motion for the HIV-1 protease. a) The purple arrows are a convenient representation of the first PC, grouped every 3 coordinates (one (x,y,z) vector per atom) to indicate the linear path each atom would follow. Note that the "flaps" have the largest motion component, which is consistent with the simulation data. b) A reference structure (middle) can be interpolated along the first PC in a negative direction (left) or a positive one (right). Using only one degree of freedom, the flap motion can be approximated quite accurately.

Figure 6 b) shows the effect of interpolating HIV-1 protease conformations along the first PC, or first mode of motion. Starting from an aligned conformation from the original data set, multiples of the first PC can be added to produce interpolated conformations. Note that the first mode of motion corresponds mostly to the "opening" and "closing" of the flaps, as can be inferred from the relative magnitude of the first PC components in the flap region. Thus, interpolating in the direction of the first PC produces an approximation of this motion using only one degree of freedom. In this way, the complex dynamics of the system and its 1,782 apparent degrees of freedom have been approximated by just one, effectively reducing the dimensionality of the representation.
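Interpolation along a PC amounts to adding a scalar multiple of the PC to the reference coordinates. The following is a minimal sketch, assuming both the reference conformation and the PC are flat lists of 3N coordinates in the same atom order; the function name `interpolate_along_pc`, the amplitudes, and the toy data are hypothetical.

```python
def interpolate_along_pc(reference, pc, amplitude):
    """Return reference + amplitude * pc, coordinate by coordinate.

    A negative amplitude moves the structure one way along the mode,
    a positive amplitude the other way; amplitude 0 returns the reference.
    """
    if len(reference) != len(pc):
        raise ValueError("reference and PC must have the same length")
    return [r + amplitude * p for r, p in zip(reference, pc)]


# Toy system with 2 atoms (hypothetical values): the atoms move in
# opposite directions along y, loosely mimicking two flaps.
reference = [1.0, 0.0, 0.0,
             2.0, 0.0, 0.0]
first_pc = [0.0, 1.0, 0.0,
            0.0, -1.0, 0.0]

negative = interpolate_along_pc(reference, first_pc, -0.5)
positive = interpolate_along_pc(reference, first_pc, 0.5)
```

The single number `amplitude` is the one degree of freedom referred to above: sweeping it over a range of values generates a family of conformations along the first mode.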

The residual variance (solid line) and percentage of overall variance explained (dashed line) after each principal component.

Figure 7 (solid line) shows the residual variance, that is, the variance left unexplained when the lower-ranked principal components are discarded. The residual variance can only decrease as more PCs are retained. In this case, the first PC alone accounts for approximately 40% of the total variance of the motion (the dashed line shows the percentage of total variance explained up to the given PC), and the first 10 PCs account for more than 70% of the total data variance. Given the dominance of only a few degrees of freedom, it is possible to represent the flexibility of the protein in a drastically reduced search space.
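Both curves can be computed from the variance captured by each PC, which in PCA is the corresponding eigenvalue of the covariance matrix. The sketch below assumes those eigenvalues are already available; the function name `variance_profile` and the toy eigenvalues are hypothetical and are not the values behind Figure 7.

```python
def variance_profile(eigenvalues):
    """Cumulative explained and residual variance, as percentages.

    eigenvalues: variance captured by each PC (any order; sorted here
    from largest to smallest, which is the usual PC ranking).
    """
    total = sum(eigenvalues)
    if total <= 0:
        raise ValueError("total variance must be positive")
    explained, residual = [], []
    running = 0.0
    for ev in sorted(eigenvalues, reverse=True):
        running += ev
        explained.append(100.0 * running / total)
        residual.append(100.0 * (total - running) / total)
    return explained, residual


# Toy spectrum (hypothetical): the first PC carries 40% of the variance.
toy_eigenvalues = [4.0, 3.0, 2.0, 1.0]
explained, residual = variance_profile(toy_eigenvalues)
```

Because eigenvalues of a covariance matrix are non-negative, the explained curve never decreases and the residual curve never increases, and the two always sum to 100%.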

Non-linear methods

PCA is a linear method: mathematically, the PCs are computed as a series of linear operations on the input coordinates. Linear methods such as PCA work well only when the collective atomic motions are small (or approximately linear), which is hardly the case for most interesting biological processes. Non-linear dimensionality reduction methods do exist, but they are normally much more computationally expensive and have other disadvantages as well. In exchange, non-linear methods can describe complex processes much more effectively, using far fewer parameters.

Source:  OpenStax, Geometric methods in structural computational biology. OpenStax CNX. Jun 11, 2007 Download for free at http://cnx.org/content/col10344/1.6