Augmentation is an application of depth-first search that begins with the list of seed matches. Assuming that there are more than four motif points, we must find correspondences within the target for the motif points that remain unmatched. Interpret the list of seed matches as a stack of partially complete matches. Pop the first match off the stack and, using the lRMSD alignment of this match, plot the position p of the next unmatched motif point relative to the aligned orientation of the motif. In the spherical region V around p, identify all target points inside V that are compatible with this motif point. For each such target point, compute the lRMSD alignment of all correlated points, including the new correlation between the motif point and the target point. If the new alignment satisfies our first two criteria and there are no more unmatched motif points, put this match into a heap that maintains the match with the smallest lRMSD. If there are more unmatched motif points, put this partial match back onto the stack. Continue to test correlations in this manner until V contains no more target points that satisfy our criteria. Then return to the stack and begin again by popping off the first match, repeating this process until the stack is empty.
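The control flow of the augmentation step can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the published implementation. The names `augment`, `translation_align`, `compatible`, `radius` and `max_lrmsd` are hypothetical. The single `max_lrmsd` threshold stands in for the acceptance criteria mentioned in the text. The alignment helper only superimposes centroids, whereas a true lRMSD alignment would also optimize the rotation.

```python
import heapq
import math
from itertools import count


def translation_align(motif, target):
    """Build an align(pairs) callable that superimposes centroids only.

    Simplified stand-in (assumption): a real lRMSD alignment would also
    find the optimal rotation between the two point sets.
    """
    def align(pairs):
        n = len(pairs)
        # Mean displacement from matched motif points to their target points.
        shift = tuple(
            sum(target[j][k] - motif[i][k] for i, j in pairs) / n
            for k in range(3)
        )

        def transform(point):
            return tuple(point[k] + shift[k] for k in range(3))

        squared = sum(
            math.dist(transform(motif[i]), target[j]) ** 2 for i, j in pairs
        )
        return math.sqrt(squared / n), transform

    return align


def augment(seed_matches, motif, target, align, compatible, radius, max_lrmsd):
    """Depth-first augmentation of seed matches.

    Each match is a list of (motif_index, target_index) pairs.
    Returns complete matches as (lrmsd, match) tuples, best first.
    """
    stack = list(seed_matches)   # partially complete matches
    heap = []                    # complete matches, smallest lRMSD on top
    tie = count()                # tie-breaker so the heap never compares matches

    while stack:
        match = stack.pop()
        used_motif = {i for i, _ in match}
        used_target = {j for _, j in match}
        unmatched = [i for i in range(len(motif)) if i not in used_motif]

        lrmsd, transform = align(match)
        if not unmatched:        # a seed that is already complete
            heapq.heappush(heap, (lrmsd, next(tie), match))
            continue

        nxt = unmatched[0]
        p = transform(motif[nxt])    # predicted position of the next motif point

        for j, t in enumerate(target):
            if j in used_target or not compatible(nxt, j):
                continue
            if math.dist(p, t) > radius:      # outside the spherical region V
                continue
            candidate = match + [(nxt, j)]
            cand_lrmsd, _ = align(candidate)
            if cand_lrmsd > max_lrmsd:        # assumed acceptance criterion
                continue
            if len(candidate) == len(motif):
                heapq.heappush(heap, (cand_lrmsd, next(tie), candidate))
            else:
                stack.append(candidate)

    return [(score, found) for score, _, found in sorted(heap)]
```

Passing the alignment routine in as a callable keeps the search logic separate from the superposition method, so a rotation-aware alignment could be substituted without touching the stack and heap handling.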
Structural similarity is important to functional annotation only if a strong correlation exists between identifiably significant structural similarity and functional similarity. The existence of a match alone does not guarantee functional similarity, but lRMSD can be a differentiating factor. If matches of homologous proteins represent structural similarity that is statistically significant relative to what is expected by random chance, we can differentiate on lRMSD, provided that we can evaluate the statistical significance of the lRMSD of a match.
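One simple way to make "statistical significance of the lRMSD" concrete is an empirical p-value: the fraction of a background sample of lRMSD values that are at least as small as the observed one. The sketch below is a generic illustration under that assumption and is not the model used by any of the methods discussed here. The function name `empirical_p_value` and the background sample are hypothetical.

```python
from bisect import bisect_right


def empirical_p_value(observed_lrmsd, background_lrmsds):
    """Estimate how often chance matches score as well as the observed match.

    background_lrmsds is an assumed sample of lRMSD values from matches
    between the motif and unrelated structures. Smaller lRMSD means a
    better match, so we count background values at or below the observation.
    """
    ordered = sorted(background_lrmsds)
    at_or_below = bisect_right(ordered, observed_lrmsd)
    # Add-one correction so the estimate is never exactly zero.
    return (at_or_below + 1) / (len(ordered) + 1)
```

A small p-value would suggest that an lRMSD this low is rarely produced by chance, which is the kind of evidence needed before a match is used for functional annotation.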
BLAST first calculated the statistical significance of sequence matches with a combinatorial model of the space of similar sequences. Determining the statistical significance of structural matches has also been attempted. For the PINTS database, modeling was applied to estimate the probability of a structural match given a particular lRMSD. An artificial distribution was parameterized by motif size and amino acid composition in order to fit a given data set, and the p-value was calculated relative to that distribution. Another approach was taken in the algorithm JESS, which uses comparative analysis to generate a significance score relative to a specific population of known motifs. Both methods have some disadvantages. The artificial models of PINTS are not parameterized by the geometry of motifs and, all else being equal, produce identical distributions for motifs of different geometry. JESS, on the other hand, depends on a set of known motifs; should this set change, all significance scores would have to be revised.
Local structural alignment methods operate on the assumption that local structural and chemical similarity implies functional similarity. A statistical model has been developed that can be used to identify the degree of similarity sufficient to follow this implication. Given a match m with lRMSD r between motif S and target T, exactly one of two hypotheses must hold: