Markovian State Models (MSMs) are roadmaps constructed by running many molecular simulations (Monte Carlo or molecular dynamics) and merging the resulting trajectories. The method starts with a simulation that runs from the folded state to the unfolded state. It then picks a structure at random (call it c) from this trajectory and runs a new simulation starting from c. If this new simulation reaches the unfolded state, then the next trajectory from which a structure is picked consists of the old trajectory from the folded state to c followed by the new trajectory from c to the unfolded state. If the new simulation reaches the folded state instead, we do the opposite: the new trajectory replaces the part between the folded state and c, and the old trajectory from c to the unfolded state is kept. If neither happens in a reasonable time, we reject the new trajectory and start over. This is repeated a set number of times. Each time a trajectory is accepted, all states from the new part of the trajectory are added to the growing roadmap as nodes, and each transition in the new part is added as an edge. When it is added, each edge is labeled with a transition probability of 1 and a transition time equal to the timestep of the simulation.
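The bookkeeping in this procedure can be illustrated with a minimal sketch. It is not the authors' implementation: the function names add_segment and splice, the string state labels, and the timestep value are all hypothetical, and the simulation itself is replaced by a hard-coded new segment. Reversing a segment that ends in the folded state, so that the reference path still reads from folded to unfolded, is likewise an assumption of this sketch.

```python
from collections import defaultdict


def add_segment(roadmap, segment, timestep):
    """Record every consecutive transition in `segment` as a labeled edge."""
    for src, dst in zip(segment, segment[1:]):
        # Each observed transition is labeled with probability 1 and one timestep.
        roadmap[(src, dst)].append((1.0, timestep))


def splice(current, c_index, new_segment, reached):
    """Return the next reference trajectory, ordered folded -> unfolded.

    `new_segment` must start at the structure current[c_index].
    `reached` is "unfolded", "folded", or anything else (rejected).
    """
    if reached == "unfolded":
        # Keep folded -> c from the old path, then follow the new segment.
        return current[:c_index + 1] + new_segment[1:]
    if reached == "folded":
        # Assumption: reverse the new segment so it reads folded -> c,
        # then keep c -> unfolded from the old path.
        return new_segment[::-1] + current[c_index + 1:]
    return current  # neither end state reached: reject the new segment


# Toy example with hypothetical state labels ("F" folded, "U" unfolded).
roadmap = defaultdict(list)
path = ["F", "a", "c", "b", "U"]
add_segment(roadmap, path, timestep=2.0)

c_index = path.index("c")            # a real run would pick this at random
new_segment = ["c", "x", "y", "U"]   # stands in for a new simulation from c
add_segment(roadmap, new_segment, timestep=2.0)
path = splice(path, c_index, new_segment, reached="unfolded")
print(path)
```

In a real run the states would be conformations (or indices into a conformation store) rather than strings, and rejected segments would not be passed to add_segment.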
The goal of this method is to move roadmap methods closer to MD and MC simulations, and in particular to incorporate a notion of time. Time enters naturally because new conformations are sampled by simulation, so every edge corresponds to a known simulation timestep.
Once all of the simulations have been run, nodes that are within some cutoff distance of each other, according to some similarity metric, must be merged. To merge two nodes, one node is removed and its edges are transferred to the other node. If this results in two edges between the same pair of nodes, the two edges are combined into a single edge with a combined transition probability and transition time. Once all merges are complete, the transition probabilities are normalized so that each lies in the range [0,1] and the outgoing edge probabilities from every node sum to 1.
Given the roadmap, Pfold values and folding times can be calculated from the edge probabilities and transition times. The approach for folding times is the same as with SRS: a system of equations is set up, but instead of Pfold, the unknown at each node i is the expected time for a simulation starting at node i to reach a folded state, called the mean first passage time (MFPT). As with SRS, the system of equations is solved using standard numerical methods.
In tests on a small protein, the tryptophan zipper beta hairpin (TZ2), the predicted folding rates agreed well with experiment.
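The merge, normalization, and MFPT steps described above can be sketched as follows. This is an illustration under stated assumptions, not the published implementation. In particular, the rule used here for combining parallel edges (sum the probabilities, take the probability-weighted average of the times) is an assumption of this sketch, and the function names merge_parallel, normalize_rows, and mfpt are hypothetical. The MFPT system is written in one standard form, tau_i = sum over j of P_ij (t_ij + tau_j), with tau fixed at 0 on folded nodes.

```python
import numpy as np


def merge_parallel(labels):
    """Combine parallel edges given as (probability, time) pairs.

    Assumption of this sketch: probabilities are summed and the time is
    the probability-weighted average of the individual times.
    """
    p = sum(pi for pi, _ in labels)
    t = sum(pi * ti for pi, ti in labels) / p
    return p, t


def normalize_rows(W):
    """Scale each row of the weight matrix so outgoing probabilities sum to 1."""
    sums = W.sum(axis=1, keepdims=True)
    # Rows with no outgoing edges are left as all zeros.
    return np.divide(W, sums, out=np.zeros_like(W), where=sums > 0)


def mfpt(P, T, folded):
    """Solve tau_i = sum_j P_ij * (T_ij + tau_j), with tau = 0 on folded nodes."""
    n = P.shape[0]
    folded_set = set(folded)
    free = [i for i in range(n) if i not in folded_set]
    # Move the unknowns to the left-hand side: (I - P_free) tau = b.
    A = np.eye(len(free)) - P[np.ix_(free, free)]
    b = (P * T).sum(axis=1)[free]  # expected time of the next step
    tau = np.zeros(n)
    tau[free] = np.linalg.solve(A, b)
    return tau


# Toy roadmap: node 0 (unfolded) <-> node 1 (intermediate) -> node 2 (folded).
W = np.array([[0.0, 1.0, 0.0],
              [1.0, 0.0, 1.0],
              [0.0, 0.0, 0.0]])
T = np.ones_like(W)           # every edge takes one timestep
P = normalize_rows(W)
tau = mfpt(P, T, folded=[2])
print(tau)
```

For this toy roadmap the intermediate node steps to the folded node half the time, so the solve gives an MFPT of 3 timesteps from node 1 and 4 timesteps from node 0. The linear solve requires that every non-folded node can reach a folded node; otherwise the matrix is singular.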
An important caveat for all roadmap methods that attempt to extrapolate properties of the entire protein folding landscape is that they carry inherent sampling error. The energy landscape of a protein is a continuous function, which roadmap methods approximate through discrete sampling. The researchers who developed the MSM method also developed a method to estimate the error in the folding rates computed from MSMs. While a complete description is beyond the scope of this module, the details are available in the 2005 paper by Singhal and Pande linked in the Recommended Reading section below. The error analysis allows them to generate a probability distribution for the folding time of each node in the MSM. This is useful in its own right, because it indicates how much confidence can be placed in the folding times produced by a given MSM, and it is especially useful for deciding where to focus further sampling while an MSM is being generated.