The Stanford University Folding@home project (2) uses an atomistic model of protein folding in a solvent environment. Its large-scale distributed computing reaches timescales thousands to millions of times longer than were previously achievable with a model of this detail. Look at the menu on the left border of the Stanford Folding@home web page. Click on the "Science" link to read the scientific background behind the protein folding distributed computing project.
What are the 3 functions of proteins that are mentioned in the "What are proteins?" section of the scientific background?
What are 3 diseases that are believed to result from protein misfolding?
What are typical timescales for molecular dynamics simulations?
What are typical timescales at which the fastest proteins fold?
How does the Stanford group break the microsecond barrier with their simulations?
Return to the Stanford Folding@home home page. Click on the "Results" link in the left border of the web page. Look at the information on the folding simulations of the villin headpiece.
How many amino acids are in the simulated villin headpiece?
How does this compare with the number of amino acids in a typical protein?
Taking into consideration the size of the biological molecules in these simulations and the requirements that necessitated using large scale distributed computing methods for the simulations, what are the biggest impediments to understanding the protein folding problem?
Although attempts at predicting tertiary and quaternary structure from the amino acid sequence of proteins are relatively new, methods for predicting protein secondary structure have existed for some time. Depending on the method, secondary structure predictions can be performed with approximately 60-70% accuracy. Originally, empirical prediction methods were based on tables that listed each amino acid and the frequency with which it was found in alpha-helices, beta-sheets, turns and random coil. Currently, prediction methods usually employ machine learning in the form of neural networks that are trained on sets of sequences with known structure. In these cases, the accuracy of the method depends critically on the selection of the training set. However, given the ever-increasing number of known structural folds, selecting a representative training set that includes many proteins of diverse structure has become easier.
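The table-based idea described above can be illustrated with a minimal Python sketch. It is not a reimplementation of any published method: the function names, the window size, and the two tiny labeled sequences are hypothetical and chosen only to show the mechanics. A propensity is computed here as the frequency of a residue within a structural state divided by its overall frequency, and each position is assigned the state with the highest average propensity in a small window.

```python
from collections import Counter

def propensity_table(sequences, structures):
    """Build {state: {residue: propensity}} from paired sequence/state strings.

    propensity = P(residue | state) / P(residue); values above 1 mean the
    residue is seen in that state more often than expected by chance.
    """
    residue_counts, state_counts, pair_counts = Counter(), Counter(), Counter()
    for seq, states in zip(sequences, structures):
        if len(seq) != len(states):
            raise ValueError("sequence and structure strings must have equal length")
        for aa, st in zip(seq, states):
            residue_counts[aa] += 1
            state_counts[st] += 1
            pair_counts[(st, aa)] += 1
    total = sum(residue_counts.values())
    table = {}
    for (st, aa), n in pair_counts.items():
        p_aa_given_state = n / state_counts[st]
        p_aa = residue_counts[aa] / total
        table.setdefault(st, {})[aa] = p_aa_given_state / p_aa
    return table

def predict(seq, table, window=3):
    """Assign each position the state with the highest mean propensity
    over a window centred on it (residue/state pairs never observed score 0)."""
    states = sorted(table)
    half = window // 2
    prediction = []
    for i in range(len(seq)):
        segment = seq[max(0, i - half): i + half + 1]
        best = max(
            states,
            key=lambda st: sum(table[st].get(aa, 0.0) for aa in segment) / len(segment),
        )
        prediction.append(best)
    return "".join(prediction)

# Hypothetical toy training data, NOT real structures: H = helix, E = strand.
toy_sequences = ["AALLKKMM", "VVIIYYTT"]
toy_structures = ["HHHHHHHH", "EEEEEEEE"]
toy_table = propensity_table(toy_sequences, toy_structures)
print(predict("ALKM", toy_table))
print(predict("VIYT", toy_table))
```

A real table would be derived from many proteins of known structure, which is why the choice of that reference set matters so much for accuracy.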
Use the amino acid sequence below to explore some structure prediction tools. This is the sequence of the lac repressor, a protein involved in gene regulation that is known to have both alpha-helical and beta-sheet structure:
>gi|33112645|sp|P03023|LACI_ECOLI Lactose operon repressor
MKPVTLYDVAEYAGVSYQTVSRVVNQASHVSAKTREKVEAAMAELNYIPNRVAQQLAGKQSLLIGVATSSLALHAPSQIVAAIKSRADQLGASVVVSMVERSGVEACKAAVHNLLAQRVSGLIINYPLDDQDAIAVEAAC
TNVPALFLDVSDQTPINSIIFSHEDGTRLGVEHLVALGHQQIALLAGPLSSVSARLRLAGWHKYLTRNQIQPIAEREGDWSAMSGFQQTMQMLNEGIVPTAMLVANDQMALGAMRAITESGLRVGADISVVGYDDTEDSS
CYIPPLTTIKQDFRLLGQTSVDRLLQLSQGQAVKGNQLLPVSLVKRKTTLAPNTQTASPRALADSLMQLARQVSRLESGQ
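Before pasting a sequence into a prediction tool, it can help to check that the record has been copied intact. The following is a minimal Python sketch that parses FASTA-formatted text and reports the sequence length and amino acid composition. The function names are hypothetical, and the example record is deliberately truncated to the first 12 residues of the sequence above for brevity; substitute the full record to check your own copy.

```python
from collections import Counter

def read_fasta(text):
    """Parse FASTA-formatted text into a list of (header, sequence) tuples."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            # Sequence lines are joined, so line wrapping does not matter.
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records

def composition(seq):
    """Return the fraction of the sequence made up by each residue."""
    counts = Counter(seq)
    return {aa: n / len(seq) for aa, n in counts.items()}

# Truncated example: only the first 12 residues, split over two lines.
example = """>gi|33112645|sp|P03023|LACI_ECOLI Lactose operon repressor
MKPVTL
YDVAEY
"""
records = read_fasta(example)
name, sequence = records[0]
print(name)
print(len(sequence), "residues")
for aa, frac in sorted(composition(sequence).items()):
    print(aa, round(frac, 3))
```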