The Protein Data Bank (PDB) is a public-domain repository of experimentally determined three-dimensional structures of biological macromolecules. Most of these structures were determined by X-ray crystallography, but the number determined by nuclear magnetic resonance (NMR) methods is rising. A small number of theoretical models are also included in the PDB. The PDB was originally established at Brookhaven National Laboratory (1) in October 1971 with 7 structures. It is currently managed by Rutgers, The State University of New Jersey (2); the San Diego Supercomputer Center at the University of California, San Diego; and the Center for Advanced Research in Biotechnology/UMBI/NIST. It stores over 29,000 structures. The European Bioinformatics Institute Macromolecular Structure Database group (UK) and the Protein Research Institute at Osaka University, Japan, are international contributors to the contents of the PDB.
The name Protein Data Bank is historical, since the present-day PDB also includes many DNA and RNA structures. The most important information in any given PDB file is the set of 3-dimensional vectors giving the coordinates of each atom in the biological molecule(s) included in the structure. These coordinates can be fed into various graphics programs that let the scientist view a 3-dimensional model of the molecule. An example of one of these models is the molecule of the month, which can be viewed by clicking on the link in the left-hand blue border of the PDB home page.
What is the featured molecule of the month today?
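To make the idea of atomic coordinates concrete, here is a minimal Python sketch of how a program might read them from the text of a PDB file. It assumes the fixed-column layout of the legacy PDB text format, in which the x, y, and z values of each ATOM or HETATM record occupy columns 31-38, 39-46, and 47-54. The two sample records are hypothetical and are not taken from any real entry; the names `SAMPLE_RECORDS` and `parse_coordinates` are illustrative only.

```python
import math

# Two hypothetical ATOM records in the fixed-column PDB layout.
# They are made up for illustration and do not come from a real entry.
SAMPLE_RECORDS = [
    "ATOM      1  N   ALA A   1      11.104   6.134  -6.504  1.00  0.00           N",
    "ATOM      2  CA  ALA A   1      11.639   6.071  -5.147  1.00  0.00           C",
]


def parse_coordinates(lines):
    """Return a list of (atom_name, x, y, z) tuples from PDB coordinate records."""
    atoms = []
    for line in lines:
        # Only ATOM and HETATM records carry atomic coordinates.
        if line.startswith(("ATOM", "HETATM")):
            atom_name = line[12:16].strip()  # columns 13-16
            x = float(line[30:38])           # columns 31-38
            y = float(line[38:46])           # columns 39-46
            z = float(line[46:54])           # columns 47-54
            atoms.append((atom_name, x, y, z))
    return atoms


atoms = parse_coordinates(SAMPLE_RECORDS)
for name, x, y, z in atoms:
    print(name, x, y, z)

# Because each atom is a 3-dimensional vector, the distance between two
# atoms (in angstroms) is the ordinary Euclidean distance.
distance = math.dist(atoms[0][1:], atoms[1][1:])
print(f"N to CA distance: {distance:.2f} angstroms")
```

Slicing by column position, rather than splitting on whitespace, is the safer choice for this format, because neighboring fields can run together with no space between them.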
The PDB search engine asks the user to "Enter a PDB ID or keyword". The PDB ID, which is also referred to in journal articles as the PDB accession code, is a 4-character alphanumeric ID code assigned to the structure coordinate file when it is deposited into the PDB by the experimentalist who solved the structure. For instance, the PDB ID for a 2.2 angstrom crystal structure of the protein calmodulin is 3CLN. Perform a search using 3CLN as the query. The result is a summary page that lists the method of structure determination, as well as the authors of the structure. It is not uncommon for the authors of the primary citation to differ somewhat from the authors of the structure: the former are the writer(s) of the article where the new structure first appeared, and the latter are the experimentalist(s) who determined the structure of the deposited molecule(s).
The compound field identifies the common name of the protein or nucleic acid molecule in the structure, and the source field identifies the genus and species of the organism from which this molecule is derived, which in this case is a rat. At the bottom of the summary is a table entitled HET groups. HET (heteroatom) refers to any atom that is not part of the biological molecule(s) in the structure. These are often ligands, which are molecules that commonly bind the particular protein or nucleic acid in the structure. In this case, there is calcium in the structure, which may not be surprising, given that the classification listed for this protein in the summary is "Calcium Binding Protein". The formula column of this table gives the chemical formula of the ligand.
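The two ideas above, the 4-character PDB ID and the HET groups, can be sketched in a few lines of Python. This is only an illustration under stated assumptions. The ID check tests nothing beyond the "4 alphanumeric characters" description given here, so it cannot tell whether an entry actually exists. The HET group listing assumes the legacy PDB text format, in which heteroatoms appear as HETATM records with the residue name in columns 18-20. The sample records are hypothetical, not copied from 3CLN, and the function names are illustrative.

```python
import re

# Matches exactly four letters or digits, following the description in the text.
PDB_ID_PATTERN = re.compile(r"[A-Za-z0-9]{4}")


def looks_like_pdb_id(text):
    """Return True if text has the shape of a PDB ID (4 alphanumeric characters)."""
    return PDB_ID_PATTERN.fullmatch(text) is not None


def het_group_names(lines):
    """Return the sorted, unique residue names found in HETATM records."""
    names = set()
    for line in lines:
        if line.startswith("HETATM"):
            names.add(line[17:20].strip())  # residue name, columns 18-20
    return sorted(names)


# Hypothetical records: one protein atom, one calcium ion, one water molecule.
SAMPLE_RECORDS = [
    "ATOM      1  N   ALA A   1      11.104   6.134  -6.504  1.00  0.00           N",
    "HETATM 1001 CA    CA A 201      10.000  20.000  30.000  1.00  0.00          CA",
    "HETATM 1002  O   HOH A 301      12.000  22.000  32.000  1.00  0.00           O",
]

print(looks_like_pdb_id("3CLN"))
print(het_group_names(SAMPLE_RECORDS))
```

In these sample records the calcium ion carries the residue name CA and the water molecule carries HOH, while the ATOM record for the protein itself is skipped.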