The level of relatedness of a set of sequences, therefore, directly affects which scoring matrix is most appropriate for aligning the set, whether it is a PAM or a BLOSUM matrix. Comparisons of closely related sequences should use BLOSUM matrices with higher numbers and PAM matrices with lower numbers. Conversely, BLOSUM matrices with low numbers and PAM matrices with high numbers are preferable for comparisons of distantly related proteins. Nevertheless, a single matrix may be reasonably efficient over a relatively broad range of evolutionary change. The BLOSUM 62 matrix was chosen as the default for BLAST as a result of an analysis by Henikoff and Henikoff, in which BLOSUM 62 detected more distant relationships in a BLAST search, and produced alignments of diverged proteins in better agreement with three-dimensional structures, than did the corresponding PAM 60 matrix. The BLOSUM series does not include any matrices suitable for very short query sequences, so in these cases the PAM matrices may be used instead. Berkeley has a Matrix Information website with a provisional table of recommended substitution matrices and gap costs for shorter sequences.

Now, take a look at some scoring matrices. A PAM Matrix website sponsored by Wageningen University, in the Netherlands, allows online computation of PAM matrices. The default value is a PAM 250 matrix; calculate this matrix and look at the results. This PAM 250 matrix has a built-in gap penalty of -8, as seen in the * column. There are 24 rows and 24 columns. The first 20 are the standard amino acids, represented by their one-letter codes. B represents ambiguity between aspartate and asparagine, and Z represents ambiguity between glutamate and glutamine. X represents an unknown or nonstandard amino acid.
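To make the matrix layout described above concrete, here is a minimal Python sketch that reads a matrix written as whitespace-delimited rows and columns, collects the identity (diagonal) scores, and scores an ungapped pair of sequences. The four-symbol matrix, the function names `parse_matrix` and `score_ungapped`, and all of the score values are illustrative assumptions; they are not taken from the website's output and are not real PAM 250 values.

```python
# Toy matrix in a whitespace-delimited row/column layout.
# The scores are illustrative placeholders, NOT real PAM 250 values.
TOY_MATRIX = """\
   A  R  N  *
A  2 -2  0 -8
R -2  6  0 -8
N  0  0  2 -8
* -8 -8 -8  1
"""


def parse_matrix(text):
    """Return (column symbols, {(row, col): score}) for a matrix table."""
    lines = [ln for ln in text.splitlines()
             if ln.strip() and not ln.startswith("#")]
    columns = lines[0].split()          # header row: one symbol per column
    scores = {}
    for line in lines[1:]:
        fields = line.split()
        row = fields[0]                 # first field is the row symbol
        for col, value in zip(columns, fields[1:]):
            scores[(row, col)] = int(value)
    return columns, scores


def score_ungapped(seq_a, seq_b, scores):
    """Sum the pairwise scores of two equal-length, already aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same length")
    return sum(scores[(a, b)] for a, b in zip(seq_a, seq_b))


symbols, scores = parse_matrix(TOY_MATRIX)

# Identity scores sit on the diagonal (row symbol equals column symbol).
diagonal = {s: scores[(s, s)] for s in symbols}
print(diagonal)

print(score_ungapped("ARN", "ARN", scores))  # identical residues
print(score_ungapped("ARN", "RAN", scores))  # two mismatches, one identity
```

Swapping in a different matrix text and rescoring the same pair is a quick way to see how strongly the choice of matrix changes the total.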

In the PAM 250 matrix, where can the highest scores for each amino acid be found? Why?

Would this be true for any scoring matrix?

What row and column combination gives the highest score? (Specify the score value.)

What is the second highest score? (Specify the score value.)

Why are some scores for amino acid identities higher than others?

Use the back button on the browser, and calculate a PAM 100 matrix. Are the two highest scoring matches the same combination of row and column as in the PAM 250 matrix? (Discuss in a sentence or two.)

What is the gap penalty?

Explain any differences in the gap penalties of the PAM 250 matrix versus the PAM 100 matrix.

To get an idea of how the scoring matrix influences an alignment, perform the following exercise using the Biology Workbench. The Workbench requires registration (it is free), and access is granted immediately once a password has been registered. Enter the site, and scroll down the page until the five menu buttons are visible.

The "Session Tools" button allows the naming of a session, so that different jobs in progress can be saved under distinct sessions. Select "Session Tools", then select "Start New Session" and click on "Run" to change the name of "Default Session" to a new name. The session remains after the Workbench has been exited. To recall it later, click on the dot to the left of the session name under the "Session Tools" menu, and then select "Resume Session". The Workbench policy at the time of this writing is that old jobs are deleted only when an account has not been accessed for 6 months.

This tutorial will use sequences of hemoglobins (Hbs) from different organisms to illustrate the properties of scoring matrices. Choose the "Protein Tools" menu button, then choose "Ndjinn Multiple Database Search" from the menu at the bottom of the page. Biology Workbench has a large number of databases to search. For this exercise, click in the box to the left of the database description to choose the "PDBFINDER" database. Search the PDBFINDER database by typing the PDB ID codes below into the search box at the top of the page. Import the sequences with the following PDB ID codes (use the OR operator between each PDB ID code to search for all of the records in the same search):

  • 1T1N from Trematomus newnesi (antarctic fish)
  • 1SPG from Leiostomus xanthurus (spot croaker)
  • 1QSI from Homo sapiens (human)
  • 1IWH from Equus caballus (horse)
  • 1HV4 from Anser indicus (goose)
  • 1HBR from Gallus gallus (chicken)
  • 1H97 from Paramphistomum epiclitum (trematode)
  • 1GVH from Escherichia coli (enterobacteria)
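As a small illustration of combining the codes with the OR operator, the following Python sketch assembles a search string from the eight PDB ID codes listed above. The helper name `build_or_query` is hypothetical, and the exact query syntax that the Workbench search box accepts is assumed here to be the codes separated by the word OR, as the instructions describe.

```python
# PDB ID codes from the list above.
PDB_IDS = ["1T1N", "1SPG", "1QSI", "1IWH", "1HV4", "1HBR", "1H97", "1GVH"]


def build_or_query(ids):
    """Join ID codes with ' OR ' (assumed syntax for the search box)."""
    return " OR ".join(ids)


query = build_or_query(PDB_IDS)
print(query)
```

The printed string can be pasted into the search box in place of typing each code by hand.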
The import function in the Workbench requires checking the boxes for all the PDB ID codes that were returned, then clicking the import button. Several subunits will be returned with most of these sequences, and some are duplicate sequences, so delete the following chains by clicking the box to the left of the ID code and selecting "Delete Protein Sequence(s)" from the pull-down menu at the bottom of the page:
  • 1HV4_C
  • 1HV4_D
  • 1HV4_E
  • 1HV4_F
  • 1HV4_G
  • 1HV4_H
  • 1HBR_C
  • 1HBR_D
  • 1H97_B
  • 1QSI_C
  • 1QSI_D
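For readers who keep their own record of the imported chains, the deletion step above amounts to a simple filter. The Python sketch below is only an illustration: the function name `keep_chains` and the `imported_example` list are hypothetical, and the example list is not the actual set of chains the Workbench returns.

```python
# Chain IDs to delete, copied from the list above.
CHAINS_TO_DELETE = {
    "1HV4_C", "1HV4_D", "1HV4_E", "1HV4_F", "1HV4_G", "1HV4_H",
    "1HBR_C", "1HBR_D", "1H97_B", "1QSI_C", "1QSI_D",
}


def keep_chains(imported, to_delete):
    """Return the imported chain IDs that are not marked for deletion."""
    return [chain for chain in imported if chain.upper() not in to_delete]


# Hypothetical subset of imported chain IDs, for illustration only.
imported_example = ["1HV4_A", "1HV4_B", "1HV4_C", "1HBR_A", "1HBR_D",
                    "1H97_A", "1H97_B"]

kept = keep_chains(imported_example, CHAINS_TO_DELETE)
print(kept)
```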
After the above sequences have been deleted, choose "Select All Sequence(s)" from the pull-down menu. Analyze the relatedness of this group of sequences by selecting "ClustalW" from the pull-down menu to perform a multiple alignment and draw a rooted cladogram. When the ClustalW page appears, before submitting the alignment, scroll down the page and change the "Guide tree display:" to "Rooted".

Source:  OpenStax, Bios 533 bioinformatics. OpenStax CNX. Sep 24, 2008 Download for free at http://cnx.org/content/col10152/1.16