This course is a short series of lectures on Statistical Bioinformatics. Topics covered are listed in the Table of Contents. The notes were prepared by Ewa Paszek, Lukasz Wita and Marek Kimmel. The development of this course has been supported by NSF grant 0203396.

Data analysis.

After scanning, a grid must be placed on the image and the spots representing the arrayed genes must be identified. The background fluorescence is calculated locally for each spot and is subtracted from the hybridization intensities. Differentially expressed genes are identified by comparing the fluorescence intensities of the control and experimental probes hybridized to each spot (Freeman et al., 2000; Bowtell, 1999; Knudsen, 2002). Typically, the experimental target sequences are labeled with Cy5, which fluoresces red (667 nm), and control targets are labeled with Cy3, which fluoresces green (568 nm). The ratio of red to green signal can then be used as a measure of the effect of the experimental treatment on the expression of each gene. A ratio of 1 (yellow spot) indicates no change in the expression level between experimental and control samples, a ratio greater than 1 (red spot) indicates increased transcription in the experimental sample, and a ratio less than 1 (green spot) indicates decreased transcription in the experimental sample. A scatter plot is a very useful representation of the expression data: the signal intensities of the experimental and control samples are plotted along the x- and y-axes, so that the ratio for each spot appears as its distance from the diagonal (Schena, 2003). The diagonal separates spots with higher activity than the control sample from spots with lower activity than the control. From this plot one can easily choose points that represent a several-fold increase or decrease in gene expression and focus additional analyses on these genes.
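The background subtraction and ratio interpretation described above can be sketched in a few lines of Python. This is a minimal illustration, not the procedure of any particular image-analysis package: the function names, the `floor` parameter, the two-fold cutoff, and the spot intensities are all assumptions made for the example. The log2 transform is a common convention that makes induction and repression symmetric around zero; the text itself speaks only of the raw ratio.

```python
import math

def log2_ratio(cy5_signal, cy5_background, cy3_signal, cy3_background, floor=1.0):
    """Background-subtract both channels and return log2(Cy5 / Cy3).

    `floor` is a hypothetical lower bound that keeps the ratio defined
    when the local background is as bright as the spot itself.
    """
    red = max(cy5_signal - cy5_background, floor)      # experimental channel
    green = max(cy3_signal - cy3_background, floor)    # control channel
    return math.log2(red / green)

def classify(log_ratio, fold_threshold=2.0):
    """Label a spot at an assumed fold-change cutoff (two-fold by default)."""
    cutoff = math.log2(fold_threshold)
    if log_ratio >= cutoff:
        return "induced"       # ratio > 1, red spot
    if log_ratio <= -cutoff:
        return "repressed"     # ratio < 1, green spot
    return "unchanged"         # ratio near 1, yellow spot

# Made-up intensities: (Cy5 signal, Cy5 background, Cy3 signal, Cy3 background)
spots = {
    "gene_a": (4100.0, 100.0, 1100.0, 100.0),   # ratio 4.0
    "gene_b": (1100.0, 100.0, 1100.0, 100.0),   # ratio 1.0
    "gene_c": (350.0, 100.0, 2100.0, 100.0),    # ratio 0.125
}

results = {name: classify(log2_ratio(*values)) for name, values in spots.items()}
for name, label in results.items():
    print(name, label)
```

On a scatter plot of the two channels, the "unchanged" spots would lie near the diagonal, and the induced and repressed spots on either side of it.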

The hybridized microarray.

A hybridized microarray printed by the AECOM robot (Cheung et al., 1999). A 5550-gene mouse cDNA microarray was printed and hybridized to Cy3-dUTP and Cy5-dUTP probes from wild-type and mutant mouse cell lines and imaged using the AECOM laser scanner. Shown is one out of four pen-tip printing areas of the array.

With just one experimental condition and a control, the data analysis is limited to a list of regulated genes ranked by the fold-change or by the significance of the change as determined by a t-test. Data must be normalized before separate arrays can be compared. With multiple experimental conditions (e.g. time-points or drug doses), the genes are often grouped into clusters that behave similarly under the different conditions. Computational methods such as hierarchical clustering or k-means are used to analyze the massive amounts of data generated by these experiments. Gene clusters are visualized with trees or color-coded matrices by placing genes with similar patterns of expression into a clustered group (Figure 11). Image processing and analysis software is commercially available, and several packages are available as freeware.
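As a rough illustration of ranking genes by fold-change or by a t-test after normalization, the sketch below uses only the Python standard library. Several choices here are assumptions rather than anything the notes prescribe: normalization is done by centering each array on its median log ratio, the test is a one-sample t statistic on replicate log ratios (null hypothesis: mean log ratio of zero), and the replicate values are invented. Converting the t statistic to a p-value would require a t distribution from a statistics package and is left out.

```python
import math
import statistics

def median_center(array_log_ratios):
    """Subtract the array's median log ratio so separate arrays share a common center."""
    center = statistics.median(array_log_ratios.values())
    return {gene: value - center for gene, value in array_log_ratios.items()}

def one_sample_t(values):
    """t statistic for the null hypothesis that the mean log ratio is zero.

    Assumes at least two replicates that are not all identical.
    """
    standard_error = statistics.stdev(values) / math.sqrt(len(values))
    return statistics.mean(values) / standard_error

# Made-up log2(experimental / control) ratios from three replicate arrays.
arrays = [
    {"gene_a": 2.1, "gene_b": 0.1, "gene_c": -1.4, "gene_d": 0.3, "gene_e": -0.2},
    {"gene_a": 2.6, "gene_b": 0.3, "gene_c": -1.1, "gene_d": 0.5, "gene_e": 0.4},
    {"gene_a": 1.7, "gene_b": -0.1, "gene_c": -1.9, "gene_d": -0.3, "gene_e": -0.2},
]

normalized = [median_center(array) for array in arrays]
genes = list(arrays[0])
replicates = {gene: [array[gene] for array in normalized] for gene in genes}

mean_log_ratio = {gene: statistics.mean(values) for gene, values in replicates.items()}
t_statistic = {gene: one_sample_t(values) for gene, values in replicates.items()}

# Two alternative rankings: by size of the change, or by its consistency across replicates.
by_fold_change = sorted(genes, key=lambda g: abs(mean_log_ratio[g]), reverse=True)
by_t_statistic = sorted(genes, key=lambda g: abs(t_statistic[g]), reverse=True)

for gene in by_fold_change:
    print(gene, round(mean_log_ratio[gene], 2), round(t_statistic[gene], 2))
```

The two rankings need not agree: a gene with a modest but very reproducible change can outrank a gene with a larger but noisier change when sorted by the t statistic.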

Clustering of gene expression patterns.

(a) The expression ratio between experimental and control samples for individual genes is displayed using a color scale. Black indicates no change in expression, while an increase in the experimental relative to the control is shown as red, and a decrease in the experimental relative to the control is shown as green. Genes displaying similar patterns of induction or repression are clustered together. (b) Clustering of thousands of genes by patterns of gene induction or repression following a treatment (Campbell and Heyer, 2003).
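The idea of grouping genes with similar patterns of induction or repression can be sketched with a toy agglomerative (hierarchical) clustering written in plain Python. This is an illustrative sketch under stated assumptions, not the algorithm behind the figure: the distance measure (1 minus Pearson correlation), the average-linkage rule, the function names, and the four-gene time course are all chosen for the example. Real analyses of thousands of genes would use a dedicated clustering package.

```python
import math

def pearson_distance(x, y):
    """1 - Pearson correlation: small when two profiles rise and fall together.

    Assumes neither profile is constant (a constant profile has zero variance).
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return 1.0 - covariance / (norm_x * norm_y)

def average_linkage(cluster_a, cluster_b, profiles):
    """Mean pairwise distance between the members of two clusters."""
    distances = [pearson_distance(profiles[g], profiles[h])
                 for g in cluster_a for h in cluster_b]
    return sum(distances) / len(distances)

def hierarchical_clusters(profiles, n_clusters):
    """Repeatedly merge the two closest clusters until n_clusters remain."""
    clusters = [[gene] for gene in profiles]
    while len(clusters) > n_clusters:
        pairs = [(i, j) for i in range(len(clusters))
                 for j in range(i + 1, len(clusters))]
        i, j = min(pairs, key=lambda p: average_linkage(
            clusters[p[0]], clusters[p[1]], profiles))
        clusters[i] = clusters[i] + clusters[j]   # merge cluster j into cluster i
        del clusters[j]                           # j > i, so index i stays valid
    return [sorted(cluster) for cluster in clusters]

# Made-up log2 expression ratios at four time points after a treatment.
profiles = {
    "gene_a": [0.1, 1.0, 2.0, 2.9],      # steadily induced
    "gene_b": [0.0, 0.8, 2.2, 3.1],      # steadily induced
    "gene_c": [0.2, -1.1, -2.0, -3.0],   # steadily repressed
    "gene_d": [-0.1, -0.9, -2.1, -2.8],  # steadily repressed
}

clusters = hierarchical_clusters(profiles, n_clusters=2)
print(clusters)
```

Recording the order of the merges, rather than stopping at a fixed number of clusters, would give the tree that is usually drawn next to the color-coded matrix.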

Microarray analysis of gene expression has limitations that researchers must consider. Induced mRNA levels and induced protein levels do not always correlate well. Translational and post-translational regulatory mechanisms that affect the activity of various cellular proteins are not examined by DNA microarrays, though the emerging field of proteomics is beginning to address this issue. Other limitations of microarray analysis include the impact of alternative splicing during transcript processing and the limited detectability of unstable mRNAs. Differential gene expression results must be confirmed through direct examination of selected genes. These analyses are typically performed by RNA blot or quantitative RT-PCR to examine transcripts of a specific gene, and/or by detection of protein concentration using immunoblots. Additional studies often include alteration of gene function with targeted mutations, antisense technology, or protein inhibition.


Source:  OpenStax, Introduction to bioinformatics. OpenStax CNX. Oct 09, 2007 Download for free at http://cnx.org/content/col10240/1.3