In 1953 Watson and Crick unlocked the structure of the DNA molecule and set into motion the modern study of genetics. This advance allowed our study of life to transcend the wet realm of proteins, cells, organelles, ions, and lipids, and move up into more abstract methods of analysis. By discovering the basic structure of DNA we received our first glimpse into the information-based realm locked inside the genetic code.
The human genome contains 3 billion chemical nucleotide bases (A, C, T and G). About 30,000 genes are estimated to be in the human genome. The human genome has a physical three-dimensional structure. Stretched out, the DNA in a single cell would be about 6 feet (2 meters) long, yet it is packed into the nucleus of our cells, a structure only about 0.0004 inches (roughly 10 micrometers) across, far smaller than the head of a pin. The genome is divided among 24 distinct chromosomes (22 autosomes plus the two sex chromosomes, X and Y), and genes lie on specific chromosomes. Human chromosomes are numbered roughly according to size, with Chromosome 1 being the largest and the Y chromosome among the smallest. Matt Ridley's fascinating book Genome gives a great introduction to our chromosomes and the genes they contain. Chromosome 1 is believed to have 2968 genes, while the Y chromosome has 231 genes. To learn more about chromosomes, visit GeneMap99, a site maintained by the NCBI. Here is a diagrammatic representation of the 24 chromosomes. Here is what a chromosome looks like under an electron microscope.
The average human gene contains about 3000 bases. Sizes of human genes vary greatly. The largest known human gene is dystrophin (a muscle protein) at 2.4 million bases. The smallest genes are a little over a hundred base pairs long. Less than 2% of the genome codes for proteins. Repeated sequences that are not involved in coding for proteins (sometimes called "junk DNA") make up at least 50% of the human genome. These repetitive sequences play an important role in chromosome structure and dynamics. Over time, these repeats are believed to reshape the genome by rearranging it, creating entirely new genes, and modifying and reshuffling genes. Surprisingly, genes are not distributed uniformly through the human genome. Genes appear to be concentrated in sections of the genome with high GC content, with vast areas of non-coding DNA in between. There are long stretches of C and G repeats adjacent to gene-rich areas. These CpG islands are believed to regulate gene activity, and they serve as markers for gene-rich locations on the genome. We do not yet know the function of over 50% of the discovered genes. A great site to learn more about DNA is the DNAi site maintained by the HHMI.
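To make the notions of GC content and CpG-rich regions concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not a method described in this module: the function names are hypothetical, the toy sequence is made up, and the observed/expected CpG ratio is one commonly used heuristic for flagging candidate CpG islands rather than a definitive test.

```python
def gc_content(seq: str) -> float:
    """Fraction of bases in seq that are G or C (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def cpg_obs_exp(seq: str) -> float:
    """Observed/expected CpG ratio: (#CG * length) / (#C * #G).

    Assumption: this ratio is a common heuristic for CpG islands;
    it is not taken from the text above.
    """
    seq = seq.upper()
    c, g = seq.count("C"), seq.count("G")
    if c == 0 or g == 0:
        return 0.0
    return seq.count("CG") * len(seq) / (c * g)


def windowed_gc(seq: str, window: int, step: int) -> list[tuple[int, float]]:
    """GC content of each full window, as (start position, GC fraction) pairs."""
    return [
        (start, gc_content(seq[start:start + window]))
        for start in range(0, len(seq) - window + 1, step)
    ]


if __name__ == "__main__":
    # Made-up toy sequence: an AT-rich stretch followed by a CG-rich stretch.
    toy = "ATATTAATTATA" + "CGCGGCGCCGCG"
    for start, gc in windowed_gc(toy, window=12, step=12):
        region = toy[start:start + 12]
        print(start, round(gc, 2), round(cpg_obs_exp(region), 2))
```

On real genomic data the windows would be hundreds of bases wide, and thresholds on both quantities would be chosen by the analyst; the sketch only shows how the two statistics are computed.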