This again seems extremely complicated, but there’s a very good reason that this complex mechanism has been favoured by evolution. It’s because it enables a cell to use a relatively small number of genes to create a much bigger number of proteins. The way this works is shown in Figure 3.3.
Figure 3.3
The DNA molecule is shown at the very top of this diagram. The exons, which code for stretches of amino acids, are shown in the dark boxes. The introns, which don’t code for amino acid sequences, are represented by the white boxes. When the DNA is first copied into RNA, indicated by the first arrow, the RNA contains both the exons and the introns. The cellular machinery then removes some or all of the introns (the process known as splicing). The final messenger RNA molecules can thereby code for a variety of proteins from the same gene, as represented by the various words shown in the diagram. For simplicity, all the introns and exons have been drawn as the same size, but in reality they can vary widely.
The initial mRNA contains all the exons and all the introns. Then it’s spliced to remove the introns. But during this splicing some of the exons may also be removed. Some exons will be retained in the final mRNA, others will be skipped over. The various proteins that this creates may have quite similar functions, or they may differ dramatically. The cell can express different proteins depending on what that cell has to do at a particular time, or because of different signals that it receives. If we define a gene as something that encodes a protein, this mechanism means that just 20,000 or so genes can code for far more than just 20,000 proteins.
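For readers who like to see the counting spelled out, the exon-skipping idea can be sketched in a few lines of Python. This is only a toy model and rests on a loud simplifying assumption: that every exon can be kept or skipped independently of the others, which real cells do not do. The exon "letters" and the function name `splice_variants` are invented for illustration and are not taken from Figure 3.3.

```python
from itertools import combinations


def splice_variants(exons):
    """List every mRNA that keeps at least one exon, in the original order.

    Toy assumption: each exon is retained or skipped independently.
    Real splicing is regulated and produces far fewer combinations.
    """
    variants = []
    for kept in range(1, len(exons) + 1):
        # combinations() preserves the left-to-right order of the exons,
        # mirroring the fact that splicing does not shuffle them.
        for subset in combinations(exons, kept):
            variants.append("".join(subset))
    return variants


# Hypothetical gene with three exons (introns are assumed already removed).
exons = ["DE", "PART", "ING"]
variants = splice_variants(exons)

for v in variants:
    print(v)
print(len(variants), "possible messages from one gene")
```

Under this simplification a gene with n exons could give up to 2^n − 1 different messages: seven for the three exons here, and over a thousand for a gene with ten exons. Real genes use only a fraction of these possibilities, but the sketch shows why a modest number of genes can yield a much larger number of proteins.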
Whenever we describe the genome we talk about it in very two-dimensional terms, almost like a railway track. Peter Fraser’s laboratory at the Babraham Institute outside Cambridge has published some extraordinary work showing it’s probably nothing like this at all. He works on the genes that code for the proteins required to make haemoglobin, the pigment in red blood cells that carries oxygen all around the body. There are a number of different proteins needed to create the final pigment, and they lie on different chromosomes. Doctor Fraser has shown that in cells that produce large amounts of haemoglobin, these chromosome regions become floppy and loop out like tentacles sticking out of the body of an octopus. These floppy regions mingle together in a small area of the cell nucleus, waving about until they can find each other. By doing this, there is an increased chance that all the proteins needed to create the functional haemoglobin pigment will be expressed together at the same time[18].
Each cell in our body contains 6,000,000,000 base-pairs. About 120,000,000 of these code for proteins. One hundred and twenty million sounds like a lot, but it’s actually only 2 per cent of the total amount. So although we think of proteins as being the most important things our cells produce, about 98 per cent of our genome doesn’t code for protein.
Until recently, the reason that we have so much DNA when so little of it leads to a protein was a complete mystery. In the last ten years we’ve finally started to get a grip on this, and once again it’s connected with regulating gene expression through epigenetic mechanisms. It’s now time to move on to the molecular biology of epigenetics.
Chapter 4. Life As We Know It Now
The important thing in science is not so much to obtain new facts as to discover new ways of thinking about them.
So far this book has focused mainly on outcomes, the things that we can observe that tell us that epigenetic events happen. But every biological phenomenon has a physical basis and that’s what this chapter is about. The epigenetic outcomes we’ve described are all a result of variations in expression of genes. The cells of the retina express a different set of genes from the cells in the bladder, for example. But how do the different cell types switch different sets of genes on or off?