It is often assumed that a gene contains exactly the same nucleotides whether it is active or inactive; that is, a β-globin gene that is activated in a red blood cell precursor has the same nucleotides as the inactive β-globin gene in a fibroblast or retinal cell of the same animal. There is a subtle difference, however. In 1948, R. D. Hotchkiss discovered a “fifth base” in DNA, 5-methylcytosine. In vertebrates, this base is made enzymatically after DNA is replicated. During this step, about 5% of the cytosines in mammalian DNA are converted to 5-methylcytosine (Figure 1A). This conversion can occur only when the cytosine residue is followed by a guanosine; in other words, it can occur only at a CpG sequence. Numerous studies have shown that the degree to which the cytosines of a gene are methylated can control the level of the gene’s transcription. Cytosine methylation appears to be a major mechanism of transcriptional regulation in many phyla, but the amount of DNA methylation varies greatly among species. For instance, the plant Arabidopsis thaliana has among the highest percentages of methylated cytosines at 14%; the mouse has 7.6%, and the bacterium E. coli has 2.3% (Capuano et al. 2014). Interestingly, for years researchers thought that the model organisms Drosophila and C. elegans did not have methylated cytosines, yet recent studies using more sensitive methods have detected low levels of cytosine methylation in both (0.034% in Drosophila and 0.0019–0.0033% in C. elegans; Capuano et al. 2014; Hu et al. 2015). To date, these same high-resolution methods have detected no cytosine methylation in yeast. Why the amount of DNA methylation varies so much among species remains an open question.
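
The CpG rule described above can be illustrated with a minimal Python sketch that lists the positions in a DNA string where a cytosine is immediately followed by a guanine, that is, the sites at which methylation could occur. The function name `find_cpg_sites` and the example sequence are hypothetical and serve only as an illustration; the sketch says nothing about which of these sites are actually methylated in a real cell.

```python
def find_cpg_sites(sequence):
    """Return the 0-based position of every C that is immediately followed by a G."""
    seq = sequence.upper()
    return [i for i in range(len(seq) - 1) if seq[i] == "C" and seq[i + 1] == "G"]


# Hypothetical sequence, chosen only to illustrate the rule.
example = "ATCGGCGTACCG"
sites = find_cpg_sites(example)
total_cytosines = example.count("C")

print("CpG positions:", sites)
print(f"{len(sites)} of {total_cytosines} cytosines lie in a CpG context")
```

In this toy sequence, the cytosine that is followed by another cytosine is not counted, because only a cytosine directly preceding a guanine forms a CpG.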

In vertebrates, the presence of methylated cytosines in a gene’s promoter correlates with the repression of transcription from that gene. In developing human and chick red blood cells, for example, the DNA of the globin gene promoters is almost completely unmethylated, whereas the same promoters are highly methylated in cells that do not produce globins. Moreover, the methylation pattern changes during development (Figure 1B). The cells that produce hemoglobin in the human embryo have unmethylated promoters in the genes encoding the ε-globins (“embryonic globin chains”) of embryonic hemoglobin. These promoters become methylated in the fetal tissue as the genes for fetal-specific γ-globin (rather than the embryonic chains) become activated (van der Ploeg and Flavell 1980; Groudine and Weintraub 1981; Mavilio et al. 1983). Similarly, when fetal globin gives way to adult (β) globin, promoters of the fetal (γ) globin genes become methylated.

Figure 1 Methylation of globin genes in human embryonic blood cells. (A) Structure of 5-methylcytosine. (B) The activity of the human β-globin genes correlates inversely with the methylation of their promoters. (After Mavilio et al. 1983.)