The basic techniques of molecular biology are covered in many introductory biology texts as well as in cell biology and genetics books. However, if you are unfamiliar with these techniques, we hope you will enjoy the introduction provided here. We will outline the fundamental methodologies of cloning, Southern blotting, DNA sequencing, and library construction. If you wish to get detailed protocols for these procedures, please go to any of the following sites:

Nucleic acid hybridization

Most techniques of eukaryotic gene analysis are based on nucleic acid hybridization. This technique involves annealing single-stranded pieces of RNA and DNA to allow complementary strands to form double-stranded hybrids. For example, if DNA is cut into small pieces and each piece dissociated into two single strands and denatured, each strand in the solution should find and reunite with its complementary partner, given sufficient time. The conditions of renaturation must be such that specific binding between complementary strands is maintained while nonspecific matchings are dissociated. This is usually achieved by varying the temperature or the ionic conditions in the solution in which renaturation is taking place (Wetmur and Davidson, 1968). Similarly, RNA synthesized from a particular region of DNA would be expected to bind to the strand from which it was transcribed (Figure 1). Thus, RNA is expected to hybridize specifically with a gene that encodes it. To measure this hybridization, one of the nucleic acid strands (the probe) is usually labeled by the incorporation of radioactive nucleotides. One technical problem that originally plagued nucleic acid hybridization studies was the difficulty of getting enough radioactivity into the RNA molecule. This problem is circumvented by isolating the RNA and making a complementary DNA (cDNA) copy in the presence of radioactive precursors. This can be done in a test tube containing the RNA, a short stretch of DNA (called a primer), radioactive DNA precursors, and the viral enzyme reverse transcriptase. This enzyme is capable of making DNA from an RNA template (Figure 2). Because the DNA is synthesized in vitro, one need not worry about the dilution of the radioactive precursors. Furthermore, the cDNA can hybridize with both the gene that produced the RNA (albeit the other strand) and the RNA itself, making it extremely useful in detecting small amounts of specific RNAs.

Figure 1   Nucleic acid hybridization. (A) If the DNA helix is separated into two strands, the strands should reanneal, given the appropriate ionic conditions and time. (B) Similarly, if DNA is separated into its two strands, RNA should be able to bind to the genes that encode it. If present in sufficiently large amounts compared with the DNA, the RNA will replace one of the DNA strands in this region.
Figure 2   Method for preparing complementary DNA (cDNA). Most mRNA contains a long stretch of adenosine residues (AAAn) at the 3′ end of the message; therefore, investigators anneal a primer consisting of 15 deoxythymidine residues (dT15) to the 3′ end of the message. Reverse transcriptase then transcribes a complementary DNA strand, starting at the dT15 primer. The cDNA can be isolated by raising the pH of the solution, thereby denaturing the double-stranded hybrid and cleaving the RNA.
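The base-pairing logic behind cDNA synthesis and hybridization can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the message sequence is made up, the function names are hypothetical, and "hybridization" is reduced to perfect complementarity, ignoring the temperature and ionic conditions that matter in a real experiment.

```python
# Illustrative sketch only: the sequence is invented, and real hybridization
# depends on temperature and salt, which this toy model ignores.

DNA_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}
RNA_TO_DNA = {"A": "T", "U": "A", "G": "C", "C": "G"}


def reverse_transcribe(mrna: str) -> str:
    """Return the cDNA strand (written 5'->3') complementary to an mRNA (5'->3')."""
    return "".join(RNA_TO_DNA[base] for base in reversed(mrna))


def reverse_complement(dna: str) -> str:
    """Return the complementary DNA strand, written 5'->3'."""
    return "".join(DNA_COMPLEMENT[base] for base in reversed(dna))


def can_hybridize(probe: str, target: str) -> bool:
    """True if the DNA probe is perfectly complementary to a stretch of the DNA target."""
    return reverse_complement(probe) in target


mrna = "AUGGCCUUAGGAAAAAAAA"                 # hypothetical message with a short poly(A) tail
cdna = reverse_transcribe(mrna)              # begins with the oligo(dT) primer region
coding_strand = mrna.replace("U", "T")       # gene strand with the same sequence as the mRNA
template_strand = reverse_complement(coding_strand)  # strand the mRNA was transcribed from

print(cdna)
print(can_hybridize(cdna, coding_strand))    # cDNA pairs with the "other" strand of the gene
print(can_hybridize(cdna, template_strand))  # but not with the strand it resembles
```

In this toy model the cDNA has the same sequence as the template strand of the gene, which is why it pairs with the opposite strand and with the mRNA itself.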

Cloning from genomic DNA

As early as 1904, Theodor Boveri despaired that the techniques of his time might never allow him to study how genes create embryos. A particular type of gene amplification technique was needed:

For it is not cell nuclei, not even individual chromosomes, but certain parts of certain chromosomes from certain cells that must be isolated and collected in enormous quantities for analysis; that would be the precondition for placing the chemist in such a position as would allow him to analyse [the hereditary material] more minutely than the morphologists.

However, since the 1970s, nucleic acid hybridization has enabled developmental biologists to do just what Boveri wanted: to isolate and amplify specific regions of the chromosome. The main technique for isolating and amplifying individual genes is called gene cloning. The first step in this process involves cutting nuclear DNA into discrete pieces by incubating the DNA with a restriction endonuclease (more commonly called a restriction enzyme). These endonucleases are usually bacterial enzymes that recognize specific sequences of DNA and cleave the DNA at these sites (Nathans and Smith, 1975). For example, when human DNA is incubated with the enzyme BamHI (from Bacillus amyloliquefaciens strain H), the DNA is cleaved at every site where the sequence GGATCC occurs. The products are variously sized pieces of DNA, all ending with G on one end and GATCC on the other (Figure 3). These pieces are often called restriction fragments.

Figure 3   The general protocol for cloning DNA, using as an example the insertion of a human DNA sequence into a plasmid with one BamHI-sensitive site.
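The BamHI digestion described above can be illustrated with a short Python sketch. It models only the top strand of an invented sequence, and the function name is hypothetical; the cut position (between the first and second G of GGATCC) follows the fragment ends given in the text.

```python
SITE = "GGATCC"   # BamHI recognition sequence
CUT_OFFSET = 1    # the top strand is cut after the first G: G^GATCC


def bamhi_digest(dna: str) -> list[str]:
    """Cut the top strand of a linear DNA at every BamHI site; return the fragments in order."""
    fragments = []
    start = 0
    pos = dna.find(SITE)
    while pos != -1:
        cut = pos + CUT_OFFSET
        fragments.append(dna[start:cut])
        start = cut
        pos = dna.find(SITE, pos + 1)
    fragments.append(dna[start:])  # whatever remains after the last cut
    return fragments


dna = "AATTGGATCCTTACGCGGATCCAAT"  # made-up sequence containing two BamHI sites
fragments = bamhi_digest(dna)
print(fragments)  # ['AATTG', 'GATCCTTACGCG', 'GATCCAAT']
```

The internal fragment starts with GATCC and ends with G, which is the single-stranded "sticky end" pattern that lets it pair with a vector opened by the same enzyme.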

The next step in gene cloning is to incorporate these restriction fragments into cloning vectors. These vectors are usually circular DNA molecules that replicate in bacterial cells independently of the bacterial chromosome. Either drug-resistant plasmids or specially modified viruses (which are especially useful for cloning large DNA fragments) are used. For instance, a vector can be constructed to have only one BamHI-sensitive site. This vector can be opened by incubating it with that restriction enzyme. After being opened, it can be mixed with the BamHI-fragmented human DNA. In numerous cases, the cut DNA pieces will become incorporated into these vectors (because their ends are complementary to the vector's open ends), and the pieces can be joined covalently by placing them in a solution containing the enzyme DNA ligase. The whole process yields bacterial plasmids that each contain a single piece of human DNA. These are called recombinant plasmids or, usually, recombinant DNA (Cohen et al., 1973; Blattner et al., 1978).

The plasmid illustrated in Figure 3 is pUC18, a cloning vector often used by molecular biologists (Vieira and Messing, 1982). It contains (1) a drug-resistance gene, ApR, which makes the bacterium resistant to ampicillin and allows researchers to select for those bacteria that have incorporated a plasmid; (2) an origin of DNA replication that enables the plasmid to replicate hundreds of times in each bacterium; and (3) a polylinker, a short, artificial stretch of DNA that contains the restriction enzyme sites for several of these endonucleases. The polylinker resides within a lacZ gene that encodes E. coli β-galactosidase. The polylinker is short enough (and has the correct number of base pairs) so that it does not interfere with the enzymatic activity of the β-galactosidase. The cloning procedure begins when the restriction fragments of the nuclear DNA are mixed with the opened pUC18 plasmids and are then ligated shut.
The putative recombinant plasmids made in this manner are then incubated with ampicillin-sensitive E. coli cells that lack a β-galactosidase gene. Even though the bacteria and the plasmids are mixed together under conditions that encourage the bacteria to take in plasmids, not every bacterium incorporates a plasmid. To screen for those bacteria that have incorporated plasmids, the treated E. coli cells are grown on agar containing ampicillin. Only those bacteria that have incorporated a plasmid (with its dominant ampicillin-resistance gene) survive.

But not every plasmid has incorporated a foreign gene, because it is possible for the "sticky ends" of the restriction enzyme site to renature with themselves. To distinguish bacterial colonies that have incorporated foreign DNA from those that have not, the agar also contains a dye called X-gal. This compound is colorless, but when acted upon by β-galactosidase, it forms a blue precipitate.[i] Thus, if a plasmid has not incorporated a restriction fragment into its restriction enzyme site in the polylinker, the β-galactosidase (lacZ) gene is functional, and the resulting β-galactosidase turns the dye blue. The result is the appearance of "blue colonies." However, if the plasmid has taken up a DNA fragment, the β-galactosidase gene is destroyed by the insertion. These bacteria will not turn the dye blue; they produce colorless colonies on the agar. Colorless colonies are then screened for the presence of the particular gene. Cells from each of these colonies are placed on a paper-thin nitrocellulose or nylon filter. When these cells are lysed, their DNA gets stuck on the filter. Next, the DNA strands are separated by heating, and the filter is incubated in a solution containing the radioactive RNA (or its cDNA copy) of the gene one wishes to clone. (In some cases, the sequence of the mRNA or gene is not yet known, and one has to guess the sequence from the amino acid sequence of the protein.) If a plasmid contains that gene, its DNA should be on the filter, and only that DNA should be able to bind the radioactive RNA or cDNA probe. Therefore, only those areas will be radioactive. The radioactivity of these regions is detected by autoradiography. Sensitive X-ray film is placed over the treated paper. The high-energy electrons emitted by the radioactive RNA sensitize the silver grains in the film, causing them to turn dark when the film is developed.
Eventually, a black spot is produced over each colony containing the recombinant plasmid carrying that particular gene (see Figure 3). This colony is then isolated and grown, producing billions of bacteria, each containing hundreds of identical recombinant plasmids.
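The two-step selection described above (ampicillin for plasmid uptake, X-gal for insert presence) amounts to a small decision table. The sketch below is a simplification with hypothetical names; it assumes every insert fully inactivates lacZ, which is the idealized case the text describes.

```python
def colony_phenotype(has_plasmid: bool, has_insert: bool) -> str:
    """Idealized outcome for an E. coli cell plated on agar with ampicillin and X-gal."""
    if not has_plasmid:
        return "no colony"         # no ampicillin-resistance gene, so the cell dies
    if has_insert:
        return "colorless colony"  # insert interrupts lacZ, so X-gal is not cleaved
    return "blue colony"           # intact lacZ makes beta-galactosidase, which cleaves X-gal


for plasmid, insert in [(False, False), (True, False), (True, True)]:
    print(plasmid, insert, "->", colony_phenotype(plasmid, insert))
```

Only the colorless colonies go on to the hybridization screen, because they are the ones expected to carry foreign DNA.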

The recombinant plasmids can be separated from the E. coli chromosome by centrifugation, and incubating the plasmid DNA with BamHI releases the foreign DNA fragment that contains the gene. This fragment can then be separated from the plasmid DNA, so the investigator has micrograms of purified DNA sequences containing a specific gene. Although this procedure sounds very logical and easy, the number of colonies that must be screened is often astronomical. The number of random fragments that must be cloned to obtain the gene we want gets larger with the increasing complexity of the organism's genome.[ii] To detect a particular gene from a mammalian genome, millions of individual clones must be screened.
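To get a feel for why the number of clones grows with genome size, one can use a standard probabilistic estimate that is not given in the text (it is usually attributed to Clarke and Carbon): N = ln(1 − P) / ln(1 − f), where f is the fraction of the genome carried by one clone and P is the desired probability that a given sequence is present. The genome sizes and the average fragment length below are round-number assumptions chosen only for illustration.

```python
import math


def clones_needed(genome_bp: float, insert_bp: float, probability: float = 0.99) -> int:
    """Estimate how many random clones give `probability` of containing a given sequence.

    Uses N = ln(1 - P) / ln(1 - f), with f = insert size / genome size.
    """
    fraction = insert_bp / genome_bp
    return math.ceil(math.log(1 - probability) / math.log(1 - fraction))


# Assumptions: a six-base recognition site occurs about once every 4**6 = 4096 bp;
# a mammalian genome is taken as ~3e9 bp and a bacterial genome as ~4.6e6 bp.
mammal = clones_needed(3e9, 4096)
bacterium = clones_needed(4.6e6, 4096)
print(f"mammalian genome: about {mammal:,} clones")
print(f"bacterial genome: about {bacterium:,} clones")
```

Under these assumptions the mammalian estimate comes out in the millions, in line with the statement above, while a bacterial genome needs only a few thousand clones.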

DNA hybridization: Within and across species

Clones can be screened by any radioactive stretch of nucleotides. Therefore, the genes cloned from one organism can be probed with radioactive cDNAs derived from the mRNAs of another species. One of the most exciting findings of modern developmental biology has been that genes used for specific developmental processes in one organism may be used for similar processes in other organisms. Drosophila has been critical in the discovery of these genes. Starting with Morgan, these genes have been mapped, and in the 1960s, E. B. Lewis confirmed that some of these genes are responsible for the formation of basic body parts. One of these, Antennapedia, is a gene whose protein product is essential for inhibiting head structures from forming in the thorax. If the gene is missing, antennae grow where the legs should be. If the gene is expressed in the head (as it is in a particular mutant), the fly develops an extra set of legs coming out of its eye sockets. Could such a gene exist in vertebrates?

Evidence for such genes in vertebrates came first from DNA blots, sometimes called Southern blots after their inventor, E. M. Southern (1975). DNA from numerous vertebrate and invertebrate organisms was treated with a restriction enzyme, and the resulting DNA fragments were separated on an electrophoresis gel. The mixtures of fragments were placed into slots on one side of a gel, and an electric current was passed through the gel. The negatively charged DNA fragments migrated toward the positive pole, the smaller fragments moving faster than the larger ones.[iii] However, hybridization cannot be done inside a gel; the DNA must be transferred to a flat surface, and this is done by blotting. After denaturing the DNA strands in alkali, investigators returned the gel to a neutral pH, then placed it on wet filter paper atop a plastic support (McGinnis et al., 1984; Holland and Hogan, 1986). Nitrocellulose paper (capable of binding single-stranded DNA) was placed directly over the gel and covered with multiple layers of dry paper towels. The filter paper beneath the gel extended into a trough of high-ionic-strength buffer. The buffer traveled through the gel up through the nitrocellulose filter and into the towels. The DNA was brought up through the gel by this flow of buffer, but it was stopped by the nitrocellulose filter; thus, the DNA was transferred from the gel to the nitrocellulose paper. After baking the DNA fragments onto the nitrocellulose paper (otherwise they would have come off), the DNA fragments were incubated with radioactive cDNA from a portion of the Drosophila Antennapedia gene. An autoradiogram of the nitrocellulose paper showed where the radioactive DNA had found its match. The results from these experiments (Figure 4) showed that even vertebrates (mice, humans, and chicks) have genes that hybridize to these sequences.
This radioactive section of the Antennapedia gene was then used to screen a genomic library of DNA clones derived from the genome of these different species. Investigators found clones containing genes that resemble Antennapedia, and these genes were revealed to be extremely important in the formation of the vertebrate body axis.

Figure 4   Southern blots of various organisms’ DNA using a radioactive probe from the Antennapedia gene of Drosophila melanogaster. Because we do not expect the sequences between such diverged species to be perfectly identical, the stringency of the hybridization is lowered by changing the salt conditions. (Such low-stringency blots across phyla are colloquially referred to as "zoo blots," for obvious reasons.) Autoradiography shows that Drosophila genes contain several portions that are like Antennapedia genes in structure and that many organisms contain several genes that will hybridize to this radioactive gene fragment, suggesting that Antennapedia-like genes exist in these organisms. The numbers beside the blots indicate size of bands, in kilobases. (From McGinnis et al., 1984, courtesy of W. McGinnis.)

DNA sequencing

Sequence data can tell us the structure of the encoded protein and can identify regulatory DNA sequences that certain genes have in common. The simplicity of the Sanger "dideoxy" sequencing technique (Sanger et al., 1977) has made it a standard procedure in many molecular biology laboratories. One starts with the vector carrying the cloned gene and isolates a single strand of the circular DNA. One then anneals a radioactive primer of DNA (about 20 nucleotides long) complementary to the vector DNA immediately 3′ to the cloned gene. (Because these vector sequences are known, oligonucleotide primers can be readily synthesized or purchased commercially.) The primer has a free 3′ end to which more nucleotides can be added. One places the primed DNA and all four deoxyribonucleoside triphosphates into four test tubes. Each of the test tubes contains the polymerizing subunit of DNA polymerase and a different dideoxynucleoside triphosphate: one tube contains dideoxy-G, one tube contains dideoxy-A, and so forth. The structures of the deoxynucleotides and the dideoxynucleotides are shown in Figure 5. Whereas a deoxyribonucleotide has no hydroxyl (OH) group on the 2′ carbon of its sugar, a dideoxyribonucleotide lacks hydroxyl groups on both the 2′ and 3′ carbons. So even though a dideoxyribonucleotide can be bound to a growing chain of DNA by DNA polymerase, it stops the chain's growth because, lacking a 3′ hydroxyl group, no new nucleotide can bind to it. Thus, when the DNA polymerase is synthesizing DNA from the primer, the new DNA will be complementary to the cloned gene. In the tube with dideoxy-A, however, every time the polymerase puts an A into the growing chain, there is a chance that the dideoxy-A will be placed there instead of the deoxy-A. If this happens, the chain stops. Similarly, in the tube with dideoxy-G, the chain has the potential to stop every time a G is inserted.
(The process has been likened to a Greek folk dance in which some small percentage of the potential dancers have one arm in a sling.) Because there are millions of chains being made in each tube, each tube will contain a population of chains, some stopped at the first possible site, some at the last, and some at sites in between. The tube with dideoxy-A, for instance, will contain chains of different discrete lengths, each ending at an A residue. The resulting radioactive DNA fragments are separated by electrophoresis. The result is a "ladder" of fragments wherein each "rung" is a nucleotide sequence of a different length. By reading up the ladders, one obtains the DNA sequence complementary to that of the cloned gene.
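The logic of reading a dideoxy ladder can be sketched as follows. This is a deterministic toy model rather than a simulation of the chemistry: instead of terminating chains at random, it simply lists every length at which a chain could stop in each tube. The template sequence and function names are hypothetical, and chain lengths are counted from the end of the primer.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def synthesize(template_3to5: str) -> str:
    """Return the new strand (5'->3') made by copying a template read 3'->5'."""
    return "".join(COMPLEMENT[base] for base in template_3to5)


def dideoxy_tubes(new_strand: str) -> dict[str, list[int]]:
    """For each dideoxy tube, list the chain lengths that can end at that base."""
    tubes = {base: [] for base in "ACGT"}
    for length, base in enumerate(new_strand, start=1):
        tubes[base].append(length)  # a chain of this length ends in `base`
    return tubes


def read_ladder(tubes: dict[str, list[int]]) -> str:
    """Read the gel from the shortest fragment to the longest, noting each band's lane."""
    bands = sorted((length, base) for base, lengths in tubes.items() for length in lengths)
    return "".join(base for _, base in bands)


template = "TACGGATC"              # hypothetical cloned sequence, written 3'->5'
new_strand = synthesize(template)  # what the polymerase would make from the primer
tubes = dideoxy_tubes(new_strand)
print(tubes)               # {'A': [1, 7], 'C': [4, 5], 'G': [3, 8], 'T': [2, 6]}
print(read_ladder(tubes))  # ATGCCTAG
```

Sorting all bands by length and noting which tube each came from recovers the newly synthesized strand, which is the complement of the cloned template.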

Figure 5   Comparison of deoxynucleotides and dideoxynucleotides. (A) Structures of the two types of nucleotides. The difference is highlighted. (B) The 3′ end of a chain that has been terminated by incorporation of dideoxynucleotide because it has no free 3′ hydroxyl group for further DNA polymerization.

Analyzing mRNA through cDNA libraries

Now we can return to the specificity of mRNA transcription: Can we isolate populations of mRNA that characterize certain cell types and are absent in all others? To find these RNAs, we can "clone" the mRNA from different types of cells and compare them. As shown in Figure 6A, this is done by taking the messenger RNAs from a cell or tissue and converting them into complementary DNA strands. By taking the procedure a step further (with the aid of DNA polymerase and S1 nuclease), we can change this population of single-stranded cDNA into a population of double-stranded cDNA pieces. These strands of DNA can be inserted into plasmids by adding the appropriate "ends" onto them with DNA ligase. Appending a GATCC/G fragment onto the blunt ends of such a DNA piece creates an artificial BamHI restriction cut and enables the piece to be inserted into a virus or plasmid cut with that enzyme (Figure 6B).

Such collections of clones derived from mRNAs are often called libraries. Thus, we can have a 16-day embryonic mouse liver library, representing all the genes active in making embryonic liver proteins. We can also have a Xenopus vegetal oocyte library, representing messages present only in a particular part of that cell. Genes cloned in this manner are very important because they lack introns. When added to bacterial cells, these genes can be transcribed and then translated into the proteins they encode.

Libraries have been extremely useful in studying development, as seen in the efforts of Wessel and co-workers (1989) to detect differences in the RNAs in different parts of the gastrulating sea urchin embryo. To find endoderm-specific mRNAs in sea urchins, Wessel and co-workers prepared a cDNA library from gastrulating embryos. The mRNA of these samples (most of the RNA of eukaryotic cells is ribosomal) was isolated by running the samples through oligo-dT beads that capture the poly(A) tails of the messages (see legend to Figure 2). Then the mRNA population was converted into a cDNA population by using reverse transcriptase (see Figure 6A). By using E. coli polymerase I, the single-stranded cDNA was then made double-stranded. Next, commercially available EcoRI "ends" were ligated onto the double-stranded cDNAs. This made them clonable into vectors that were cut with EcoRI restriction enzyme. The DNA was then mixed with the arms of a genetically modified λ phage (see Figure 6B). This phage is so constructed that when grown in a petri dish, the phages that have incorporated the DNA (and thus destroyed the β-galactosidase gene) produce colorless plaques (Figure 6C). In this way, approximately 4 million recombinant phages were generated, each containing a cDNA representing an mRNA molecule.

The next steps involved screening the recombinant phages. Which ones might represent mRNAs found in endoderm and not in the other cell layers? Wessel and his colleagues isolated the mRNA populations from mesoderm, ectoderm, and endoderm. They then made labeled cDNAs from each of the mRNA populations using radioactive precursors. They now had three collections of radioactive cDNA molecules, each representing the mRNA population from one of the three germ layers.

The recombinant phages representing the mRNAs of the gastrulating sea urchin embryo were grown, and samples of numerous colonies, each containing thousands of phages, were placed on two nitrocellulose filters (Figure 6D). These samples were then placed in alkaline solutions to lyse the phages and make the DNA single-stranded. One of these filter papers was incubated with radioactive cDNA made from the total mRNA of the endoderm; the other paper was incubated with radioactive probes to both mesoderm and ectoderm. The filters were then washed to remove any unhybridized radioactive cDNA, dried, and exposed on X-ray film. If an mRNA were present in the endoderm but not in either the ectoderm or mesoderm, the recombinant DNA made from that message should bind radioactive cDNA from the endoderm but should not find an mRNA anywhere else. As a result, that spot of recombinant DNA from the endoderm should be radioactive (since it bound radioactive cDNA from the endoderm), but the same clone should not be radioactive when exposed to ectodermal or mesodermal mRNA. This was found to be the case. One recombinant phage in particular only bound radioactive cDNA made from endodermal mRNA; hence, it represented an mRNA found in the endoderm and not in the mesoderm or ectoderm. The phage containing this gene can now be grown in large quantities and characterized.

Figure 6   Protocol used to make cDNA libraries. (A) Messenger RNA is isolated and made into cDNA. This cDNA is made double-stranded, and restriction fragment ends are added. (B) The cDNA "genes" can then be inserted into specially modified vectors, in this case bacteriophages. (C) Phages containing the recombinant DNA will lyse E. coli, forming plaques. Biochemical techniques can distinguish plaques of recombinant phages from those that lack the inserted gene. (D) The plaques are transferred to nitrocellulose paper and treated with alkali to lyse the phages and denature the DNA in place. These filters are then incubated in radioactive probes (usually cDNA) from a tissue. For the differential cDNA library screening discussed in the text, the same phage library was screened with radioactive probes from two different tissues, allowing the researchers to look for an mRNA that would be found in one type of tissue but not in the other.
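The differential screening logic described in the text reduces to a set difference: keep the clones that light up on the endoderm filter but not on the mesoderm-plus-ectoderm filter. The clone names below are invented placeholders, not data from Wessel and co-workers, and the function name is hypothetical.

```python
def endoderm_specific(endoderm_positive: set[str], other_positive: set[str]) -> set[str]:
    """Clones that hybridize to endoderm cDNA but not to mesoderm/ectoderm cDNA."""
    return endoderm_positive - other_positive


# Hypothetical autoradiography results: clone IDs that gave a dark spot on each filter.
filter_endoderm = {"clone_01", "clone_07", "clone_12"}
filter_meso_ecto = {"clone_01", "clone_03", "clone_12"}

candidates = endoderm_specific(filter_endoderm, filter_meso_ecto)
print(candidates)  # {'clone_07'}
```

Clones positive on both filters presumably represent messages shared among germ layers, so they are discarded from the search for endoderm-specific mRNAs.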

Northern blotting

We can determine the temporal and spatial locations of RNA expression by “running” an RNA blot, often referred to as a northern blot. An investigator extracts RNA from embryos at different stages of development, or from different organs of the same embryo. The investigator then places these RNA samples side by side at one end of a gel and runs an electric current through the gel. The smaller the RNA, the faster it moves through the gel. Thus, different RNAs are separated by their sizes. This technique is called electrophoresis.[iv]

The separated RNAs are transferred to a nitrocellulose paper or nylon membrane filter. The RNA-containing filter is then incubated in a solution containing a radioactively labeled single-stranded DNA fragment from a particular gene (Figure 7A–D). The probe binds only to those regions of the filter where the target RNA (to which it is complementary) is located. If the mRNA for that gene is present in a sample, the labeled DNA will bind to it and can be detected by autoradiography. X-ray film is placed above the filter and incubated in the dark. The localized radioactivity in the probe reduces the silver in the X-ray film, and grains form. When the film is viewed, black spots appear directly above the places where the radioactive DNA is bound (Figure 7E). Autoradiographs of this type, in which RNAs from several stages or tissues are compared simultaneously, are called developmental northern blots.

Figure 7   Developmental Northern blotting. (A–E) Procedure for Northern blotting. (A) RNA is isolated from various tissues and is separated by size using gel electrophoresis. (B) The gel is then placed on a paper wick, which absorbs an ionic solution from a trough. (C) A filter that traps RNA is placed above the gel, and blotting paper is placed above the filter. Capillary action draws the solution through the gel, trapping the RNA on the filter. (D) The filter is incubated with radioactive single-stranded DNA complementary to the mRNA of interest. (E) After any unbound DNA is washed off, autoradiography localizes the mRNA in the samples that contain it. (F) Drawing of a developmental Northern blot showing the presence of Pax6 mRNA in the eye, brain, and pancreas of a mammalian embryo. (F after Ton et al. 1991.)

Figure 7F shows a developmental northern blot used to investigate the expression of the Pax6 gene in the mammalian embryo. Pax6 is critical for normal eye development; mutations in the Pax6 gene result in small eyes (in heterozygous mice) or no eyes or nose (in mice or humans homozygous for the loss-of-function mutation). The northern blot shows that this gene is expressed in the embryo in the brain, eyes, and pancreas, but in no other tissue.

[i] The dye is 5-bromo-4-chloroindole, and it is blue unless complexed with a molecule such as galactose. The β-galactosidase encoded by the plasmid gene cleaves the galactose off the dye, allowing the dye to achieve its blue conformation.

[ii] Complexity refers to the number of different types of genes within a nucleus. Although millions of clones must be screened, about 100,000 colonies can now be screened on a single plate. Another common way of screening the clones is to use a plasmid that has its restriction enzyme site near a strong bacterial promoter (such as the one for β-galactosidase). The bacteria will transcribe the cDNA and translate it into protein. After the bacterial colonies are lysed on filter paper, the proteins stick to the paper and can be found by antibodies directed against that protein. This is called expression cloning, and the plasmids are referred to as expression vectors.

[iii] Given the same charge-to-mass ratio, smaller fragments obtain a faster velocity than larger ones when propelled by the same energy. This is a function of the kinetic energy equation, E = ½mv². Solving for velocity, we find that it is inversely proportional to the square root of the mass.

[iv] Given the same charge-to-mass ratio, smaller RNA fragments obtain a faster velocity than larger ones when propelled by the same energy (i.e., larger fragments move more slowly than smaller fragments). This property is a function of the kinetic energy equation, E = ½mv². Solving for velocity, we find that velocity is inversely proportional to the square root of the mass.