Topic 2.2 Recombination Mapping and Gene Cloning: Overview

Mutagenesis with either radiation or chemicals induces nucleotide changes at random positions in the genome but leaves no easily detectable mark, such as a transposon or T-DNA insert. There are several ways to map a mutation to a chromosome and ultimately clone the gene that contains the mutation. Web Figure 2.2.A shows a method called map-based cloning. This method takes advantage of small differences in DNA sequence between two closely related varieties of the same species. In Arabidopsis such varieties are called ecotypes. Columbia and Landsberg erecta are two of the ecotypes most commonly used for gene mapping in Arabidopsis. Homozygous genetic loci that differ between two ecotypes or individuals are called polymorphic sequences and can serve as genetic markers. For example, two alleles of a gene can serve as a genetic marker, but even non-genic regions that have small differences between ecotypes can serve as markers along a chromosome. If the locations of many such polymorphic sequences are known, a new mutation can be placed on the genetic map based on tight genetic linkage with one or more of the known markers. Once a map position is established, the region around this position is sequenced in the mutant and compared with the sequence of the wild type. We will first describe the general approach to analyzing a mapping population and then explain the molecular tools needed to determine the molecular identity of each marker.

Map-based cloning involves the screening of a mapping population

Let’s assume a recessive mutation was found in an individual from the ecotype Columbia (Col). The first step in mapping the mutated gene would be to cross this plant to an individual from a different ecotype, in our example Landsberg erecta (Ler), so that the offspring (the F1 generation) would be heterozygous at every polymorphic locus, including the mutation of interest. These offspring would then be allowed to self, giving rise to the F2 generation. Given the recessive nature of the mutation in our example, all individuals of the F1 generation would have a wild-type phenotype, but in the F2 generation one-quarter of the offspring would show the mutant phenotype. Also, because of crossing over during F1 meiosis, the chromosomes in the F2 generation would contain different combinations of Col- and Ler-derived DNA. For example, marker A might be derived from Col and markers B and C from Ler, and so forth (see Web Figure 2.2.A). Keeping in mind that the gene of interest must be on a Col chromosomal segment and that, because the mutation is recessive, the mutant phenotype will be visible only if the segment carrying the gene is homozygous, we can restrict the analysis to plants that show the mutant phenotype; these plants must be homozygous for Col DNA around the gene of interest. The next step in the analysis is to determine the origin (Col or Ler) of markers that might be in proximity to the unknown gene. Since Arabidopsis has five chromosomes, one can start with ten polymorphic markers, one for each chromosome arm, and test which of these markers consistently segregates with the mutant phenotype. In our example we assume the search has already been narrowed to one chromosome. Now we have to see where on that chromosome the gene of interest is located.
If, for example, among several dozen F2 plants showing the mutant phenotype many are not homozygous for the Col allele of marker A, we can assume that the mutation is not closely linked to marker A, and we can proceed to test the same plants for marker B. At some point we would expect to find linkage between a marker and the mutant phenotype. Linkage would be confirmed by determining that most individuals with the mutant phenotype are homozygous for the Col allele of a given marker. This means that the gene mutated in our original Col parent must be close to this marker. Starting with one marker per chromosome arm (ten in all for the five Arabidopsis chromosomes), we could quickly screen individuals from the mapping population for linkage between the mutant phenotype and one of these markers. In practice the best way to find a closely linked marker is to exclude any marker that is not closely linked to the mutant locus. In our example in Web Figure 2.2.A, only individuals 2 through 4 would be analyzed (because individual 1 is a heterozygote and cannot be phenotypically mutant). Individual 2 would help us exclude the region around marker A, individual 3 would exclude an even larger area anywhere north of (above) marker B, and individual 4 would exclude the region south of (below) marker C. So the only region not excluded in this example is the region between markers B and C.
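The linkage test described above can be sketched in a few lines of Python. The genotype calls below are invented for illustration and are not taken from Web Figure 2.2.A, and the function name `ler_allele_fraction` is hypothetical. For each marker, the sketch counts the fraction of Ler alleles among the chromosomes of phenotypically mutant F2 plants. An unlinked marker should give a value near 0.5, whereas a marker close to the mutation should give a value near 0, because only recombinant chromosomes carry a Ler allele next to the Col-derived mutation.

```python
# Genotype calls for phenotypically mutant F2 plants (hypothetical data).
# "C" = homozygous Col, "H" = heterozygous, "L" = homozygous Ler.
LER_ALLELES_PER_CALL = {"C": 0, "H": 1, "L": 2}


def ler_allele_fraction(calls):
    """Return the fraction of Ler alleles among all chromosomes scored."""
    total_chromosomes = 2 * len(calls)  # each diploid plant has two copies
    ler_alleles = sum(LER_ALLELES_PER_CALL[call] for call in calls)
    return ler_alleles / total_chromosomes


# Eight mutant plants scored at three markers (assumed values).
mutant_f2_calls = {
    "A": ["H", "L", "C", "H", "H", "L", "C", "H"],  # behaves as unlinked
    "B": ["C", "C", "H", "C", "C", "C", "C", "C"],  # one recombinant chromosome
    "C": ["C", "C", "C", "C", "H", "C", "C", "C"],  # one recombinant chromosome
}

fractions = {
    marker: ler_allele_fraction(calls)
    for marker, calls in mutant_f2_calls.items()
}

for marker, fraction in fractions.items():
    print(f"marker {marker}: Ler allele fraction = {fraction:.4f}")
```

With these assumed calls, marker A would be set aside as unlinked, while markers B and C would both be kept as linked. The few plants that are heterozygous at B or C are the informative recombinants that bound the candidate region.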

Markers can be distinguished by using molecular fingerprinting analysis

Above you learned how to set up a mapping population and analyze the data it yields, but how do you actually screen each individual for the molecular identity of its markers? First of all, remember that the mutant allele can only be on a chromosome segment that originally came from the mutagenized Col parent and that the goal of mapping is to find a marker that is physically close to the mutant gene and segregates with it. In plants with the mutant phenotype, such a marker would have to be of the Col type on both chromosomes and could not be of the Ler type. To find Col markers that co-segregate with the mutation, DNA is extracted from F2 individuals with the mutant phenotype and from the parents (Col and Ler). An easy way to detect a polymorphism between Col and Ler is to first amplify the DNA by PCR for a specific marker, here marker A, and then use a restriction enzyme that discriminates between Col and Ler at this marker by cutting only one of the two PCR products. This technique is called cleaved amplified polymorphic sequence analysis, or CAPS analysis for short. In our example, cleavage results in two smaller bands in Col-derived DNA but in only one, larger band in DNA from Ler. This is because the enzyme used for the CAPS analysis in this example recognizes a sequence that is present in Col marker A but not in Ler marker A. If an F2 plant is homozygous for Col marker A, it should show only the two smaller bands. Likewise, if the plant is homozygous for Ler marker A, it should show only one, larger band. However, if the plant is heterozygous at marker A, we should expect to see all three bands, since both chromosomes are represented in this plant at the locus for marker A.
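The band-pattern logic for marker A can be written down as a small decision rule. The sketch below is only an illustration: the fragment sizes (a 500 bp PCR product cut into 300 bp and 200 bp fragments in Col) are assumed values, not sizes given in the text, and the function name `call_caps_genotype` is hypothetical. It also assumes that band sizes read from the gel have already been rounded to the expected sizes.

```python
def call_caps_genotype(observed_bands, uncut_size, cut_sizes):
    """Call the genotype at a CAPS marker where only the Col product is cut.

    observed_bands: band sizes (bp) seen on the gel for one plant
    uncut_size:     size of the uncut PCR product (the Ler pattern)
    cut_sizes:      sizes of the two cleavage fragments (the Col pattern)
    """
    bands = set(observed_bands)
    has_uncut_band = uncut_size in bands
    has_cut_bands = set(cut_sizes) <= bands  # both small fragments present

    if has_cut_bands and has_uncut_band:
        return "heterozygous"  # all three bands
    if has_cut_bands:
        return "Col/Col"       # only the two smaller bands
    if has_uncut_band:
        return "Ler/Ler"       # only the one larger band
    return "no call"           # unexpected pattern; repeat the assay


# Assumed sizes for marker A (hypothetical).
UNCUT = 500
CUT = (300, 200)

calls = {
    "plant 1": call_caps_genotype([300, 200], UNCUT, CUT),
    "plant 2": call_caps_genotype([500], UNCUT, CUT),
    "plant 3": call_caps_genotype([500, 300, 200], UNCUT, CUT),
}

for plant, genotype in calls.items():
    print(plant, genotype)
```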

Fine-mapping eventually can lead to the molecular identity of the mutation

If we wanted to get closer to the gene of interest, we would have to use more markers that are clustered around the candidate area (between markers B and C in our example). We would then determine close linkage to one or two of these markers, until linkage was close enough to sequence the area between two markers and identify the mutation molecularly. Using the whole-genome sequence information that is available online at The Arabidopsis Information Resource (TAIR), we could then check the wild-type sequence for Col in the DNA segment to which we just mapped the mutation. Then all we would have to do is compare the sequence of the same segment of DNA from the mutant with that of the wild type and check whether there is indeed a difference in the DNA sequence that could explain the phenotypic change. If there were a difference, we would still need to show that the wild-type DNA can restore the normal phenotype in the mutant plant. To do this, we would transform the mutant with the wild-type sequence and observe the phenotype of the next generation. If the phenotype was restored to wild type, we would have good evidence that the mutant phenotype was in fact due to the disrupted gene that we mapped.
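The sequence comparison step can be illustrated with a minimal sketch. The two short sequences below are invented and do not come from any real Arabidopsis gene, and the function name `find_substitutions` is hypothetical. The sketch assumes the two sequences are already aligned and of equal length, so it detects only single-base substitutions; insertions or deletions would require a proper alignment tool.

```python
def find_substitutions(wild_type, mutant):
    """List single-base differences between two aligned sequences.

    Returns tuples of (1-based position, wild-type base, mutant base).
    """
    if len(wild_type) != len(mutant):
        raise ValueError("sequences must be aligned and of equal length")
    return [
        (position, wt_base, mut_base)
        for position, (wt_base, mut_base) in enumerate(
            zip(wild_type, mutant), start=1
        )
        if wt_base != mut_base
    ]


# Hypothetical stretch of the mapped interval in the wild type and the mutant.
wild_type_segment = "ATGGCTGAAGGTCTT"
mutant_segment = "ATGGCTAAAGGTCTT"

differences = find_substitutions(wild_type_segment, mutant_segment)
for position, wt_base, mut_base in differences:
    print(f"position {position}: {wt_base} -> {mut_base}")
```

A difference found this way is only a candidate. As described above, the complementation test with wild-type DNA is still needed to show that it causes the phenotype.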

Web Figure 2.2.A Map-based cloning approach in Arabidopsis (see text for details).