Chapter 6 Answers to self-check questions

Molecular analysis and interpreting molecular data

6.1 What are the similarities and distinctions between the genomic analysis pathways for aCGH and GWAS?

Both aCGH and GWAS are methods used to detect structural abnormalities across the genome. Both apply statistical software programs in a stepwise fashion to analyse numerical values representing changes across the genome. The main distinction is that aCGH is a low-resolution technique used to detect large structural abnormalities, such as gains and losses across a chromosome, whereas GWAS is higher resolution and uses SNPs to detect more focussed areas exhibiting structural abnormalities such as copy number variation. In terms of bioinformatics analysis, aCGH calculates a log2 ratio to assess chromosomal gain or loss, whereas GWAS uses variant calling software to detect SNPs and uses odds ratios to identify areas of the genome that show significant change based on the distribution of SNPs.
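The two summary statistics named in this answer can be illustrated with a minimal Python sketch. The function names, the probe intensities and the allele counts below are all hypothetical, chosen only to show the arithmetic; real pipelines add normalisation, segmentation and significance testing on top of these values.

```python
import math

def log2_ratio(test_intensity, reference_intensity):
    """Log2 ratio of test to reference signal for a single aCGH probe.

    Values above 0 suggest gain, values below 0 suggest loss.
    """
    return math.log2(test_intensity / reference_intensity)

def odds_ratio(case_alt, case_ref, control_alt, control_ref):
    """Odds ratio for one SNP from a 2x2 table of allele counts.

    Assumes none of the four counts is zero.
    """
    return (case_alt * control_ref) / (case_ref * control_alt)

# Hypothetical probe intensities (test sample, reference sample)
print(log2_ratio(1500, 1000))  # positive: possible gain
print(log2_ratio(500, 1000))   # -1.0: possible loss

# Hypothetical allele counts: cases 60 alt / 40 ref, controls 30 alt / 70 ref
print(odds_ratio(60, 40, 30, 70))  # 3.5
```

An odds ratio of 1 would indicate no association between the allele and case status; how far from 1 counts as significant depends on the statistical test and the multiple-testing correction applied.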

6.2 What does the FASTQ file serve as and what type of information can be found in this file?

The FASTQ file is the standard output file obtained from an NGS run. It contains the sequence data and a quality score for each base call.
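As an illustration of what a FASTQ record holds, the sketch below splits one four-line record into its read identifier, sequence and per-base quality scores. The function name and the example read are hypothetical, and the code assumes the widely used Phred+33 quality encoding; older instruments may use a different offset.

```python
def parse_fastq_record(lines):
    """Split one four-line FASTQ record into identifier, sequence and Phred scores."""
    header, sequence, separator, quality = [line.strip() for line in lines]
    if not header.startswith("@") or not separator.startswith("+"):
        raise ValueError("not a valid FASTQ record")
    if len(sequence) != len(quality):
        raise ValueError("sequence and quality strings differ in length")
    # Assumption: Phred+33 encoding, so the score is the ASCII code minus 33
    phred_scores = [ord(character) - 33 for character in quality]
    return header[1:], sequence, phred_scores

# Hypothetical record: identifier line, bases, separator, quality string
record = ["@read_1", "GATTACA", "+", "IIIH#5I"]
read_id, bases, scores = parse_fastq_record(record)
print(read_id)  # read_1
print(bases)    # GATTACA
print(scores)   # [40, 40, 40, 39, 2, 20, 40]
```

A Phred score Q corresponds to an estimated base-calling error probability of 10^(-Q/10), so the fifth base in this made-up read (score 2) would be considered unreliable.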

6.3 What are the key differences in bioinformatics between single and dual coloured gene expression microarray analysis?

The main difference in bioinformatics analysis between the two types of gene expression microarray lies in normalisation. In dual colour microarrays both the control and experimental samples are present on the same spot, so LOESS normalisation needs to be carried out before gene expression is calculated from the difference in colour between the control and the experimental samples. In single colour microarrays, however, the control and experimental samples are run on separate arrays and therefore two types of normalisation need to be carried out: firstly, inter-array normalisation, which consists of scaling using control probes on the array, and secondly, intra-array normalisation, which consists of normalisation using either median or mean centred algorithms.
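The median centred approach mentioned above can be sketched in a few lines of Python. This is a simplified illustration only: the function name and intensity values are hypothetical, the data are log2-transformed first (a common but assumed choice), and LOESS normalisation and control-probe scaling are not shown.

```python
import math
from statistics import median

def median_centre(intensities):
    """Log2-transform raw intensities and subtract the array's median."""
    log_values = [math.log2(value) for value in intensities]
    centre = median(log_values)
    return [value - centre for value in log_values]

# Hypothetical raw intensities for the same three probes on two arrays;
# the second array is uniformly four times brighter than the first
array_a = [128, 256, 512]
array_b = [512, 1024, 2048]

print(median_centre(array_a))  # [-1.0, 0.0, 1.0]
print(median_centre(array_b))  # [-1.0, 0.0, 1.0]
```

After centring, the overall brightness difference between the two arrays is removed and the probes can be compared on the same scale.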

6.4 Why is it better to use RNA sequencing rather than gene expression microarrays to detect fusion transcripts?

RNA sequencing provides information on the full-length RNA sequence and is thus better suited to detecting fusion transcripts as well as translocations. By contrast, a gene expression microarray is limited to the probes present on the array, and if no probes map to the location of the breakpoint then the microarray will not detect the fusion. In addition, RNA sequencing has higher sensitivity than gene expression microarrays and is therefore better suited to detecting low copy translocations.

6.5 What is the main difference between targeted and whole genome bisulphite sequencing methylation?

Targeted methylation sequencing provides higher coverage but at focussed locations across CpG islands, whereas whole genome bisulphite sequencing provides uneven coverage across the whole methylome. However, since the whole methylome output shows the CpG island structure across the methylome, various mathematical algorithms can be used to capture the landscape even with low, uneven coverage.

6.6 In a break apart assay what type of in situ hybridisation signal would indicate the presence of gene fusion in an intact nucleus?

When gene fusion has occurred, two spatially distinct hybridisation signals of different colours will be observed in the nucleus. Accordingly, if the fluorescent probes used in the assay were red and green, these two colours will be seen as separate signals. When gene fusion has not occurred, the hybridised probes will be very close together and may appear as a single yellow dot in the nucleus.

6.7 What would be the H score, see Table 6.1, for a population of 100 cancer cells exhibiting the following staining for protein X?

50 cells were unstained, 20 cells were stained at weak intensity, 20 cells were stained at moderate intensity and 10 cells were stained at strong intensity.

The cumulative score is:

For 50 unstained cells, 50 x 0 = 0, plus

For 20 cells stained at weak intensity, 20 x 1 = 20, plus

For 20 cells stained at moderate intensity, 20 x 2 = 40, plus

For 10 cells stained at strong intensity, 10 x 3 = 30

Giving a final H score of 90.
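The same calculation can be written as a short Python sketch. The function name is hypothetical, and the code assumes the usual H score weighting of 0, 1, 2 and 3 for unstained, weak, moderate and strong staining, applied to the percentage of cells in each category.

```python
def h_score(unstained, weak, moderate, strong):
    """H score from cell counts per staining intensity (range 0 to 300)."""
    total = unstained + weak + moderate + strong
    if total == 0:
        raise ValueError("no cells counted")
    # Convert counts to percentages so the score does not depend on how
    # many cells were counted
    weak_pct = 100 * weak / total
    moderate_pct = 100 * moderate / total
    strong_pct = 100 * strong / total
    # Unstained cells carry a weight of 0 and so add nothing to the score
    return 1 * weak_pct + 2 * moderate_pct + 3 * strong_pct

print(h_score(unstained=50, weak=20, moderate=20, strong=10))  # 90.0
```

Because the population in this question is exactly 100 cells, the cell counts and the percentages are the same numbers, which is why the worked answer can multiply the counts directly.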

6.8 In what circumstances would automated image analysis be of use in the diagnosis of cancer?

Image analysis could contribute to the diagnosis of cancer by automatically calculating gene copy number, translocations and fusions at the DNA level in intact nuclei. It could also provide a quicker and more accurate assessment than semi-quantitative methods of gene expression at the RNA and protein level in cancer cells.