Quantitative Analysis: Inferential Statistics and Multivariate Analysis

Short Answer Questions

1. Explain the characteristics of a normal distribution.

All normal curves, and the normal distributions (that is, the frequency distributions) associated with them, share a number of characteristics: 1) The normal curve is bilaterally symmetrical; its shape is identical to the left and right of the mean, which is the highest point on the curve. 2) As a consequence, the mode, the median and the mean of the normal curve are all equal. 3) The tails of the normal curve are asymptotic, which means that they approach but never quite meet the horizontal axis. Because the tails stretch to infinity, any value, however extreme, can be placed under a normal curve. 4) The total area under the normal curve is equal to 1. 5) Almost all cases fall within three standard deviations of the mean: 68.3 per cent fall within ±1 standard deviation of the mean, 95.4 per cent fall within ±2 standard deviations, and 99.7 per cent fall within ±3 standard deviations. (This is referred to as the 68–95–99.7 rule, the empirical rule, or the three-sigma rule; the last name reflects the fact that sigma (σ) is the symbol used to denote the standard deviation.)
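The percentages in the empirical rule can be checked numerically. The sketch below is an illustration only, not part of the original answer. It uses the standard identity that the share of a normal distribution within k standard deviations of the mean equals erf(k/√2); the function name `share_within` is a hypothetical helper.

```python
import math

def share_within(k: float) -> float:
    """Proportion of a normal distribution within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(f"within ±{k} SD: {share_within(k) * 100:.1f} per cent")
# within ±1 SD: 68.3 per cent
# within ±2 SD: 95.4 per cent
# within ±3 SD: 99.7 per cent
```

The result does not depend on the mean or standard deviation of the particular normal curve, which is why the rule holds for all of them.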

2. Explain how researchers choose a confidence level for their research.

The choice of confidence level is by no means clear-cut and depends, in part, on the quality of the data and the consequences of making a mistake. When researchers work in an experimental setting with precise and tightly controlled measurements of the dependent and independent variables, they are inclined to adopt a more stringent test. They want compelling evidence before rejecting the null hypothesis and may therefore employ the 99 per cent confidence level or even the 99.9 per cent level, refusing in the latter case to reject the null hypothesis unless the chances of being wrong are less than one in a thousand. Political scientists, however, often work with less controlled data and confront greater measurement noise and subject variability, particularly in survey research. As a consequence, they often adopt a less rigorous test and reject the null hypothesis at the 95 per cent confidence level. Sample size also influences the confidence level chosen: researchers with large samples generally employ more stringent significance tests than do researchers with smaller samples. Finally, exploratory studies, in which the researcher is looking for suggestive findings rather than ironclad results, may employ lower levels of confidence, perhaps even the 90 per cent confidence level.
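To make the idea of a more or less stringent test concrete, the following sketch converts each confidence level mentioned above into the cut-off a test statistic must exceed. It assumes a two-tailed test based on the standard normal distribution, which is an assumption made for illustration; `critical_z` is a hypothetical helper name.

```python
from statistics import NormalDist

def critical_z(confidence: float) -> float:
    """Two-tailed critical z value for a confidence level such as 0.95."""
    alpha = 1 - confidence  # accepted chance of wrongly rejecting the null hypothesis
    return NormalDist().inv_cdf(1 - alpha / 2)

for level in (0.90, 0.95, 0.99, 0.999):
    print(f"{level:.1%} confidence: alpha = {1 - level:.3f}, "
          f"|z| must exceed {critical_z(level):.3f}")
# 90.0% confidence: alpha = 0.100, |z| must exceed 1.645
# 95.0% confidence: alpha = 0.050, |z| must exceed 1.960
# 99.0% confidence: alpha = 0.010, |z| must exceed 2.576
# 99.9% confidence: alpha = 0.001, |z| must exceed 3.291
```

The higher the confidence level, the larger the test statistic needed before the null hypothesis can be rejected.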

3. Explain how statistical significance helps with hypothesis testing.

Hypothesis testing begins with a null hypothesis, which states that no relationship exists between the variables in the population. Because researchers usually observe a sample rather than the whole population, a relationship found in the sample may simply reflect sampling error. A test of statistical significance estimates the probability of observing a relationship at least as strong as the one in the sample if the null hypothesis were true. When that probability falls below the threshold set by the chosen confidence level (for example, less than 5 per cent at the 95 per cent confidence level), the researcher rejects the null hypothesis and infers that the relationship is likely to exist in the population; otherwise, the null hypothesis is not rejected. Statistical significance therefore gives researchers a consistent decision rule for generalizing from a sample to a population, although it says nothing about the strength or substantive importance of the relationship.
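As an illustration of how a significance test feeds into the decision about the null hypothesis, here is a minimal sketch of a two-tailed z-test for a sample mean. It assumes the population standard deviation is known, and all of the numbers are hypothetical; `two_tailed_p` is an illustrative helper, not a method taken from the text.

```python
from math import sqrt
from statistics import NormalDist

def two_tailed_p(sample_mean: float, null_mean: float, sigma: float, n: int) -> float:
    """Chance of a sample mean at least this far from null_mean if the null were true."""
    standard_error = sigma / sqrt(n)
    z = (sample_mean - null_mean) / standard_error
    return 2 * (1 - NormalDist().cdf(abs(z)))

# Hypothetical data: the null hypothesis says the population mean is 50.
p = two_tailed_p(sample_mean=52.0, null_mean=50.0, sigma=10.0, n=100)
alpha = 0.05  # corresponds to the 95 per cent confidence level
print(f"p = {p:.4f}; reject null: {p < alpha}")
# p = 0.0455; reject null: True
```

With these numbers the null hypothesis is rejected at the 95 per cent level but would not be rejected at the 99 per cent level, since 0.0455 is larger than 0.01.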

4. Describe the difference between Type I and Type II errors in inferential statistics.

When making inferences about the population based on relationships observed in sample data, we run the risk of making two types of errors, and by minimizing the probability of one we increase the probability of the other. A Type I error occurs when we infer that a relationship found in the sample exists in the population when in fact it does not (a false positive). A Type II error occurs when we do not find a relationship in the sample data and infer that there is no relationship in the population when in fact there is one (a false negative).
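The trade-off between the two errors can be sketched numerically. The example below assumes a one-tailed z-test with sample size and true effect held fixed; the effect of 2.5 standard errors is a hypothetical value chosen for illustration, and `type_ii_rate` is a hypothetical helper name.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()

def type_ii_rate(alpha: float, shift: float) -> float:
    """Probability of a Type II error in a one-tailed z-test.

    alpha: accepted probability of a Type I error.
    shift: size of the true effect, measured in standard errors (hypothetical input).
    """
    z_crit = STD_NORMAL.inv_cdf(1 - alpha)  # cut-off the test statistic must exceed
    return STD_NORMAL.cdf(z_crit - shift)   # chance the statistic falls short of it

for alpha in (0.10, 0.05, 0.01, 0.001):
    beta = type_ii_rate(alpha, shift=2.5)
    print(f"Type I rate = {alpha:<5}  Type II rate = {beta:.3f}")
```

Each step down in the accepted Type I rate raises the Type II rate, which is the trade-off described above. Under these assumptions, the only way to lower both at once is to change something else, such as collecting a larger sample.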

5. Explain some of the assumptions of multiple regression analysis.

One of the key assumptions of multiple regression is that the independent variables are not strongly correlated with one another. When this assumption is violated, the analysis suffers from multicollinearity and the coefficients become less robust (reliable), because the model cannot cleanly separate the effect of one independent variable from that of another. A second assumption is that the errors around the regression line have a constant variance across the values of the independent variables. When this assumption holds, the error is referred to as homoskedastic; when it is violated, the error is called heteroskedastic. Heteroskedasticity likewise makes the regression results less reliable, most directly the standard errors used in significance tests.
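One common way to screen for multicollinearity is the variance inflation factor (VIF), which is not mentioned in the answer above and is offered here only as an illustration. For a model with exactly two independent variables, the VIF reduces to 1 / (1 - r²), where r is the correlation between them. The data and variable names below are hypothetical.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def vif_two_predictors(x1, x2):
    """Variance inflation factor for a model with exactly two independent variables."""
    r = pearson_r(x1, x2)
    return 1 / (1 - r ** 2)

# Hypothetical survey data for eight respondents.
years_education = [8, 10, 12, 12, 14, 16, 16, 18]
income_thousands = [21, 27, 33, 35, 40, 47, 49, 55]  # rises almost in step with education
age = [34, 61, 25, 47, 52, 29, 66, 40]               # largely unrelated to education

print(f"education with income: VIF = {vif_two_predictors(years_education, income_thousands):.1f}")
print(f"education with age:    VIF = {vif_two_predictors(years_education, age):.2f}")
```

A VIF close to 1 suggests the two variables carry separate information, while a large value signals that they overlap heavily. A frequently cited rule of thumb treats values above 10 as a warning sign, though the cut-off is a convention rather than a firm rule.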