Chapter 13: Multiple-choice questions

Quiz Content

1. Download the data file for the multiple-choice questions for Chapter 13, and open it in R. Conduct k-means clustering with 3 clusters on the data stored in clusterdata. How many observations are in the largest cluster?
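As a rough sketch of the workflow, the following runs k-means with three clusters and tabulates the cluster sizes. The data here are a randomly generated stand-in (the hypothetical object `clusterdata_demo`), because the chapter's data file is not reproduced on this page; substitute the real `clusterdata` object once it is loaded. The seed and the `nstart` value are illustrative choices, not part of the question.

```r
set.seed(42)  # k-means starts from random centres, so fix the seed for repeatability

# Hypothetical stand-in data; replace with clusterdata from the chapter's data file.
clusterdata_demo <- matrix(rnorm(60), ncol = 2)

# k-means with 3 clusters; nstart reruns the algorithm from several random
# starts and keeps the solution with the smallest within-cluster sum of squares.
fit3 <- kmeans(clusterdata_demo, centers = 3, nstart = 25)

# Number of observations assigned to each cluster, and the largest cluster size.
table(fit3$cluster)
max(fit3$size)
```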

2. What percentage of the variance is explained by the three-cluster solution?
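One common reading of "variance explained" for k-means is the between-cluster sum of squares divided by the total sum of squares, which is also the ratio that R prints when a `kmeans` object is displayed. A minimal sketch under that assumption, again using randomly generated stand-in data (`demo` is hypothetical) rather than the real `clusterdata`:

```r
set.seed(42)

# Hypothetical stand-in data; replace with clusterdata from the chapter's data file.
demo <- matrix(rnorm(60), ncol = 2)

fit3 <- kmeans(demo, centers = 3, nstart = 25)

# Between-cluster sum of squares as a percentage of the total sum of squares.
pct_explained <- 100 * fit3$betweenss / fit3$totss
round(pct_explained, 1)
```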

3. Try running clustering solutions with 2, 4, 5 and 6 clusters. Which one explains the greatest proportion of the variance?
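Rather than fitting each solution by hand, the comparison can be sketched as a loop over the candidate numbers of clusters. This assumes the same between/total sum-of-squares ratio as the measure of variance explained, and uses hypothetical stand-in data (`demo`) in place of `clusterdata`.

```r
set.seed(42)

# Hypothetical stand-in data; replace with clusterdata from the chapter's data file.
demo <- matrix(rnorm(60), ncol = 2)

ks <- 2:6

# Fit one k-means solution per value of k and record the percentage of the
# total sum of squares that lies between clusters.
pct_by_k <- sapply(ks, function(k) {
  fit <- kmeans(demo, centers = k, nstart = 25)
  100 * fit$betweenss / fit$totss
})
names(pct_by_k) <- ks

round(pct_by_k, 1)
names(which.max(pct_by_k))  # the k with the highest percentage
```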

4. Consider the following cluster solutions, with between 2 and 6 clusters. The values are residual sums of squares (RSS):
Clusters (k):    2     3     4     5     6
RSS:           138   120    99    86    85
Calculate the AIC score for each solution using the expression:
AIC <- 2*k + n*log(RSS)
where k is the number of clusters and n is the sample size (20 data points).
Which solution has the lowest AIC score?
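The expression can be applied to all five solutions at once, since R arithmetic is vectorised. The sketch below uses only the values given in the question; the variable names `rss` and `aic` are illustrative. Note that `log()` in R is the natural logarithm.

```r
k   <- 2:6                        # number of clusters in each solution
rss <- c(138, 120, 99, 86, 85)    # residual sums of squares from the question
n   <- 20                         # sample size

# AIC as defined in the question; log() is the natural logarithm.
aic <- 2 * k + n * log(rss)
names(aic) <- k

round(aic, 2)
names(which.min(aic))  # the k with the lowest AIC
```

The `2 * k` term penalises each extra cluster, so a solution with a slightly smaller RSS does not automatically score better.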

5. Calculate the distance matrix for the data in clusterdata. What is the largest distance between any two points?
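A minimal sketch of this step, assuming Euclidean distance (the default for `dist()`; the question does not name a metric) and using hypothetical stand-in data (`demo`) in place of `clusterdata`:

```r
set.seed(42)

# Hypothetical stand-in data; replace with clusterdata from the chapter's data file.
demo <- matrix(rnorm(60), ncol = 2)

# Pairwise distances between rows; dist() uses Euclidean distance by default.
d <- dist(demo)

max(d)  # largest distance between any two points

# Row indices of the pair (listed in both orders) that are furthest apart.
which(as.matrix(d) == max(d), arr.ind = TRUE)
```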

6. The dataset swiss is part of the datasets package and contains demographic information for 47 Swiss provinces. Load it into the environment with data(swiss). The fifth column contains information about the percentage of Catholics in each province. How many provinces had more than 50% Catholic citizens?
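Counting rows that satisfy a condition can be done by summing a logical vector, because `TRUE` counts as 1. A short sketch; the name `majority_catholic` is illustrative.

```r
data(swiss)

names(swiss)[5]  # confirm which variable sits in column 5

# TRUE for provinces where the value in column 5 exceeds 50.
majority_catholic <- swiss[, 5] > 50

sum(majority_catholic)  # number of TRUE values
```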

7. Calculate a distance matrix on a subset of the swiss data that excludes column 5. What is the smallest distance in the matrix?
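A negative column index drops that column. The sketch below assumes Euclidean distance on the unscaled variables, since the question does not ask for any rescaling; `swiss_sub` and `d_swiss` are illustrative names.

```r
data(swiss)

swiss_sub <- swiss[, -5]    # all columns except the fifth
d_swiss   <- dist(swiss_sub)

# Take the minimum of the dist object itself. Converting to a full matrix
# first would add the zero diagonal (each province's distance to itself).
min(d_swiss)
```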

8. Conduct metric multidimensional scaling on the distance matrix you calculated in Question 7, to reduce it to two dimensions. Plot the data, representing the data for majority Catholic provinces in a different colour from majority non-Catholic provinces. Which of the following best describes the results?
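One way to sketch this uses `cmdscale()` from base R for the metric scaling. The colours, plotting symbol and axis labels are arbitrary choices, and "majority Catholic" is taken here to mean a value above 50 in column 5.

```r
data(swiss)

d_swiss <- dist(swiss[, -5])

# Metric (classical) multidimensional scaling to two dimensions.
# The result is a matrix with one row per province and two columns.
mds <- cmdscale(d_swiss, k = 2)

# One colour per province, depending on the value in column 5.
cols <- ifelse(swiss[, 5] > 50, "red", "blue")

plot(mds, col = cols, pch = 19,
     xlab = "Dimension 1", ylab = "Dimension 2")
legend("topright", pch = 19, col = c("red", "blue"),
       legend = c("Majority Catholic", "Majority non-Catholic"))
```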

9. Now conduct k-means clustering on the subset of Swiss data excluding the religion column, to find two clusters. Plot the scaled data again, but this time colour code according to cluster. What do you notice?
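A possible sketch: fit k-means on the five remaining columns, then reuse the two-dimensional coordinates from metric scaling for the plot and colour each point by its cluster label. The seed, colours and the cross-tabulation at the end are illustrative additions, not part of the question.

```r
set.seed(42)  # k-means starts from random centres
data(swiss)

swiss_sub <- swiss[, -5]

# Two-dimensional coordinates from metric scaling, used only for plotting.
mds <- cmdscale(dist(swiss_sub), k = 2)

# Two-cluster k-means on the original (unscaled) variables.
km2 <- kmeans(swiss_sub, centers = 2, nstart = 25)

# Index a two-colour palette by cluster label (1 or 2).
plot(mds, col = c("darkorange", "purple")[km2$cluster], pch = 19,
     xlab = "Dimension 1", ylab = "Dimension 2")

# Cross-tabulate cluster membership against the religion split.
table(cluster = km2$cluster, majority_catholic = swiss[, 5] > 50)
```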

10. Use the isoMDS function in the MASS package to conduct non-metric multidimensional scaling on the same data. Plot the results. How do they compare to the metric scaling?
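A side-by-side plot makes the comparison easier. The sketch below assumes the MASS package is installed and that "the same data" means the distance matrix on the swiss data without column 5; the panel titles and colours are arbitrary.

```r
library(MASS)
data(swiss)

d_swiss <- dist(swiss[, -5])

metric <- cmdscale(d_swiss, k = 2)  # metric scaling, for comparison
nmds   <- isoMDS(d_swiss, k = 2)    # non-metric scaling; prints stress as it iterates

cols <- ifelse(swiss[, 5] > 50, "red", "blue")

# Two panels: metric on the left, non-metric on the right.
op <- par(mfrow = c(1, 2))
plot(metric, col = cols, pch = 19, main = "Metric (cmdscale)",
     xlab = "Dimension 1", ylab = "Dimension 2")
plot(nmds$points, col = cols, pch = 19, main = "Non-metric (isoMDS)",
     xlab = "Dimension 1", ylab = "Dimension 2")
par(op)

nmds$stress  # final stress, reported as a percentage
```

By default `isoMDS()` takes the `cmdscale()` solution as its starting configuration, which is worth keeping in mind when comparing the two plots.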
