Chapter 7 SPSS tutorial

Step 1: Acquire SPSS

You will first need to acquire SPSS. Any version above SPSS 24 will work. Your institution may let you download SPSS directly onto your computer, provide a virtual desktop through which you can access SPSS from a campus server, or include SPSS on its public computers. You can also purchase temporary access to SPSS through an online student discount store, like On the Hub (onthehub.com). If you do purchase temporary access to SPSS, be sure to select ‘Grad Pack Standard.’ It costs a little more, but you will need the regression functions it includes.

Step 2: Download and open Dataprac (DatapracSPSS)

The Dataprac data is currently included on the OUP compendium website for this book as a .sav file that SPSS can read (it is called DatapracSPSS). Be sure to use this dataset moving forward. Once you have acquired SPSS, you will be able to open the DatapracSPSS dataset. First, go to the OUP website and save the DatapracSPSS dataset on your computer. Then, once SPSS is open, use the folder icon at the top of the screen to select the DatapracSPSS.sav file.

From here, simply click Open and after a moment, the entire dataset should open:

This is the DatapracSPSS dataset that SPSS can now read.

Step 3: Transform your nominal-level variable with more than two categories into a binary variable

If you are using a nominal-level independent variable with more than two categories, you will need to transform it into a binary variable so it can be used in your analysis. We will use DP9 Social Status as an example. There are five categories associated with the variable: 1 working class, 2 lower class, 3 lower middle class, 4 upper middle class, 5 upper class. Let’s transform this into a binary variable by combining the first, second, and third categories (working class, lower class, and lower middle class) into one new category that we’ll call ‘lower classes’ and the fourth and fifth categories (upper middle class and upper class) into a second category that we’ll call ‘upper classes.’ Thus, we need to tell SPSS to combine the values of 1, 2, and 3 into one new category (0 = lower classes) and the values of 4 and 5 into another (1 = upper classes).

To do this, select Transform from the top menu and then select Recode into Different Variables.

This will open a new window (Recode into Different Variables).

From here, select DP9 and use the little blue arrow to place it in the middle box. Give the new variable a name; choose something that will help you remember what the variable is. For this variable, I chose ‘socialstatusbinary’ (note that you cannot insert spaces in between the words).

Now click on Old and New Values, and a new menu will open. You will need to put in each individual code in the “Old Value” bar and then the new value you want the variable to have for this code in the “New Value” bar. For our example, remember that we want to set 1, 2, and 3 as 0 (lower classes) and 4 and 5 as 1 (upper classes):

Old value 1 becomes new value 0, then click Add

Old value 2 becomes new value 0, then click Add

Old value 3 becomes new value 0, then click Add

Old value 4 becomes new value 1, then click Add

Old value 5 becomes new value 1, then click Add

Once this process is complete for all five values, click Continue. This will take you back to the Recode into Different Variables window. From here click Change under Output Variable, and then click OK at the bottom of the page.

Once this is complete, you will see the output window with the manipulation. Look carefully to ensure you transformed the variable properly. You should see the line: (1=0) (2=0) (3=0) (4=1) (5=1) INTO socialstatusbinary. In each pair, the first number is the original code and the second number is the new code.
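To make the logic of the recode explicit, here is a minimal sketch in Python (not SPSS syntax). It assumes the five DP9 codes described above; the name recode_social_status and the sample responses are hypothetical and are not taken from the Dataprac data.

```python
# Illustrative sketch only: this is Python, not SPSS syntax.
# Mapping from the original DP9 codes to the new binary codes, as in the text:
# 1, 2, 3 -> 0 (lower classes); 4, 5 -> 1 (upper classes).
RECODE_MAP = {1: 0, 2: 0, 3: 0, 4: 1, 5: 1}


def recode_social_status(old_code):
    """Return the new binary code for one original DP9 code."""
    return RECODE_MAP[old_code]


# Hypothetical responses (not taken from the Dataprac data).
dp9_values = [1, 4, 3, 5, 2]
socialstatusbinary = [recode_social_status(v) for v in dp9_values]
print(socialstatusbinary)  # [0, 1, 0, 1, 0]
```

Each entry in the mapping corresponds to one Old Value/New Value pair that you added in the Old and New Values menu.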

Now, if you return to the dataset, in the Variable view you will be able to find your new variable at the bottom of the variable list. In the picture below, see that socialstatusbinary is now listed as the last variable, under DP72.

You can perform this manipulation for any variable in the dataset. Any variable, whether nominal level or ordinal level, can be transformed into a binary; you just need the original codes so you can tell SPSS how to build the new variable. It is common, for example, for researchers to transform scaled variables into binary variables. You could transform any of the confidence variables into binary variables by combining 1 (a great deal) and 2 (quite a lot) into a ‘confident’ category and by combining 3 (not very much) and 4 (none at all) into a ‘not confident’ category. The key is to ensure that the new variable is a valid reflection of the concept you wish to convey. By keeping the values of 1 and 2 separate from 3 and 4, you are likely creating a new variable that is still valid, because ‘a great deal’ and ‘quite a lot’ both convey greater levels of confidence, while ‘not very much’ and ‘none at all’ both convey lower levels of confidence. Below is the Recode into Different Variables screen for transforming Confidence in the Press (DP40) into a binary: ‘a great deal’ and ‘quite a lot’ are combined into the 1 category (confident), and ‘not very much’ and ‘none at all’ are combined into the 0 category (not confident).
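The confidence recode can be sketched the same way in Python (again, not SPSS syntax). The function name and the sample responses are hypothetical. Treating any code outside 1 to 4 as missing is an assumption of this sketch, so check how your own recode handles such codes.

```python
# Illustrative sketch only (Python, not SPSS syntax).
# DP40 codes from the text: 1 = a great deal, 2 = quite a lot,
# 3 = not very much, 4 = none at all.
CONFIDENCE_MAP = {1: 1, 2: 1, 3: 0, 4: 0}


def recode_confidence(old_code):
    """Return 1 (confident), 0 (not confident), or None for any other code."""
    # Assumption of this sketch: codes outside 1-4 are treated as missing.
    return CONFIDENCE_MAP.get(old_code)


# Hypothetical responses; 9 stands in for an invented "no answer" code.
dp40_values = [1, 3, 2, 4, 9]
confidence_binary = [recode_confidence(v) for v in dp40_values]
print(confidence_binary)  # [1, 0, 1, 0, None]
```

Note that here the lower original codes map to 1 and the higher ones to 0, the reverse of the social status example, because a low code means more confidence.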

Step 4: Limit the dataset to include only observations included in your research population

Before you can conduct your data analysis, you should limit the DatapracSPSS dataset to include only observations that belong to your research population. Remember that the unit of analysis in the DatapracSPSS is ‘individuals’ or ‘people,’ but the research population is defined by whatever characteristic unites all the observations you want to study. In the example below, we will limit the DatapracSPSS to include only respondents who are between the ages of 18 and 35.

To begin, select Data from the top menu and then select Select Cases (the second from the bottom). This will open the Select Cases window.

From here, click on If condition is satisfied, which will open the Select Cases: If window.

When in the Select Cases: If window, use the little blue arrow to put DP2 in the top box. (DP2 is age.) Now you have to give SPSS the condition for selecting cases. Since we want to include only observations that are between 18 and 35 years of age, use the blue calculator keys to enter <= 35 so that the condition reads DP2<=35. (This sets only the upper bound, so it matches the 18-to-35 population only if the dataset contains no respondents under 18.) Once you type this in, click Continue, and then from the Select Cases window click OK.

To ensure you did this correctly, check the Output window for the command COMPUTE filter_$=(DP2<=35). This tells you that for DP2 (age) only observations with a value of 35 or below will be included. You can also check this by going to the Data View of your dataset, where you will now see that many observations have slashes through them. The observations with the slash marks are the ones that will be excluded moving forward, i.e. people aged 36 and above.

You can use the Select Cases function to limit the dataset to any value or range of values for any variable in the dataset. For example, if you wanted the dataset to include only women, you would select cases based on DP1 (biological sex) and, through the If condition is satisfied option, tell SPSS to limit the dataset to all observations with a value of 2 for DP1 (from the codebook we know that 2 represents respondents who identify as female for DP1).
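The two selections described in this step can be sketched in Python (not SPSS syntax) as a filter over rows. The respondents, their id values, and the helper select_cases are all hypothetical; only the variable names DP1 and DP2 and the code 2 for female come from the text.

```python
# Illustrative sketch only (Python, not SPSS syntax).
# Hypothetical respondents; DP2 is age and DP1 is sex (2 = female), as in the text.
respondents = [
    {"id": 1, "DP1": 2, "DP2": 22},
    {"id": 2, "DP1": 1, "DP2": 36},
    {"id": 3, "DP1": 2, "DP2": 35},
    {"id": 4, "DP1": 1, "DP2": 19},
]


def select_cases(rows, condition):
    """Keep only the rows for which the condition is true."""
    return [row for row in rows if condition(row)]


aged_35_or_under = select_cases(respondents, lambda r: r["DP2"] <= 35)
women_only = select_cases(respondents, lambda r: r["DP1"] == 2)

print([r["id"] for r in aged_35_or_under])  # [1, 3, 4]
print([r["id"] for r in women_only])        # [1, 3]
```

One difference to keep in mind: this sketch drops the unselected rows, whereas the slash marks in the Data View show that SPSS keeps those observations in the file and simply excludes them from the analysis.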

Step 5: Create a frequency distribution for nominal level variables (including binary variables) or ordinal level variables with fewer than five categories

Now that you have your binary variable and have limited the dataset to cases that represent your research population, you can create a frequency distribution for your nominal level variables or for ordinal level variables with fewer than five categories. Remember that a frequency distribution lists the percentage for each code contained in a variable, relative to the total number of observations in the dataset. What we want to know is what percentage of people fall under each code. For the example below, we will return to the binary we created earlier, social status. Remember that the original codes of 1 (working class), 2 (lower class), and 3 (lower middle class) were combined into a single category (0 = lower classes) and that 4 (upper middle class) and 5 (upper class) were combined into a second category (1 = upper classes). We now want to know what percentage of respondents are in the 0 category and what percentage are in the 1 category. The example below uses younger Americans (ages 35 and younger) as the research population, so be sure to limit the dataset to that population if you want to reproduce the statistics.

To do this in SPSS, from the top menu choose Analyze, then Descriptive Statistics, and then Frequencies (which will be the first choice).

This will open the Frequencies window. From here, use the little blue arrow to put the socialstatusbinary variable into the Variable box and then click OK. (Remember that the new variable will be at the bottom of the variables list.)

This will produce a frequency distribution for the binary social status variable. Here we see that 58.6% of the respondents placed themselves in the lower classes category (original codes of 1, 2, or 3) and that 41.4% of the respondents placed themselves in the upper classes category (original codes of 4 or 5). Also, notice that the number of valid observations is 374. This number is lower than the original total of 1,031 in the Dataprac because we limited the dataset to include only people aged 18 to 35.
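The arithmetic behind a frequency distribution can be sketched in Python (not SPSS syntax). The helper frequency_distribution and the sample values are hypothetical and do not reproduce the Dataprac figures. The sketch computes each percentage relative to the valid (non-missing) observations, which is an assumption about which percentage column you want to report.

```python
from collections import Counter


def frequency_distribution(values):
    """Return {code: (count, percent of valid observations)}, skipping missing values."""
    valid = [v for v in values if v is not None]
    counts = Counter(valid)
    return {
        code: (n, round(100 * n / len(valid), 1))
        for code, n in sorted(counts.items())
    }


# Hypothetical binary values (not Dataprac data); None marks a missing response.
socialstatusbinary = [0, 0, 0, 1, 1, None]
table = frequency_distribution(socialstatusbinary)
for code, (n, pct) in table.items():
    print(code, n, pct)
# 0 3 60.0
# 1 2 40.0
```

With five valid observations, three in category 0 and two in category 1, the percentages are 60.0 and 40.0, and they sum to 100.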

You would create a frequency distribution for any variable that is measured at the nominal level. You would also generate one for your descriptive statistics if an ordinal level independent variable has fewer than five categories. For example, let’s consider DP18, Close to: The town or city where you live. This variable is measured on an ordinal scale and has four categories associated with it (1 = very close, 2 = close, 3 = not very close, and 4 = not close at all). Since this variable is associated with only four possible responses, you should create a frequency distribution for it. Follow the same steps as before to do this (again using respondents aged 35 and younger, the millennial generation, as the research population). From the top menu choose Analyze, then Descriptive Statistics, and then Frequencies. Place DP18 in the box using the little blue arrow (you can also use the same blue arrow to remove any other variables that you had already placed in the Variable box) and then click OK.

From this table we see that 34.2% of respondents answered with a 1 (very close), 35.3% with a 2 (close), 19.5% with a 3 (not very close), and 11% with a 4 (not close at all). Also, note that the cumulative percentage is very helpful here; clearly most people (69.5%) feel either very close or close to the town or city where they live.
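The cumulative percentage is simply a running total of the category percentages. A minimal Python sketch (not SPSS syntax), using the four DP18 percentages reported above:

```python
from itertools import accumulate

# Percentages for DP18 as reported in the text.
percents = {
    "very close": 34.2,
    "close": 35.3,
    "not very close": 19.5,
    "not close at all": 11.0,
}

# Running total, rounded to one decimal place to hide floating-point noise.
cumulative = [round(total, 1) for total in accumulate(percents.values())]
for label, cum in zip(percents, cumulative):
    print(label, cum)
# very close 34.2
# close 69.5
# not very close 89.0
# not close at all 100.0
```

The second running total, 69.5, is the share of respondents who feel either very close or close to their town or city.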

Step 6: Generate summary statistics for variables with ordinal measurement when the number of categories is high (greater than or equal to five) and for variables with interval or ratio measurement

For variables with ordinal measurement with a high number of categories (like your dependent variable, which is measured on an ordinal scale from 1 to 10), you will need to generate summary statistics that include the variable’s minimum value, maximum value, mean, and standard deviation. As an example, we will examine DP71 Democratic Satisfaction. To generate summary statistics in SPSS, from the top menu choose Analyze, then Descriptive Statistics and then Descriptives (it will be the second choice). (Keep in mind that this example still limits the dataset to respondents aged 35 and below.)

This will open the Descriptives window. From here use the little blue arrow to place DP71 into the Variable box and then click OK.

This will produce a very simple table with the information you will need for your descriptive statistics. Notice that the minimum value is 1, the maximum value is 10, the mean is 5.09 and the standard deviation is 2.727.
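For readers who want to see how these four figures are computed, here is a minimal Python sketch (not SPSS syntax). The helper summary_statistics and the sample values are hypothetical and do not reproduce the Dataprac results. The sketch uses the sample standard deviation (n - 1 in the denominator), which is assumed here to match the figure SPSS reports.

```python
import statistics


def summary_statistics(values):
    """Return n, minimum, maximum, mean, and sample standard deviation."""
    valid = [v for v in values if v is not None]  # skip missing responses
    return {
        "n": len(valid),
        "min": min(valid),
        "max": max(valid),
        "mean": round(statistics.mean(valid), 2),
        # statistics.stdev uses n - 1 in the denominator (sample standard deviation).
        "sd": round(statistics.stdev(valid), 3),
    }


# Hypothetical 1-to-10 responses (not Dataprac data); None marks a missing response.
dp71_values = [1, 5, 10, 4, None, 5]
stats = summary_statistics(dp71_values)
print(stats)
```

For these five valid values the mean is 5 and the sample standard deviation is about 3.24.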

You should generate descriptive statistics for any ordinal-level variable for which the number of categories is greater than or equal to five. As a final example, let’s generate summary statistics for DP63, Justifiable: Death Penalty. The variable is measured on an ordinal scale from 1 (never justifiable) to 10 (always justifiable). To generate summary statistics for this variable in SPSS, from the top menu, choose Analyze, then Descriptive Statistics, then Descriptives. Use the little blue arrow to place DP63 in the Variable box and then click OK.

This produces the following results. We see that the minimum value is 1, the maximum value is 10, the mean is 5.43 and the standard deviation is 2.84.