Step 1: Download the practice dataset (Data for regression example) from the OUP website

The dataset used for the running example in the chapter (the professor who wanted to understand which independent variables influence students’ test grades) is available online as a .sav file (it has the exciting name “Data for regression example”). Download this file (DataforRegressionExampleSPSS) and open it in SPSS. Once it is open, you should see a small dataset with 30 observations and 4 variables (Grade on Test, Hours Studied, Interest in Political Science, and Attendance).

 

Step 2: Running single variable regression equations

We’ll begin by following the example in Chapter 9. The professor would like to know how study time (measured in hours) influences students’ grades on the test. To perform a single variable regression in SPSS, from the top menu choose Analyze, then Regression, then Linear as seen below:

This will open the Linear Regression window. From here, place the dependent variable (Grade on Test) in the “Dependent” box and the independent variable (Hours Studied) in the “Independent(s)” box.

Next, click Statistics (top right in the Linear Regression window) and make sure “Confidence intervals” and “Descriptives” are checked. (SPSS does not check these two boxes by default.)

Click Continue from the Linear Regression: Statistics window, and then OK from the Linear Regression window. This will produce a series of tables. The most important for our purposes are “Descriptive Statistics,” “Model Summary,” and “Coefficients,” which are reproduced below (the other tables were deleted for the presentation of this information):

These tables provide crucial information about the regression model. First, notice the constant in the Coefficients table (in the last table, the first value in the first row, in the B column under “Unstandardized Coefficients”). It is 63.2. This means that studying 0 hours is associated with a predicted grade of about 63 points.

Now consider the 3 S’s (sign, size, and significance) of the coefficient for hours studied. The coefficient itself is found in the Coefficients table under the constant: here we see that the coefficient for hours studied is +2.023. The sign is positive (meaning that as hours studied increase, grades also increase) and the size is a little over two points, meaning that for each additional hour studied, a student’s grade is expected to increase by about two points. To determine if this coefficient is significant, look across the last line of the Coefficients table to find the “Sig.” (significance) of the coefficient. Here we see that the significance (p value) is .000. SPSS rounds to three decimals, so this should be read as p < .001 rather than as exactly zero. Since this value is below the threshold of .05, we can reject the null hypothesis that the true coefficient in the population from which this sample was taken is zero.
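To see how the constant and the coefficient work together, the short Python sketch below plugs the two values reported above (63.2 and 2.023) into the regression equation. The function name predicted_grade is made up for this illustration; it is not part of SPSS or of the chapter.

```python
# Values copied from the Coefficients table described above.
CONSTANT = 63.2   # predicted grade when hours studied is 0
B_HOURS = 2.023   # change in predicted grade for each additional hour


def predicted_grade(hours_studied: float) -> float:
    """Return the grade predicted by the single variable model."""
    return CONSTANT + B_HOURS * hours_studied


# Predicted grades for a student who studies 0 hours and one who studies 5.
print(round(predicted_grade(0), 1))
print(round(predicted_grade(5), 1))
```

Each additional hour moves the prediction up by the coefficient, so the two printed values differ by 5 × 2.023, or a little over ten points.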

There are other important pieces of information in these tables as well. First, when you create your table, you will need the standard error for each coefficient, which is located next to the independent variable’s coefficient value. In this example, for hours studied, we see that the standard error is .252. Next is the number of observations included in the analysis. This information is in the first table, Descriptive Statistics, where we see that 30 students were represented in this regression. The next important piece of information is the Adjusted R Square, which is in the Model Summary table (it is the third value, .686). See the chapter for what these statistics represent. You will need to report these specific numbers when you create your regression table. (You can also find the standard error of the estimate in the Model Summary table and the confidence interval for the Hours Studied coefficient at the end of the Coefficients table, but these statistics are not usually reported when researchers create their new regression tables for their papers. See the Paper Progress section of Chapter 9 to see which pieces of information you will need for the new regression table you will create based on the statistics in these tables.)
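The coefficient and its standard error are also what the significance test and the confidence interval are built from. The sketch below, which is only an illustration, recomputes both from the rounded values reported above. The critical value 2.048 is an assumption taken from a standard t table (two-tailed, 5 percent, 28 degrees of freedom); it does not appear in the SPSS output.

```python
# Values copied from the tables described above.
B_HOURS = 2.023    # coefficient for Hours Studied
SE_HOURS = 0.252   # standard error of that coefficient
N = 30             # number of observations
K = 1              # number of independent variables

# Assumption: critical t value from a standard t table for
# N - K - 1 = 28 degrees of freedom (two-tailed, 5 percent level).
T_CRITICAL_DF28 = 2.048

# The t statistic is the coefficient divided by its standard error.
t_statistic = B_HOURS / SE_HOURS

# The 95% confidence interval is the coefficient plus or minus
# the critical value times the standard error.
margin = T_CRITICAL_DF28 * SE_HOURS
ci_low = B_HOURS - margin
ci_high = B_HOURS + margin

print(f"t = {t_statistic:.2f}")
print(f"95% CI = ({ci_low:.2f}, {ci_high:.2f})")
```

This gives a t value of about 8.03 and an interval of roughly 1.51 to 2.54. SPSS works from unrounded values, so the last digit in its tables may differ slightly. Because the interval does not include zero, it tells the same story as the significance value.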

Now the professor wants to switch independent variables and use Interest in Political Science as the independent variable. To do this in SPSS, from the top menu, choose Analyze, then Regression, then Linear. The settings from the previous model will still be filled in. Click on Hours Studied in the Independent(s) box and use the little blue arrow (the second from the top) to remove it. Then click on Interest in Political Science and use that same blue arrow to place it in the Independent(s) box. Since you previously selected “Confidence intervals” and “Descriptives” from the Statistics button for the first model (in which Hours Studied was the independent variable), you will not need to do this again.

Once you add the Interest in Political Science variable to the Independent(s) box, click OK. This will produce the following results (again, note that some of the tables have been removed).

For the last variable in the professor’s example (attendance), you simply repeat the steps from above to remove Interest in Political Science and include Attendance. From Analyze, choose Regression, then Linear. Replace Interest in Political Science with Attendance as seen below:

Again, since you already checked “Confidence Intervals” and “Descriptives” in the Statistics menu (they will remain checked during your session, unless you close SPSS and need to reopen it), from here you can click OK, which produces the following results:

 

Now, to run a regression with several independent variables at the same time, simply add the additional independent variables to the Independent(s) box. From the top menu, select Analyze, then Regression, then Linear. Now add all three independent variables to the Independent(s) box using the second little blue arrow as seen below:

After you click OK, SPSS will produce the following tables:

See Chapter 9 for a description of what these statistics mean.
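For readers curious about what happens when several independent variables are entered together, the sketch below fits an ordinary least squares model with two predictors in plain Python. It is a minimal illustration, not SPSS’s own code. The function name fit_ols and the five rows of toy data are made up for this example and do not come from the practice dataset.

```python
def fit_ols(predictor_rows, outcomes):
    """Estimate [constant, b1, b2, ...] by ordinary least squares."""
    # Add a column of 1s so the model includes a constant.
    x = [[1.0] + [float(v) for v in row] for row in predictor_rows]
    k = len(x[0])

    # Build the normal equations (X'X) b = X'y as an augmented matrix.
    a = []
    for i in range(k):
        row = [sum(r[i] * r[j] for r in x) for j in range(k)]
        row.append(sum(r[i] * y for r, y in zip(x, outcomes)))
        a.append(row)

    # Solve by Gaussian elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(col + 1, k):
            factor = a[r][col] / a[col][col]
            for c in range(col, k + 1):
                a[r][c] -= factor * a[col][c]

    # Back substitution.
    b = [0.0] * k
    for i in reversed(range(k)):
        known = sum(a[i][j] * b[j] for j in range(i + 1, k))
        b[i] = (a[i][k] - known) / a[i][i]
    return b


# Toy data (hypothetical): outcome = 50 + 2 * first + 3 * second, exactly.
toy_predictors = [[1, 1], [2, 0], [3, 2], [4, 1], [5, 3]]
toy_outcomes = [55, 54, 62, 61, 69]

coefficients = fit_ols(toy_predictors, toy_outcomes)
print([round(value, 3) for value in coefficients])
```

Because the toy outcomes were built from a constant of 50 and slopes of 2 and 3, the fit recovers those three values (up to rounding). All coefficients are estimated jointly, which is why a coefficient in the combined model can differ from the one in the single variable model.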

 

Now that we have worked with the professor’s example, let’s turn to the DatapracSPSS. You can perform the same steps to generate statistics for linear regression equations using the variables in the DatapracSPSS. Let’s work with the example from the text, using education (as a binary variable) and the number of children (DP8) as independent variables to explain differences in democratic satisfaction (DP71), the dependent variable.

The steps are the same as before: From the top menu, choose Analyze, then Regression, then Linear. Place democratic satisfaction in the “Dependent” box. Then, to run a single variable regression model, place the education binary variable in the “Independent(s)” box. (You would first need to transform the education variable into a binary, since the original education variable in the DatapracSPSS contains eight categories: the values of 1, 2, and 3 were combined into one category, while the values from 4 to 8 were combined into a second category.) Be sure that “Descriptives” and “Confidence intervals” are checked under the Statistics button. Then, for the second single variable regression model, replace the education binary with the variable for the number of children (DP9) in the Independent(s) box. Finally, place both the education binary and number of children in the Independent(s) box to get the results for a multivariate regression model. Follow the explanation in the chapter to interpret the statistics properly.
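The recoding rule for the education binary can be written out as a small function. This Python sketch is only an illustration of the rule described above. The function name education_binary, the choice of 0 and 1 as the two category values, and the treatment of any other code as missing are all assumptions, not details taken from the DatapracSPSS.

```python
from typing import Optional


def education_binary(education_code: int) -> Optional[int]:
    """Collapse the eight education categories into two groups."""
    if education_code in (1, 2, 3):
        return 0   # assumed value for the first (lower) category
    if 4 <= education_code <= 8:
        return 1   # assumed value for the second (higher) category
    return None    # assumption: any other code is treated as missing


# Show how each of the eight original values is recoded.
for code in range(1, 9):
    print(code, education_binary(code))
```

In SPSS itself, this kind of recode is normally done through the Transform menu before running the regression.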

Begin with the single variable regression, using the education binary variable as the single independent variable and democratic satisfaction as the dependent variable.

 

Be sure “Descriptives” and “Confidence intervals” are checked under the Statistics button (found at the top right of the window). You will need to select these every time you open SPSS.

To run the second regression equation, simply substitute children for the education binary.

And finally, to run the last regression equation with both independent variables, add the education binary alongside the children variable so that both are included in the “Independent(s)” box:
