Pearson Product-Moment Correlation
The Pearson product-moment correlation coefficient (or Pearson correlation coefficient for short) is a measure of the strength of a linear association between two variables and is denoted by r. In effect, a Pearson product-moment correlation draws a line of best fit through the data of two variables, and the Pearson correlation coefficient, r, indicates how closely the data points cluster around that line (that is, how well the data fit the line of best fit).
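For reference, the standard textbook definition of r can be written as follows. The symbols are assumptions introduced here and do not appear in the source: x_i and y_i are the paired scores of case i, x-bar and y-bar are the two sample means, and n is the number of cases.

```latex
r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
         {\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}}
```

The numerator measures how the two variables vary together; the denominator rescales that quantity so that r always falls between -1 and +1.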
Correlation analysis is used to describe the strength and direction of the linear relationship between two variables. There are a number of different statistics available from SPSS, depending on the level of measurement and the nature of your data. The procedure for obtaining and interpreting a Pearson product-moment correlation coefficient (r) is presented along with Spearman Rank Order Correlation (rho). Pearson r is designed for interval level (continuous) variables. It can also be used if you have one continuous variable (e.g. scores on a measure of self-esteem) and one dichotomous variable (e.g. sex: M/F). Spearman rho is designed for use with ordinal level or ranked data and is particularly useful when your data do not meet the criteria for Pearson correlation. SPSS will calculate two types of correlation for you. First, it will give you a simple bivariate correlation (which just means between two variables), also known as zero-order correlation. SPSS will also allow you to explore the relationship between two variables while controlling for another variable. This is known as partial correlation.
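As a rough illustration of the two ideas just mentioned, the sketch below computes Spearman rho from ranks and a first-order partial correlation from three zero-order correlations. It is not SPSS's implementation. The function names and the toy scores are hypothetical, and the rank-difference shortcut used here assumes there are no tied values.

```python
import math

def ranks(values):
    """Rank values from 1 (smallest) to n. Assumes no tied values."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result

def spearman_rho(x, y):
    """Spearman rho via 1 - 6*sum(d^2) / (n*(n^2 - 1)); valid only without ties."""
    n = len(x)
    d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks(x), ranks(y)))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))

def partial_r(r_xy, r_xz, r_yz):
    """Partial correlation of x and y controlling for z, from zero-order r values."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

# Hypothetical scores: y always rises with x, but not along a straight line.
x = [1, 2, 3, 4, 5]
y = [1, 8, 27, 64, 125]

rho_same_order = spearman_rho(x, y)            # rank orders agree completely
rho_reversed = spearman_rho(x, y[::-1])        # rank orders are exactly opposite
r_controlled = partial_r(0.5, 0.0, 0.0)        # z unrelated to x and y: r is unchanged
```

Because rho depends only on rank order, it reaches its maximum for any consistently increasing relationship, which is why it suits ordinal or ranked data.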
The procedure to obtain a bivariate Pearson r and non-parametric Spearman rho is presented below. Pearson correlation coefficients (r) can only take on values from –1 to +1. The sign indicates whether there is a positive correlation (as one variable increases, so too does the other) or a negative correlation (as one variable increases, the other decreases). The size of the absolute value (ignoring the sign) provides an indication of the strength of the relationship. A perfect correlation of 1 or –1 indicates that the value of one variable can be determined exactly by knowing the value of the other variable. A scatterplot of this relationship would show a straight line. On the other hand, a correlation of 0 indicates no linear relationship between the two variables. Knowing the value of one of the variables provides no assistance in predicting the value of the second variable with a straight line. A scatterplot would typically show a circle of points, with no pattern evident.
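The range and sign of r can be checked with a few lines of code. The sketch below is a minimal illustration: the helper pearson_r and the toy data are hypothetical, and the function implements the usual deviation-score definition of r rather than anything specific to SPSS.

```python
import math

def pearson_r(x, y):
    """Pearson r: co-variation of x and y divided by the product of their spreads."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

x = [1, 2, 3, 4, 5]

# y rises in exact proportion to x: perfect positive correlation.
r_positive = pearson_r(x, [2 * v + 1 for v in x])

# y falls in exact proportion to x: perfect negative correlation.
r_negative = pearson_r(x, [-3 * v for v in x])

# y follows a symmetric U-shape around the mean of x: no linear trend at all.
r_zero = pearson_r(x, [(v - 3) ** 2 for v in x])
```

The third case is a reminder that r measures linear association only: the U-shaped data are perfectly predictable from x, yet no straight line captures the pattern, so r is 0.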
Example of research question: Is there a relationship between the amount of control
people have over their internal states and their levels of perceived stress? Do people
with high levels of perceived control experience lower levels of perceived stress?
What you need: Two variables: both continuous, or one continuous and the other
dichotomous (two values).
What it does: Correlation describes the relationship between two continuous
variables, in terms of both the strength of the relationship and the direction.
Non-parametric alternative: Spearman Rank Order Correlation (rho).
Procedure for requesting Pearson r or Spearman rho
- From the menu at the top of the screen, click on Analyze, then select Correlate, then Bivariate.
- Select your two variables and move them into the box marked Variables (e.g. Total perceived stress: tpstress, Total PCOISS: tpcoiss). If you wish you can list a whole range of variables here, not just two. In the resulting matrix, the correlation between all possible pairs of variables will be listed. This can be quite large if you list more than just a few variables.
- In the Correlation Coefficients section, the Pearson box is the default option. If you wish to request the Spearman rho (the non-parametric alternative), tick this box instead (or as well).
- Click on the Options button. For Missing Values, click on the Exclude cases pairwise box. Under Options, you can also obtain means and standard deviations if you wish.
- Click on Continue and then on OK.
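To make the effect of the Exclude cases pairwise option concrete, here is a small sketch of the same idea outside SPSS. It is an illustration under stated assumptions, not SPSS's implementation: the helper functions and all scores are hypothetical, the variable age is invented for the example, and None stands in for a missing value.

```python
import math
from itertools import combinations

def pearson_r(x, y):
    """Pearson r from the deviation-score definition."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def pairwise_correlations(data):
    """Return {(var_a, var_b): (r, n)} for every pair of variables.

    For each pair, a case is dropped only if it is missing on one of
    those two variables (pairwise exclusion), so n can differ by pair.
    """
    results = {}
    for a, b in combinations(data, 2):
        complete = [(u, v) for u, v in zip(data[a], data[b])
                    if u is not None and v is not None]
        xs, ys = zip(*complete)
        results[(a, b)] = (pearson_r(xs, ys), len(complete))
    return results

# Hypothetical scores for five cases; None marks a missing response.
data = {
    "tpstress": [20, 25, 30, None, 35],
    "tpcoiss": [70, 60, 50, 45, 40],
    "age": [30, None, 40, 45, 50],  # invented third variable
}

matrix = pairwise_correlations(data)
for (a, b), (r, n) in matrix.items():
    print(f"{a} x {b}: r = {r:.3f}, n = {n}")
```

Each pair of variables keeps every case that has scores on both, so the number of cases behind each coefficient can differ across the matrix. Listing more variables grows the number of pairs quickly, which is why the resulting matrix can become large.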
Procedure for generating a scatterplot
- From the menu at the top of the screen, click on Graphs, then select Legacy Dialogs.
- Click on Scatter/Dot and then Simple Scatter. Click Define.
- Click on the first variable and move it into the Y-axis box (this will run
vertically). By convention, the dependent variable is usually placed along
the Y-axis (e.g. Total perceived stress: tpstress).
- Click on the second variable and move it into the X-axis box (this will run
across the page). This is usually the independent variable (e.g. Total
- In the Label Cases by: box, you can put your ID variable so that outlying
points can be identified.
- Click on OK.
Source: Pallant (2011)