What is a Pearson correlation?
A Pearson correlation, also known as a Pearson Product-Moment Correlation, is a measure of the strength of the linear association between two quantitative variables.
For example, you can use a Pearson correlation to determine if there is a significant association between age and total cholesterol levels within a population. This is the example I will use for this guide.
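If you are curious about what is being calculated, here is a minimal Python sketch of the Pearson correlation coefficient worked out from its definition: the sum of the cross-products of the deviations from each mean, divided by the square root of the product of the two sums of squared deviations. The function name `pearson_r` and the data values are my own assumptions for illustration; they are not the dataset used in this guide.

```python
import math

def pearson_r(x, y):
    """Return the Pearson correlation coefficient for two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products of the deviations from each mean
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Sums of squared deviations for each variable
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when one variable is constant")
    return sxy / math.sqrt(sxx * syy)

# Hypothetical values for illustration only (not the data from this guide)
age = [25, 32, 41, 48, 53, 60]               # years
cholesterol = [4.1, 4.6, 4.4, 5.3, 5.8, 6.0]  # mmol/L

r = pearson_r(age, cholesterol)
print(f"r = {r:.3f}")
```

The result always lies between -1 and 1, and the sign tells you the direction of the association.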
How to perform a Pearson correlation in SPSS
To perform a Pearson correlation in SPSS, you first need two variables of continuous data. Note that a Pearson correlation test is a parametric test. In other words, your data has to be normally distributed for the test to be valid. If you are unsure whether your data is normally distributed, have a look at the normality testing in SPSS guide.
I have created a simple dataset containing 10 rows of data, where each row represents one person. I have two variables, the first being age (in years) and the other being blood total cholesterol levels (in mmol/L).
The null hypothesis for this test would read:
“There is no correlation between participant ages and blood total cholesterol levels.”
On the other hand, the alternative hypothesis would read:
“There is a correlation between participant ages and blood total cholesterol levels.”
Performing the test
1. Within SPSS, go to ‘Analyze > Correlate > Bivariate’.
2. Drag both variables from the left window to the right window called ‘Variables’. In this case, both ‘Age’ and ‘Cholesterol’ will be moved across. Note that you can drag more than two variables into the test, in which case every possible pair of variables is tested at the same time.
3. Ensure that ‘Pearson’ is ticked under the title ‘Correlation Coefficients’. Since we have not made any prior assumptions, we will also leave the ‘Test of Significance’ as ‘Two-tailed’.
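To illustrate the note above about adding more than two variables, here is a rough Python sketch of what testing every combination means: each possible pair of variables gets its own correlation coefficient. This is only an illustration, not how SPSS works internally. The third variable (‘BMI’), the helper `pearson_r` and all of the values are hypothetical.

```python
import math
from itertools import combinations

def pearson_r(x, y):
    """Pearson correlation coefficient computed from its definition."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical columns for illustration only (not the data from this guide)
data = {
    "Age": [25, 32, 41, 48, 53, 60],
    "Cholesterol": [4.1, 4.6, 4.4, 5.3, 5.8, 6.0],
    "BMI": [22.5, 27.1, 24.3, 26.8, 23.9, 28.4],
}

# Three variables give three unique pairs, each with its own r value
pairs = {}
for first, second in combinations(data, 2):
    pairs[(first, second)] = pearson_r(data[first], data[second])
    print(f"{first} vs {second}: r = {pairs[(first, second)]:.3f}")
```

With k variables there are k(k - 1)/2 unique pairs, so the grid grows quickly as you add more variables.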
In the SPSS Output window, there will be a new heading of ‘Correlations’ with a correlation grid displayed.
- ‘Pearson Correlation’ – This is the Pearson Correlation Coefficient (r) value. These values range from -1 to 1: values between 0 and 1 indicate a positive correlation and values between -1 and 0 indicate a negative correlation. The further the value is from 0, the stronger the linear association between the two variables, i.e. a value of ‘1’ indicates a perfect positive association and a value of ‘-1’ indicates a perfect negative association. A value of ‘0’ indicates no linear association.
- ‘Sig. (2-tailed)’ – The P value for a two-tailed analysis.
- ‘N’ – The number of pairs of data in the analysis.
By looking at the results in the above table, it can be seen that the correlation between age and blood cholesterol levels gave a Pearson Correlation Coefficient (r) value of ‘0.882’, which indicates a strong positive association between the two variables. Also, the P value of the association was ‘0.001’, indicating a highly significant result. Therefore, I will reject the null hypothesis.
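As a rough cross-check of the significance result, you can convert the r value into a t statistic using the standard relationship t = r × √((N − 2) / (1 − r²)), with N − 2 degrees of freedom. The Python sketch below does this for the values reported above (r = 0.882, N = 10) and compares the result to the two-tailed 5% critical value for 8 degrees of freedom, which I have taken from standard t tables. The variable names are my own, and this is only a sanity check, not a replacement for the SPSS output.

```python
import math

r = 0.882  # Pearson correlation coefficient reported in the output
n = 10     # number of pairs of data (N)

# Degrees of freedom for the significance test of a correlation
df = n - 2

# Convert r into a t statistic
t = r * math.sqrt(df / (1 - r ** 2))

# Two-tailed 5% critical value of t for 8 degrees of freedom (standard tables)
T_CRIT_DF8 = 2.306

print(f"t({df}) = {t:.2f}")
print("Significant at the 5% level:", abs(t) > T_CRIT_DF8)
```

The t statistic comes out at roughly 5.3, well above the critical value, which agrees with the small P value in the output.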
When reporting the results of a Pearson correlation, it is useful to quote two pieces of data: the r value (the correlation coefficient) and the P value of the test. For the example above, this could read:
There was a strong positive association between participant ages and blood cholesterol levels (r = 0.882, P = 0.001).
IBM SPSS version used: 23