In this tutorial, I will show you how to perform a Spearman rank correlation test in R.
What is a Spearman correlation test?
A Spearman's rank correlation test is a non-parametric statistical test that measures the strength and direction of the monotonic association between two variables.
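To make the idea of a rank correlation more concrete, here is a minimal sketch. It uses two made-up vectors, x and y (they are purely illustrative and not part of the tutorial's data), to show that Spearman's rho is the Pearson correlation calculated on the ranks of the values.

```r
# Hypothetical example vectors (illustrative only):
# y always increases with x, but not in a straight line
x <- c(1, 2, 3, 4, 5, 6)
y <- c(1, 4, 9, 16, 25, 36)

# Spearman's rho calculated directly
rho_spearman <- cor(x, y, method = "spearman")

# The same value, obtained as the Pearson correlation of the ranks
rho_from_ranks <- cor(rank(x), rank(y), method = "pearson")

# Pearson correlation on the raw values, for comparison
r_pearson <- cor(x, y, method = "pearson")

print(c(spearman = rho_spearman, from_ranks = rho_from_ranks, pearson = r_pearson))
```

Because y never decreases as x increases, the Spearman coefficient is 1, while the Pearson coefficient on the raw values is below 1 since the relationship is not linear.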
For this tutorial, I will use the mtcars dataset that is already available within R.
The mtcars dataset contains measurements for 32 cars (1973–74 models). Specifically, the data frame contains 11 variables:
- mpg – miles per gallon
- cyl – the number of cylinders
- disp – displacement (in cubic inches)
- hp – gross horsepower
- drat – rear axle ratio
- wt – weight (in 1000 lbs)
- qsec – time to do a quarter mile (in seconds)
- vs – engine (0 = V-shaped; 1 = straight)
- am – transmission (0 = automatic; 1 = manual)
- gear – the number of forward gears
- carb – the number of carburettors
In this example, I am interested in the relationship between the cars' miles per gallon (MPG) and horsepower.
Based on the above example data, here are my two hypotheses:
- Null hypothesis – There is no monotonic association between the MPG and horsepower of the cars (rho is equal to 0)
- Alternative hypothesis – There is a monotonic association between the MPG and horsepower of the cars (rho is not equal to 0)
I will also set my alpha level to 0.05.
How to perform a Spearman correlation test in R
Just like a Pearson correlation test, a Spearman correlation test is very easy to perform in R.
Step 1: Import your data into R
To perform a Spearman correlation in R, you first need some data containing the two variables of interest.
In this example, I will be using the mtcars dataset in R.
To load the mtcars dataset, simply run the following code.
# Load the mtcars dataset
data(mtcars)
You should now see the mtcars dataset in the environment.
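Before running the test, it can help to take a quick look at the two variables. The following is a minimal sketch of such a check; the object name n_missing is my own choice for illustration.

```r
# Load the dataset
data(mtcars)

# Number of rows (cars) and columns (variables)
print(dim(mtcars))

# First few values of the two variables of interest
print(head(mtcars[, c("mpg", "hp")]))

# Count the missing values in each of the two variables
n_missing <- colSums(is.na(mtcars[, c("mpg", "hp")]))
print(n_missing)
```

If either variable contained missing values, it would be worth deciding how to handle them before running the correlation.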
Step 2: Perform the Spearman correlation test
To perform the Spearman correlation test, use the cor.test function.
The cor.test function requires two inputs: x and y. These are the two variables that you want to correlate in the Spearman correlation.
You also need to add the argument method = "spearman" to ensure a Spearman test is performed.
The code to run the Spearman correlation in R is displayed below. Simply replace x and y with the names of the two variables.
# Run the Spearman correlation test
# Replace x and y with the two variables
cor.test(x, y, method = "spearman")
In my example, I am interested in the correlation between the MPG and horsepower variables in the mtcars dataset. So, my code will look like the following.
# Spearman correlation test using the mtcars dataset
cor.test(mtcars$mpg, mtcars$hp, method = "spearman")
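As an alternative, cor.test also accepts a formula together with a data argument, which avoids repeating the data frame name. Here is a short sketch comparing the two styles. The object names res_formula and res_xy are my own, and I have added exact = FALSE so that R uses the approximate p-value and does not warn about tied values.

```r
# Formula interface: both variables go on the right-hand side of ~
res_formula <- cor.test(~ mpg + hp, data = mtcars,
                        method = "spearman", exact = FALSE)

# Equivalent call using the x and y inputs
res_xy <- cor.test(mtcars$mpg, mtcars$hp,
                   method = "spearman", exact = FALSE)

# Both calls should return the same correlation coefficient
print(res_formula$estimate)
print(res_xy$estimate)
```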
Additional settings of interest
You may also be interested in changing the alternative hypothesis of the Spearman correlation test, for example to run a one-sided test.
To change the alternative hypothesis, adjust the value of the alternative argument:
- "two.sided" – rho is not equal to zero
- "greater" – rho is greater than zero (i.e., positive correlation)
- "less" – rho is less than zero (i.e., negative correlation)
The default is "two.sided".
For example, if you wanted to run a one-sided Spearman correlation test with the alternative hypothesis describing a negative association, then enter the following.
# One-sided (negative association) Spearman correlation test
cor.test(x, y, method = "spearman", alternative = "less")
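Applied to the mtcars example, a one-sided test could be sketched as follows. The object names two_sided and one_sided are my own, and exact = FALSE is included so that R uses the approximate p-value without warning about tied values.

```r
# Two-sided test (the default alternative)
two_sided <- cor.test(mtcars$mpg, mtcars$hp,
                      method = "spearman", exact = FALSE)

# One-sided test: the alternative hypothesis is a negative association
one_sided <- cor.test(mtcars$mpg, mtcars$hp,
                      method = "spearman", alternative = "less",
                      exact = FALSE)

# Compare the two p-values
print(two_sided$p.value)
print(one_sided$p.value)
```

Because the observed correlation between MPG and horsepower is negative, which is the direction stated in the one-sided alternative, I would expect the one-sided p-value to be smaller than the two-sided one.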
Interpretation of results
The output of my example is displayed below.
Spearman's rank correlation rho
data:  mtcars$mpg and mtcars$hp
S = 10337, p-value = 5.086e-12
alternative hypothesis: true rho is not equal to 0
sample estimates:
       rho
-0.8946646
Warning message:
In cor.test.default(mtcars$mpg, mtcars$hp, method = "spearman") :
  Cannot compute exact p-value with ties
There are a few parameters returned in the results of the Spearman correlation test. These are summarized below.
- data – the two variables in the test
- S – the test statistic, which is based on the sum of the squared differences between the ranks of the two variables
- p-value – the p-value for the Spearman correlation test
- alternative hypothesis – a description of the alternative hypothesis
- sample estimates – the Spearman correlation coefficient
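If you want to use these values later in your script, you can store the result in an object and pull out the individual parts. Here is a minimal sketch; the object name res is my own, and exact = FALSE is included so that R does not warn about tied values.

```r
# Store the test result in an object
res <- cor.test(mtcars$mpg, mtcars$hp,
                method = "spearman", exact = FALSE)

# Extract the individual parts of the output
print(res$data.name)    # the two variables in the test
print(res$statistic)    # S, the test statistic
print(res$p.value)      # the p-value
print(res$alternative)  # the type of alternative hypothesis
print(res$estimate)     # rho, the Spearman correlation coefficient
```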
Note that there is a warning message in the output. This is because an exact p-value cannot be computed when the data contain ties, so R reports an approximate p-value instead.
A tie occurs when two or more data points within a variable have the same value, and therefore share the same rank.
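To check whether your own variables contain ties, you could count the repeated values. Below is a rough sketch using the mtcars example; the object names n_ties_mpg, n_ties_hp and tied_ranks are my own.

```r
# Count the values that repeat an earlier value in each variable
n_ties_mpg <- sum(duplicated(mtcars$mpg))
n_ties_hp <- sum(duplicated(mtcars$hp))
print(c(mpg = n_ties_mpg, hp = n_ties_hp))

# By default, rank() gives tied values the average of their ranks
tied_ranks <- rank(c(10, 20, 20, 30))
print(tied_ranks)  # 1.0 2.5 2.5 4.0
```

Any count above zero means that the variable contains tied values.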
If you want to remove the warning message, simply add the exact = FALSE argument. I have added this to my example code below.
# Removing the warning message
cor.test(mtcars$mpg, mtcars$hp, method = "spearman", exact = FALSE)
So, by looking at my example output, the Spearman correlation coefficient is -0.89.
The Spearman correlation coefficient is a value that ranges from -1 to 1. The key reference values are:
- -1 – a perfect negative monotonic association between the two variables
- 0 – no monotonic association between the two variables
- 1 – a perfect positive monotonic association between the two variables
Since the coefficient is negative, there is a negative correlation between the variables MPG and horsepower. In other words, cars with a higher MPG tend to have a lower horsepower.
The p-value is 0.000000000005086 (5.086 × 10^-12).
Since this p-value is below my alpha level (0.05), I will reject the null hypothesis in favor of the alternative hypothesis. In other words, there is a significant (negative) correlation between the MPG and horsepower of the cars.
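The same decision can be made in code by comparing the stored p-value with the alpha level. This is only a sketch of one way to do it; the object names alpha, res and reject_null are my own.

```r
# Alpha level chosen for the test
alpha <- 0.05

# Run the test and store the result
res <- cor.test(mtcars$mpg, mtcars$hp,
                method = "spearman", exact = FALSE)

# TRUE if the p-value is below the alpha level
reject_null <- res$p.value < alpha

if (reject_null) {
  cat("Reject the null hypothesis: rho =",
      round(unname(res$estimate), 2), "\n")
} else {
  cat("Fail to reject the null hypothesis\n")
}
```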
I have shown you how to perform a Spearman correlation test in R. This can easily be achieved with the cor.test function; no other packages are required.
R version used: 3.6.3
RStudio version used: 1.2.5033