# How To Perform A Pearson Correlation Test In R

In this tutorial, I will show you how to perform a Pearson correlation test in R.

## What is a Pearson correlation test?

A Pearson correlation test is a parametric statistical test that assesses the strength and direction of the linear association between two continuous variables.

## Example data

For this tutorial, I will use the trees dataset that is already available within R.

The trees dataset contains measurements from 31 cherry trees. Specifically, the data frame contains three variables:

• Girth – the tree diameter (in inches)
• Height – the tree height (in feet)
• Volume – the volume of timber (in cubic feet)

In this example, I am interested in the correlation between tree girth and height.
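Because the Pearson correlation measures a linear relationship, it can help to look at the two variables before testing them. The following is a minimal sketch using base R graphics; the axis labels, title, and red guide line are my own choices and are not required for the test.

```r
# Scatter plot of girth against height to check for a roughly linear trend
# (the trees dataset is available in base R without any extra packages)
plot(trees$Girth, trees$Height,
     xlab = "Girth (inches)",
     ylab = "Height (feet)",
     main = "Cherry trees: girth vs height")

# Add a least-squares line as a visual guide only
abline(lm(Height ~ Girth, data = trees), col = "red")
```

If the points show a strongly curved pattern, the Pearson coefficient may understate the strength of the relationship.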

## Example hypothesis

Based on the above example data, here are my two hypotheses:

• Null hypothesis – The true correlation between tree girth and height is zero
• Alternative hypothesis – The true correlation between tree girth and height is not zero

I will also set my alpha level to 0.05.

## How to perform a Pearson correlation test in R

It is straightforward to perform a Pearson correlation test in R. No additional packages are required; the cor.test function belongs to the stats package, which ships with R and is loaded by default.

### Step 1: Import your data into R

The first step is to have a dataset in R that contains the two variables of interest.

In this example, I will be using the trees dataset in R.

To load the trees dataset, simply run the following code.

```
# Load the trees dataset
data(trees)
```

You should now see the trees dataset in your environment (in RStudio, it is listed in the Environment pane).
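To confirm that the data loaded as expected, a quick inspection along these lines may help. This is only a sketch of one way to check the data; none of these calls are needed for the correlation test itself.

```r
# Load the dataset (it ships with R in the datasets package)
data(trees)

# Show the structure: the number of observations and the type of each variable
str(trees)

# Preview the first six rows
head(trees)

# Count missing values in each column
colSums(is.na(trees))
```

By default, cor.test only uses complete pairs of observations, so it is worth knowing beforehand whether any values are missing.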

### Step 2: Perform the Pearson correlation test

To perform the Pearson correlation test, use the cor.test function.

By default, the cor.test function performs a two-sided Pearson correlation test.

The cor.test function requires two inputs: x and y. These are numeric vectors of equal length that hold the two variables you want to correlate.

The code to run the Pearson correlation in R is displayed below. Simply replace x and y with the names of the two variables.

```
# Run the Pearson correlation test
# Replace x and y with the two variables
cor.test(x, y,
         method = "pearson")
```

In my example, I am interested in the correlation between the Girth and Height variables in the trees dataset. So, my code will look like the following.

```
# Pearson correlation test using the trees dataset
cor.test(trees$Girth, trees$Height,
         method = "pearson")
```
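As an alternative, cor.test also accepts a one-sided formula together with a data argument, which avoids repeating the data frame name. The sketch below runs the test both ways and checks that the estimates agree; the object names res_formula and res_default are my own.

```r
# Same test written with the formula interface of cor.test
res_formula <- cor.test(~ Girth + Height, data = trees, method = "pearson")

# The x, y interface used in this tutorial
res_default <- cor.test(trees$Girth, trees$Height, method = "pearson")

# Both calls should give the same correlation coefficient
all.equal(unname(res_formula$estimate), unname(res_default$estimate))
```

Which form to use is mostly a matter of taste; the formula form can read more cleanly when the variables live in the same data frame.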

There are some additional arguments that you can change in the cor.test function. Some of the main ones you may be interested in are defined below.

• conf.level – change the confidence level (default is 0.95)
  • Numeric value between 0 and 1

For example, if you want to run a Pearson correlation test with a confidence level of 0.90, then enter the following.

```
# Pearson correlation test with 0.90 confidence level
cor.test(x, y,
         method = "pearson",
         conf.level = 0.90)
```

• alternative – change the alternative hypothesis (default is "two.sided")
  • "two.sided" – non-zero
  • "greater" – greater than zero (i.e., positive correlation)
  • "less" – less than zero (i.e., negative correlation)

For example, if you wanted to run a one-sided Pearson correlation test with the alternative hypothesis describing a positive association, then enter the following.

```
# One-sided (positive association) Pearson correlation test
cor.test(x, y,
         method = "pearson",
         alternative = "greater")
```
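To see what these two arguments change in practice, here is a sketch that applies them to the trees example. The object names (two_sided, ci_90, one_sided) are my own, and the comparisons at the end are for illustration only.

```r
# Default test: two-sided, 95% confidence interval
two_sided <- cor.test(trees$Girth, trees$Height, method = "pearson")

# Same test with a 90% confidence interval
ci_90 <- cor.test(trees$Girth, trees$Height,
                  method = "pearson",
                  conf.level = 0.90)

# One-sided test for a positive correlation
one_sided <- cor.test(trees$Girth, trees$Height,
                      method = "pearson",
                      alternative = "greater")

# Compare the two confidence intervals
two_sided$conf.int
ci_90$conf.int

# Compare the two p-values
two_sided$p.value
one_sided$p.value
```

The 90% interval should be narrower than the 95% interval. Because the estimated correlation here is positive, the one-sided p-value for "greater" is half of the two-sided p-value.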

## Interpretation of results

The output of my example is displayed below.

```
	Pearson's product-moment correlation

data:  trees$Girth and trees$Height
t = 3.2722, df = 29, p-value = 0.002758
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
 0.2021327 0.7378538
sample estimates:
      cor 
0.5192801 
```

The output of the Pearson correlation test contains several values. These are summarized below.

• data – the two variables in the test
• t – the t-statistic
• df – the degrees of freedom
• p-value – the p-value for the Pearson correlation test
• alternative hypothesis – a description of the alternative hypothesis
• 95 percent confidence interval – the 95% confidence interval for the correlation coefficient
• sample estimates – the Pearson correlation coefficient (labelled cor)
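If you need these values for further calculations or reporting, you can store the result of cor.test in an object and pull out each component by name. The sketch below assumes the trees example; the object name res is my own.

```r
# Store the test result instead of only printing it
res <- cor.test(trees$Girth, trees$Height, method = "pearson")

res$estimate     # Pearson correlation coefficient
res$statistic    # t-statistic
res$parameter    # degrees of freedom
res$p.value      # p-value
res$conf.int     # confidence interval for the correlation
res$alternative  # alternative hypothesis that was used
```

Each component corresponds to one of the lines in the printed output above.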

So, by looking at my example output, the Pearson correlation coefficient is 0.52.

The Pearson correlation coefficient is a value that ranges from -1 to 1. The key reference values are:

• -1 – a perfect negative linear association between the two variables
• 0 – no linear association between the two variables
• 1 – a perfect positive linear association between the two variables
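The coefficient itself is the covariance of the two variables divided by the product of their standard deviations. As a sketch, the calculation can be reproduced by hand and compared with the built-in cor function; the variable names below are my own.

```r
# The two variables from the trees example
x <- trees$Girth
y <- trees$Height

# Pearson coefficient computed from its definition
r_manual <- cov(x, y) / (sd(x) * sd(y))

# Pearson coefficient from the built-in function
r_builtin <- cor(x, y, method = "pearson")

# The two values should agree
all.equal(r_manual, r_builtin)
```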

Since the coefficient is positive, there is a positive correlation between the variables girth and height. In other words, trees with a larger girth tend to be taller.

You can also see that the p-value is 0.002758.

Since this p-value is below my alpha level (0.05), I reject the null hypothesis in favor of the alternative hypothesis. In other words, there is a statistically significant positive correlation between the girth and height of the cherry trees.
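For readers curious where the t-statistic and p-value come from, the sketch below rebuilds them from the correlation coefficient and the sample size, using the t distribution with n - 2 degrees of freedom. The variable names are my own, and the check against alpha simply restates the decision above.

```r
# Run the test and keep the result
res <- cor.test(trees$Girth, trees$Height, method = "pearson")

r  <- unname(res$estimate)  # correlation coefficient
n  <- nrow(trees)           # number of trees
df <- n - 2                 # degrees of freedom

# t-statistic derived from r and the degrees of freedom
t_manual <- r * sqrt(df) / sqrt(1 - r^2)

# Two-sided p-value from the t distribution
p_manual <- 2 * pt(-abs(t_manual), df)

# Both should match the values reported by cor.test
all.equal(t_manual, unname(res$statistic))
all.equal(p_manual, res$p.value)

# Decision at the chosen alpha level
alpha <- 0.05
res$p.value < alpha
```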

## Wrapping up

I have shown you how to perform a Pearson correlation test in R. This can easily be achieved with the cor.test function; no other packages are required.

R version used: 3.6.3
RStudio version used: 1.2.5033