How To Perform A Non-Parametric Partial Correlation In SPSS

Partial correlations are great in that you can perform a correlation between two continuous variables whilst controlling for various confounders. However, the partial correlation option in SPSS defaults to a Pearson’s partial correlation, which assumes normality of the two variables of interest.

But what if you want to perform a Spearman’s partial correlation on non-normally distributed data?

If you go to Analyze > Correlate > Partial … you will see that there is no option to select a Spearman correlation. There is, however, a way around this using a little coding.

In this guide, I will explain how to perform a non-parametric partial correlation in SPSS.

The required dataset

To be able to conduct a Spearman partial correlation in SPSS, you need a dataset, of course. For our example, we have the age, weight and gender of 20 volunteers. What we want to test is whether there is a correlation between age and weight after controlling for gender.

Creating the script

For this to work, you need to enter a small piece of script into the SPSS Syntax Editor. Open up the Syntax Editor by going to File > New > Syntax.

Next, copy and paste the following code:

NONPAR CORR
/MISSING = LISTWISE
/MATRIX OUT(*).
RECODE rowtype_ ('RHO'='CORR') .
PARTIAL CORR
/significance = twotail
/MISSING = LISTWISE
/MATRIX IN(*).

You now need to add the appropriate variables next to the NONPAR CORR and PARTIAL CORR sections.

So, next to the NONPAR CORR enter all of the variables that will be involved in the partial correlation. In our example, this would be Age, Weight and Gender.

For the PARTIAL CORR line you need to enter the two variables of interest in the correlation followed by a BY then the variables you want to control for. Make sure all of the variables you enter match the ones in your file correctly, otherwise the script will fail.

Here is what our example will look like:

NONPAR CORR Age Weight Gender
/MISSING = LISTWISE
/MATRIX OUT(*).
RECODE rowtype_ ('RHO'='CORR') .
PARTIAL CORR Age Weight BY Gender
/significance = twotail
/MISSING = LISTWISE
/MATRIX IN(*).

Running the script

The script itself is separated into 3 parts: NONPAR CORR, RECODE and PARTIAL CORR.

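The first performs a Spearman bivariate correlation between all of the variables and writes the Spearman rank correlation coefficients to a new matrix data file.

RECODE relabels the row type of those coefficients from Spearman (RHO) to Pearson (CORR). The coefficients themselves are unchanged; PARTIAL CORR simply expects rows of type CORR in its input matrix.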

Finally, PARTIAL CORR performs the partial correlation on the desired variables by using the newly created Spearman correlation coefficients from the NONPAR CORR script.
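
For readers who want to see what these three steps amount to numerically, here is a minimal sketch in Python (not SPSS syntax) that could be used as a rough cross-check outside SPSS. The data values and all function names are made up for illustration and are not the example dataset from this guide. It assumes a single control variable, in which case the partial correlation can be obtained from the three pairwise coefficients with the standard first-order formula; here that formula is simply applied to Spearman coefficients instead of Pearson ones.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # sorted positions i..j share one rank
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def spearman(x, y):
    """Spearman's rho: Pearson correlation of the (tie-averaged) ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def partial_from_pairwise(r_xy, r_xz, r_yz):
    """First-order partial correlation of x and y, controlling for z."""
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Hypothetical data for 8 complete cases (NOT the dataset used in this guide).
age = [23, 35, 31, 52, 46, 28, 60, 41]
weight = [61.0, 82.5, 58.0, 90.0, 67.5, 72.0, 85.0, 64.0]
gender = [0, 1, 0, 1, 0, 1, 1, 0]

# Step 1 (NONPAR CORR): pairwise Spearman coefficients.
rho_aw = spearman(age, weight)
rho_ag = spearman(age, gender)
rho_wg = spearman(weight, gender)

# Steps 2 and 3 (RECODE + PARTIAL CORR): treat them like Pearson
# coefficients and apply the partial correlation formula.
partial = partial_from_pairwise(rho_aw, rho_ag, rho_wg)
print("Spearman rho (age, weight):", round(rho_aw, 3))
print("Partial rho (age, weight | gender):", round(partial, 3))
```

With more than one control variable the same idea applies, but the calculation works on the whole correlation matrix rather than on three coefficients, which this sketch does not cover.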

Here is how to run the script:

1. To run the script, go to the Syntax Editor and with the NONPAR CORR section selected, hit the green play button.

This will give you an output for the Spearman’s rho between the variables, which you will find in the SPSS Output file.

You will also notice that a new SPSS data file has been created and is now open. This is usually named ‘Unnamed’, ‘Untitled’, or something similar. Within this file, you will see the Spearman’s rho values and n values for each correlation.

2. You next need to go back to the Syntax Editor window and run the RECODE part of the script. Make sure you select the new dataset as the active dataset for this, as you want to perform the RECODE on the new matrix data, not on your original data. You can toggle between datasets by clicking on the drop-down menu next to Active:. In this case, we select the Unnamed dataset.

Click the green play button again to run the RECODE script on this.

3. Finally, still in the Syntax window, select the PARTIAL CORR code and run this on the same Unnamed dataset. This will perform the final partial correlation.

The output

By looking in the output file, you should now see a Partial Corr box which contains the partial correlation coefficients and P values for the test.

You will see in this example that the non-parametric partial correlation for age with weight, after controlling for gender, has a coefficient of 0.383 and a P value (significance) of 0.105.

Therefore, there is no significant correlation between age and weight after accounting for gender.
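
As a rough check on these numbers, a partial correlation is conventionally tested with a t statistic on n - 2 - k degrees of freedom, where k is the number of control variables. The calculation below is a sketch that assumes all 20 volunteers have complete data (so n = 20) and uses the one control variable from the example (k = 1):

```latex
df = n - 2 - k = 20 - 2 - 1 = 17

t = r \sqrt{\frac{df}{1 - r^{2}}}
  = 0.383 \sqrt{\frac{17}{1 - 0.383^{2}}}
  \approx 1.71
```

A t value of about 1.71 on 17 degrees of freedom falls short of the two-tailed 5% critical value (roughly 2.11), which is in line with the P value of 0.105 in the output. Bear in mind that this test treats the Spearman coefficients as if they were Pearson coefficients, so the P value is best read as an approximation.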

43 COMMENTS

Emma March 25, 2019 At 3:19 pm

Hi Steven,

Thank you so much for this guide. One quick question: when I run it, my degrees of freedom are off, which impacts my p values. I think my degrees of freedom are being based on the variables in the correlation matrix, not the actual number of cases. For example, although I have a sample size of 100, my df when I run the syntax ends up being 26. Although I am controlling for 6 variables and am not sure exactly what the df should be, 26 doesn’t seem right to me. From your screenshots, it doesn’t seem like you had this problem. Do you have any thoughts as to why this would happen? Thanks so much again!

Emma

- Dr Steven Bradburn, PhD March 27, 2019 At 9:51 am
  
  Hi Emma,
  So sorry for the delay.
  So do you have 100 cases and all of these have matching data for the variables being controlled for, i.e. there are no missing data points?
  Thanks,
  Steven
  
Juan Cristobal Maass January 31, 2019 At 4:27 am

Dear Steven, I just wonder how to cite this method?
Cheers
Juan

- Dr Steven Bradburn, PhD February 1, 2019 At 8:07 am
  
  Hi Juan,
  I based this guide on the one produced by IBM. In that, they quote a reference:
  Conover, W.J. (1999). Practical Nonparametric Statistics (3rd ed.). New York: Wiley, pp. 327-328.
  This may be a good place to start?
  I hope that helps,
  Best wishes,
  Steven
  
Larry Lai September 24, 2018 At 8:51 am

Hi Steven

I’m using SAS instead.

Obtained from: https://en.wikipedia.org/wiki/Partial_regression_plot

1) Computing the residuals of regressing the response variable against the independent variables but omitting Xi
2) Computing the residuals from regressing Xi against the remaining independent variables
3) Plotting the residuals from (1) against the residuals from (2).

Example of SAS code: (I wish to acknowledge the contribution of Mr. Lin (Robbin@TMU), for his assistance in the longitudinal stats class)
—
SAS code:

proc import datafile="C:\Users\User\Desktop\working.xls" out=ddd replace dbms=excel;
run;
proc print;run;quit;

*1. Computing the residuals of regressing the response variable against the independent variables but omitting Xi;
proc reg data=ddd;
model var1=ctrl1 ctrl2;
output out=out1 residual=r1;
run;
proc print data=out1;run;quit;

*2.Computing the residuals from regressing Xi against the remaining independent variables;
proc reg data=ddd;
model var2=ctrl1 ctrl2;
output out=out2 residual=r2;
run;
proc print data=out2;run;quit;
*3.Plotting the residuals from (1) against the residuals from (2).;
data out1 ;set out1; _n+1;run;
data out2;set out2;_n+1;run;
data out3;merge out1 out2; by _n;run;
proc sgplot data=out3;
scatter x=r1 y=r2;
run;
—
Cheers
Larry

Larry Lai July 4, 2018 At 10:35 am

Hi Steven

This is the SPSS syntax for the non-parametric partial correlation, using the syntax example from the SPSS forum (https://developer.ibm.com/answers/questions/223269/plotting-a-partial-corr-using-pairwise-exclusion/).

The SPSS syntax is as follows:
—
* Encoding: UTF-8.
NONPAR CORR var1 var2 ctrlvar1 ctrlvar2
/MISSING = LISTWISE
/MATRIX OUT(*).
RECODE rowtype_ ('RHO'='CORR') .
PARTIAL CORR var1 var2 BY ctrlvar1 ctrlvar2
/significance = twotail
/MISSING = LISTWISE
/MATRIX IN(*).

REGRESSION
/MISSING LISTWISE
/STATISTICS COEFF OUTS R ANOVA
/CRITERIA=PIN(.05) POUT(.10)
/NOORIGIN
/DEPENDENT var1
/METHOD=ENTER ctrlvar1 ctrlvar2
/SAVE ZRESID.

REGRESSION
/MISSING LISTWISE
/STATISTICS COEFF OUTS R ANOVA
/CRITERIA=PIN(.05) POUT(.10)
/NOORIGIN
/DEPENDENT var2
/METHOD=ENTER ctrlvar1 ctrlvar2
/SAVE ZRESID.

GRAPH
/SCATTERPLOT(BIVAR)=ZRE_1 WITH ZRE_2
/MISSING=LISTWISE.
—
Please feel free to comment on this syntax. Much obliged.

Best wishes
Larry Lai

- Steven July 4, 2018 At 1:19 pm
  
  Hi Larry,
  Thanks for sharing. Did this work for you? The syntax looks like it is doing a regression, similar to how I described, and plotting the residuals this way.
  Best wishes,
  Steven
  
Larry Lai July 4, 2018 At 6:11 am

Hi Steven

Thanks for sharing. By the way, how do you plot a non-parametric partial correlation in SPSS?

Thanks for considering my request.

Cheers
Larry

- Steven July 4, 2018 At 8:26 am
  
  Hi Larry,
  Thanks for your comment!
  Plotting the results is something I have found quite difficult myself, and I am yet to find a conclusive answer.
  I have seen others plot the results via a regression:
  What you can do in SPSS is plot these through a linear regression. Go to: Analyze -> Regression -> Linear Regression. Put one of the variables of interest in the Dependent window and the other in the block below, along with any covariates you wish to control for. Then click the Plots button and tick the option for ‘Produce all partial plots’. Then run the test. One of the graphs produced will be the graph you are after. Hope that helps!
  Whether this is the correct way, however, I am not so sure – sorry.
  If you do find out, please come back and share!
  Best wishes,
  Steven
  
Fereshteh June 4, 2018 At 12:32 pm

Hi Steven,
Thank you for this useful guide!
I worked with this syntax but I get this warning:
“The MATRIX subcommand on the PARTIAL CORR command specifies an input file which does not contain a correlation matrix for the current splitfile group. Within cell matrices are not acceptable. A correlation matrix has a row type of “CORR”.”

Can you help me out?
Regards,
Fereshteh

Ian March 26, 2018 At 3:59 pm

Unfortunately, one cannot meaningfully apply the partial correlation formulas from parametric (usually Pearson’s) correlation to Spearman’s rank correlation. You can apply the formulas as you have above, but the formulas were not developed for Spearman’s and the answers you get back are not meaningful partial correlations as they are with Pearson’s, so the Spearman’s partial correlations are meaningless and cannot be interpreted. This might not stop people doing it, but their resulting conclusions are fatally flawed.

However, you can use Kendall’s Tau correlation for nonparametric correlation, and apply the same parametric partial correlation formula to get meaningful answers. Be aware though that Kendall’s Tau has a different meaning to Pearson’s r in explaining the correlation relationship. Unfortunately, there’s no easy way to apply significance testing to partial correlations based upon Kendall’s Tau since the underlying sample distribution is not defined (as it is for Pearson’s).

So if you want partial correlations for nonparametric data, use Kendall’s Tau rather than Spearman’s rho.

Meredith March 15, 2018 At 4:11 pm

Hello! Thanks for this wonderful guide. Do you have any suggestions for how to plot the results of the nonparametric partial correlation on a graph? I cannot figure this out or find anything online.

Leah March 3, 2018 At 11:22 pm

Super helpful! Can you please tell me how to flag significant correlations on the output?

Ingrid January 12, 2018 At 5:32 pm

Thank you so much for a very helpful post and also for the helpful comments and replies above.
Is it possible to enter more than two variables at the same time (before BY)? Or do I have to repeat it for every dependent variable I want to test?

- Steven January 15, 2018 At 9:49 am
  
  Hello Ingrid,
  Many thanks for your comments and kind words. I presume you can enter more than 2 variables before the ‘BY’. The results should then display a grid table so you can look at all your correlations within the same output.
  Let me know if there is an issue with this however.
  Thanks!
  Steven
  
  - Ingrid January 15, 2018 At 1:19 pm
    
    Thank you for the reply. It worked and gave a grid table as you said. If you want to test the correlation between one (independent) variable and all the other (dependent) variables, you can place that one first or last and write WITH in between on the PARTIAL CORR line.
    
    Example: PARTIAL CORR Age WITH Weight Pain BY Gender
    Then the output will show only Age as a horizontal column and Weight and Pain in the vertical columns.
    
    - Steven January 16, 2018 At 9:12 am
      
      Excellent, glad it worked for you. And thank you very much for the additional tip 🙂 greatly appreciated
      
stats December 21, 2017 At 10:32 am

Can I ask why the partial coefficient value is higher than the Spearman’s rho value? Shouldn’t it be lower?

- Steven January 15, 2018 At 10:00 am
  
  Hello,
  The correlations can increase or decrease depending upon the relationship your covariates have with the variables you are interested in. There is a discussion on this on ResearchGate which may be useful to see:
  https://www.researchgate.net/post/Any_advice_on_partial_correlation_interpretation
  Hope that helps!
  Thanks
  Steven
  
Terhi Tuokkola December 20, 2017 At 10:51 am

Hi Steven,

I am running your script, but I am getting the error below. The problem is that the new “unnamed” or “unknown” data sheet does not exist, or I can’t find it. What should I do?

Thanks,

Terhi

DATASET ACTIVATE DataSet1.
RECODE ROWTYPE_ ('RHO'='CORR').

Error # 4631 in column 8. Text: ROWTYPE_
On the RECODE command, the list of variables to be recoded includes the name
of a nonexistent variable.
Execution of this command stops.

- Steven January 15, 2018 At 10:03 am
  
  Hello Terhi,
  So sorry for the late reply. Did you manage to sort this? It seems like the new results are not being opened in a new datasheet. Have you ensured the ‘/MATRIX OUT(*).’ part of the code is included before you run the RECODE part of the code?
  Thanks
  Steven
  
Rose November 29, 2017 At 3:52 pm

Hi Steve,

Great post. Do you know how to compute 95% confidence intervals for Spearman’s partial correlations using the syntax? My reviewers are requesting confidence intervals for all point-estimates in accordance with APA.

Thanks,

Rose

- Steven November 30, 2017 At 11:27 am
  
  Hi Rose,
  
  Thanks for the comment. Unfortunately I do not know how to report 95% CI for this. Upon reading around this it seems quite a few people are asking the same thing. I have found a link to this website however, http://vassarstats.net/rho.html, which computes 95% CI from the r and n values. May be of use for you?
  
  Thanks,
  
  Steven
  
- Ian March 29, 2018 At 3:14 pm
  
  Hi Rose:
  
  This reply may have come a little late for you. As I posted below, there is no such thing as partial correlations for Spearman’s rho. Therefore, compute Kendall’s Tau, where you can calculate meaningful partial correlations. However, even for Kendall’s Tau, there is no defined sampling distribution, and so CIs cannot be calculated. You can move forward in a few ways: First, carefully review your data to be sure that Pearson’s r cannot be used. Pearson’s r is pretty robust, and unless your data are very skewed from normal, you might be able to proceed (don’t get distracted by the type of data you are collecting; you can apply Pearson’s r even to categorical data). If the first approach does not work, try a data transformation to make your data sufficiently normal to apply Pearson’s r (Spearman’s itself is a kind of rank data transformation). Finally, if you end up using Kendall’s Tau, you might be able to apply bootstrap methods to develop a sampling distribution to create CIs around the partial correlations. This is a last resort for most people, and I’ve rarely seen this done.
  
  Ian
  
  - Steven April 3, 2018 At 8:08 am
    
    Thanks for the advice Ian, really appreciate it!
    Best wishes,
    Steven
    
Marie Britt October 27, 2017 At 9:49 am

Hi, Steven
Thank you so much for providing this. I read the IBM instructions for syntax and was totally bamboozled; your tutorial and example were very easy to follow and immensely helpful!
FYI, I am using SPSS V24 on a Mac. When I ran the second part of the syntax [RECODE rowtype_ ('RHO'='CORR') .] I received a warning message. I removed the space between “rowtype_” and “('RHO'='CORR')” and re-ran without any further problems [i.e. RECODE rowtype_('RHO'='CORR') .].
regards
Marie

- Steven October 30, 2017 At 9:32 am
  
  Hi Marie,
  
  Thanks very much for the feedback, very much appreciated. Also, thanks for providing details for the Mac users. Unfortunately I am just on Windows at the minute so I cannot provide too much information on that system, but maybe in the future I can expand :).
  
  Best wishes,
  
  Steven
  
- Rose November 29, 2017 At 3:54 pm
  
  Hi,
  I’m on a Mac also and I found the warning message disappeared if I ensured I had clicked at the top of the syntax (so that the procedure was run from the right place and not halfway down the command).
  
  - Adrian January 6, 2019 At 2:49 pm
    
    Hi Rose,
    Thanks very much for this, I was having exactly the same problems as you (on a Mac) and found that clicking the top of the syntax sorted this. Your info sharing and advice has made a happy student!
    
Omar September 19, 2017 At 1:24 am

Hi Steven,
This is really helpful, but can you control for more than one variable? E.g., 3 variables (1 continuous and 2 categorical).

- Steven September 19, 2017 At 8:32 am
  
  Hi Omar,
  
  Thanks for the feedback. Yes, you can control for more than one variable. However, the more variables you are controlling for the less reliable the test may become because you may over-fit your analysis. If you have a large enough sample size then it should be okay. One rule, called the One in Ten rule (https://en.wikipedia.org/wiki/One_in_ten_rule), is suggested for regression analysis and could be kept in mind when doing a partial correlation. Briefly, for every control (or predictor) variable you use there must be at least 10 samples in the analysis.
  
  Hope that helps!
  
  - Omar September 19, 2017 At 3:12 pm
    
    Great! So, in this case I would need to do something like Age Weight BY Gender By SES BY Ethnicity, right?
    
    - Steven September 19, 2017 At 3:24 pm
      
      When more than one control variable is entered then only one ‘BY’ is required. So:
      
      Age Weight BY Gender SES Ethnicity
      
      This will control for ‘Gender’, ‘SES’, and ‘Ethnicity’.
      Hope that helps 🙂
      
Jess August 7, 2017 At 6:47 am

Also, are you sure partial correlations can be run while controlling for a categorical variable? I thought it was only possible to control for a continuous variable.

- Steven August 10, 2017 At 8:48 am
  
  As far as I am aware, you can control for dichotomous variables (e.g. gender). However, I am no stats expert!
  
Jess August 7, 2017 At 6:46 am

Hey Steven,

I am trying to run your syntax, but my output says:
“The input matrix file does not contain a ROWTYPE_ variable or the variable has been misspecified.”
Could you help me out?
Thank you!

- Steven August 10, 2017 At 8:44 am
  
  Hi Jess,
  
  Sorry for the late response. The error you are getting, when do you get this? Is this for the first (nonpar corr), second (recode) or third (partial corr) part of the script?
  
  Thanks!
  
  - Lauren October 14, 2017 At 3:58 am
    
    Hi Steven,
    
    I am also experiencing this error. I get it at the second [RECODE rowtype_ ('RHO'='CORR')] part of the script.
    
    The exact error message is as Jess stated, “The input matrix file does not contain a ROWTYPE_ variable or the variable has been misspecified.”
    
    Thanks in advance
    
    - Steven October 16, 2017 At 8:22 am
      
      Hi Lauren,
      
      I think this error is because you may be running the RECODE part of the script using the original datasheet. Have you changed the ‘Active’ sheet to the newly created ‘unnamed’ one before running the RECODE part? (See point 2 in the guide above).
      
      I am in the process of creating a screencast video that will hopefully help.
      
      Let me know if this works 🙂
      
      Thanks
      
Suzanne July 25, 2017 At 8:57 am

This was exactly what I needed, thank you so much! I agree with Rachael that it is really clearly described.

- Steven July 26, 2017 At 10:36 am
  
  Thank you very much Suzanne for the comment. I am glad it helped you out too 🙂
  
Rachael June 26, 2017 At 4:13 pm

This has been so helpful. Thank you. Really clear and easy to follow.

- Steven July 3, 2017 At 7:19 am
  
  Thanks Rachael, I really appreciate your comment. I am glad it helped you out 🙂
  