A question that I often come across for those who are calculating relative gene expression values in qPCR is, how to go about using this method if there is more than one reference (housekeeping) gene?

A video tutorial on qPCR data analysis with multiple reference genes can be found in our Mastering qPCR course

There are a few ways to work with multiple reference genes in this instance. One way is to select the single best gene from the numerous ones tested to be used as the reference. This can be done by using a variety of software which can determine the best reference gene to use, such as geNorm (available in Biogazelle’s qbase+ program) or Normfinder (a free Excel add-on).

Assuming the multiple reference genes in question work very well and are not affected by the experimental conditions, it is possible to use them all to determine the relative gene expression levels. This approach was described by Vandesompele and others in 2002 and Hellemans and colleagues in 2007, both published in Genome Biology, which I thoroughly recommend reading.

# The equation

The equation for using multiple reference genes to calculate the relative gene expression is displayed below.

The first thing I will say is: don’t panic! It is actually not as confusing as it looks. It is actually very similar to the Pfaffl equation, the only difference here being the geometric averaging of all the relative quantities (RQ), i.e. the (E_{REF})^{∆Ct REF} part, of the multiple reference genes used on the denominator (bottom) part of the equation.

The *E *in the equation refers to the base of exponential amplification. A value of 2, like in the delta-delta Ct method, indicates that after each PCR cycle, the amount of product will double. In other words, a value represents a 100% efficient reaction.

## How to use the equation

I will start with an example of a qPCR experiment, where I have Ct values for control and treated samples. I have performed qPCR using 2 reference genes (REF 1, REF 2) and my gene of interest (GOI). Each group has 3 biological repeats (1, 2 and 3). This could be a theoretical example of a cell culture experiment which has been repeated three times. Each qPCR was run in duplicate (technical repeats) and an average Ct value calculated, which are presented in the Ct column. The example data is presented below.

## 1. Calculate primer efficiencies

Like the Pfaffl method, the first thing that is required is to determine the primer efficiencies for your GOI and REF genes, in order to calculate the base of exponential amplification value. How to calculate primer efficiencies has been described in detail previously, so please refer to this post before continuing further.

Once you have the primer efficiencies, these will be in the format of a percentage, for example, 98%. However, this percentage is not entered directly into the equation, rather it needs to be converted.

A converted primer efficiency value of **2** indicates a 100% efficiency. This is the case when using the delta-delta Ct method. In other words, for every PCR cycle, the amount of DNA will multiply by **2**. On the other hand, an efficiency of 90% would give a base of exponential amplification value of **1.90** and an efficiency of 110% would give a value of **2.10**.

If you are still unsure, an easy way to convert the primer efficiency percentage is to divide the percentage by 100 and add 1.

For this example, I will pretend I have calculated the primer efficiency of my genes as follows:

- GOI =
**1.93**(93%) - REF 1 =
**2.01**(101%) - REF 2 =
**1.97**(97%)

## 2. Select a calibrator sample to determine delta Ct (∆Ct)

The next step is to decide which sample, or group of samples, to use as a calibrator when calculating the **∆Ct **values for all the samples. As mentioned previously, this is the part which confuses a lot of people.

A common way of doing this is to just match the experimental samples and determine the relative gene expression ratios separately. This is all well and true for experiments that have matched pairs, such as the case in cell culture experiments. However, this is difficult when the two experimental groups vary in n numbers and do not have matched pairs.

Another way is to select a sample with the highest or lowest GOI Ct value, reflecting the sample with the lowest or highest relative gene expression value respectively. This way, all the results will be relative to this sample.

I personally average the Ct values of the Control group biological replicates to create a **‘Control average Ct’**. By doing so would mean that the results are presented relative to the control average Ct values.

Whichever sample, or group of samples, you use as your calibrator is fine so long as this is consistent throughout the analyses and is reported in the results so it is clear. Remember, the results produced at the end are **relative** gene expression values.

With this in mind, we next need to average the Control group Ct values for each gene.

So, for REF 1 this will be the average of **17.18**, **16.96** and **17.11**, which works out as **17.08**. Repeating this for the REF 2 and GOI will give the following results.

## 3. Calculate delta Ct (∆Ct) values

Next, we need to calculate **∆Ct **for all the samples within the different genes. The equation for **∆Ct** can be found below.

To do this, simply subtract the sample Ct values from the calibrator Ct (in this example this will be the ‘**Control average Ct**‘ value).

So, to calculate the **∆Ct** for the sample ‘**Treated 1**‘ for REF 2, you need to do **20.89 – 21.10**, which equals **-0.21**. By repeating this for all the samples, for both genes, we get the results below.

## 4. Calculate relative quantity (RQ) values

The next step is to create RQ values for each sample, separately for each gene. The equation for calculating RQ is displayed below.

Where E in the equation refers to the base of exponential amplification (i.e. the efficiency of the reaction). Remember, these were calculated for each primer pair used in Step 1 above.

To show you one example, I will calculate the RQ for the Control 1 sample. For the REF 1 gene, I calculated the base of exponential amplification to be **2.01** (i.e. 101% efficiency). So the RQ in this sample will be **2.01 ^{-0.10}** which comes to 0.99. For the REF 2 gene, I calculated the base of exponential amplification to be

**1.97**(i.e. 97% efficiency). So the RQ in this sample will be

**1.97**which comes to

^{0.24}**1.18**. And for the GOI, I calculated the base of exponential amplification to be

**1.93**(i.e. 93% efficiency). So the RQ in this sample will be

**1.93**which comes to

^{0.08}**1.05**. I have extended the results to repeat this analysis for all of the samples.

## 5. Calculate the geometric mean of the reference genes RQ values

This next step is the part which takes into account multiple reference genes. Specifically, the geometric mean of the reference gene RQ values must be created for each sample used. To do this in Excel, use the ‘**=GEOMEAN**‘ function.

For example, for the ‘**Control 1**‘ sample, this will be the geometric mean of **0.94** and **1.18**. In Excel, the formula will be ‘**=GEOMEAN(0.94,1.18)**‘. If more reference genes were used in the experiment, then these RQ values can also be added on here too. The geometric mean of the aforementioned calculation gives **1.05**.

I have calculated the geometric means of the two reference genes in the example (‘**REF 1**‘ and ‘**REF 2**‘) for all the samples below.

## 6. Calculate relative gene expression values

Finally, we now have all of the components to be able to calculate relative gene expression values. To calculate the relative gene expression values, simply divide the RQ of the GOI by the geometric mean of the RQ values for the reference genes (i.e. that created in the previous step).

You will notice that this equation is the same one at the start of this article – just a simplified way of writing it.

Taking ‘**Treated sample 1**‘ as an example, the relative gene expression value will be **23.56** divided by **0.68**, which gives **34.81**.

May there be a mistake with the primer efficiency calculation?

if 100% efficiency for a pair of primers means the template would be increased by a factor of 2 than if you have 110% efficiency the amount of template would be increased by a factor of 2.2 and not 2.1.

Hi Uri,

Many thanks for your comment.

I understand your confusion. The amplification factor (E) of 2 represents a primer efficiency of 100%. To calculate the amplification factor, the equation of: 10^(-1/slope) is used. Where the slope is the slope of the line following the serial dilutions of a qPCR series. A slope of -3.1 gives an amplification factor of 2.1 and a primer efficiency of 110%.

I have just created a qPCR primer efficiency online calculator which does this for you. All you have to do is to enter the slope value.

I hope that helps!

Steven

Hi Steven,

How would this work if each group had 1 set of biological samples?

Hi Rob,

Many thanks for your message.

So you have experimental groups with just 1 biologcal sample in each group? I would not recommend this. More biological samples are required, especially for statistical analyses.

However, even if there was 1 sample in each group, the same process would apply. The calibrator sample can be the sole control sample. Then everything else is compared to that.

Hope that makes sense.

Steven

Many Thanks for this topic !!!

Before this, it was very difficult to find something complete, precise and clear on the qPCR analysis with 2 reference genes and the integration of efficacity.

Very good job, thx a lot

Hi Cecile,

Many thanks for your comment, I really appreciate it!

All the best with your research.

Steven

Hello !

Thanks a lot for the clear explanation !

I do have a few questions, though…

I noticed that if I use the same set of data but use them for:

* the usual 2^-∆∆Ct (giving me “fold expression values”) (can I replace 2 by a calculated E value by the way?) OR

* the Pfaffl method OR multiple HKG method (giving me “gene expression ratios” or “relative gene expression”),

I end up with completely different looking graphs !

=> is that normal ?

=> if the answer is yes, is there any way to compare/transform “fold expresion” and “gene expression” values ?

Thanks a lot in advance

Hi Katy

Many thanks for your comments. Below is the answers I can think of for your questions:

1) Yes, you can replace the 2 in the Delta Delta Ct method with a different primer efficiency. However, this will again assume that both the gene of interest and reference (housekeeping) gene have the same amplification factor. If they are different, then this is why the Pfaffl method is preferred.

2) The equations are very similar, however, there can be differences in terms of the output between then. Obviously, since the Pfaffl equation and Vandesompele methods require individual primer efficiencies for both the genes of interest and reference genes then this can affect the outcome.

My question to ask is how different are your primer efficiencies?

Thanks,

Steven

Hello !

Thanks for your quick answer ! Your help is very much appreciated !

I read that people usually pool cDNA together to get a E value but in our case, we tried to keep the samples separated and did serial dilution with each cDNA. What we did with the Delta Delta Ct method was that we averaged the E values for both the CTL and the TEST samples and used this AvgE instead of 2. As for the HKG and GOI E values, well, they are quite similar BUT…

We have worked with different couples of primers in two different cDNA tubes (prepared the same day but based on different bacterial growth conditions) and we consistently have:

1) low efficency values (around 75%)

2) different efficency values (can be +-20%) for the same couples of primers

Why would efficiency values be different for the same couple of primers on different cDNA ???

Could that come from the RT step ? It does not always goes in the same direction, though (in the same experiment, sometimes the E values is higher with cDNA 1, and sometimes it is with cDNA2).

I am really confused…

Hi Katy,

I have replied to these questions on your e-mail.

Thanks,

Steven

Hi Steven,

I have found this page very useful – thanks for the great info !

I wonder though if you could help me…

I am currently writing up a report for some work I did in the lab during a short placement…I ran multiple qPCRs without first calculating my primer efficiency (as I didn’t realise at the time this is something I had to do).

I used both GAPDH and BETA ACTIN as my housekeeping genes (HK) but there is some small variation both between each of the HK genes and between samples. I cannot get back to the lab to determine my primer efficiencies therefore I cannot calculate the geometric means of these reference genes.

Therefore, when calculating ΔCTs, would it be best for me to take an average of the housekeeping genes from ALL the different samples on my plate and subtract this from each samples average Ct to get my ΔΔCT (so the same average HK value is subtracted from the average Ct for each different sample) OR would it be better to use the average HK genes for each sample (even though there is some variation)

Sorry if I’ve made this more confusing than necessary !

Erin

Hi Erin,

Many thanks for your comment and sorry for the delay.

If you do not have primer efficiencies then you could use the Vandesompele method and presume all of your experimental genes have an efficiency of 100% (as in the Delta Delta Ct method), by using the amplifcation efficiency of 2. This will then enable you to calculate the geometric mean for the housekeeping genes.

Obviously, this is not ideal since you do not know if your assays are the same (they are probably not), but this will account for multiple housekeeping genes.

I hope that helps.

Best wishes,

Steven