A question I often come across from those calculating relative gene expression values in qPCR is how to go about it when there is more than one reference (housekeeping) gene.

There are a few ways to work with multiple reference genes in this instance. One way is to select the single best gene from those tested and use it as the sole reference. Software such as geNorm (available in Biogazelle’s qbase+ program) or NormFinder (a free Excel add-in) can help determine which reference gene is best to use.

Assuming the multiple reference genes in question work very well and are not affected by the experimental conditions, it is possible to use them all to determine the relative gene expression levels. This approach was described by Vandesompele and others in 2002 and Hellemans and colleagues in 2007, both published in Genome Biology, which I thoroughly recommend reading.


# The equation

The equation for using multiple reference genes to calculate the relative gene expression is displayed below.
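Written out from the description in this article, the equation takes roughly the following form. The notation is a reconstruction: GeoMean denotes the geometric mean taken across all of the reference genes used, and E and ∆Ct are defined in the steps further down.

$$
\text{Relative gene expression} = \frac{(E_{GOI})^{\Delta Ct_{GOI}}}{\operatorname{GeoMean}\left[(E_{REF})^{\Delta Ct_{REF}}\right]}
$$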

The first thing I will say is: don’t panic! It is not as confusing as it looks. In fact, it is very similar to the Pfaffl equation. The only difference is the geometric averaging of the relative quantities (RQ), i.e. the (E_{REF})^{∆Ct REF} part, of the multiple reference genes in the denominator (bottom) of the equation.

The *E* in the equation refers to the base of exponential amplification. A value of 2, as in the delta-delta Ct method, indicates that after each PCR cycle the amount of product will double. In other words, a value of 2 represents a 100% efficient reaction.

## How to use the equation

I will start with an example of a qPCR experiment, where I have Ct values for control and treated samples. I have performed qPCR using 2 reference genes (REF 1, REF 2) and my gene of interest (GOI). Each group has 3 biological repeats (1, 2 and 3). This could be a theoretical example of a cell culture experiment which has been repeated three times. Each qPCR was run in duplicate (technical repeats) and the average Ct value calculated; these averages are presented in the Ct column. The example data is presented below.

## 1. Calculate primer efficiencies

Like the Pfaffl method, the first thing that is required is to determine the primer efficiencies for your GOI and REF genes, in order to calculate the base of exponential amplification value. How to calculate primer efficiencies has been described in detail previously, so please refer to this post before continuing further.

Once you have the primer efficiencies, these will be in the format of a percentage, for example, 98%. However, this percentage is not entered directly into the equation, rather it needs to be converted.

A converted primer efficiency value of **2** indicates a 100% efficiency. This is the case when using the delta-delta Ct method. In other words, for every PCR cycle, the amount of DNA will multiply by **2**. On the other hand, an efficiency of 90% would give a base of exponential amplification value of **1.90** and an efficiency of 110% would give a value of **2.10**.

If you are still unsure, an easy way to convert the primer efficiency percentage is to divide the percentage by 100 and add 1.

For this example, I will pretend I have calculated the primer efficiency of my genes as follows:

- GOI = **1.93** (93%)
- REF 1 = **2.01** (101%)
- REF 2 = **1.97** (97%)
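As a minimal sketch of the conversion described above (divide the percentage by 100 and add 1), the following Python snippet reproduces the three example values. The function name `efficiency_to_base` is just an illustrative choice.

```python
def efficiency_to_base(efficiency_percent: float) -> float:
    """Convert a primer efficiency (%) into the base of exponential amplification (E)."""
    return efficiency_percent / 100 + 1


# Primer efficiencies assumed in this worked example (percent).
efficiencies = {"GOI": 93, "REF 1": 101, "REF 2": 97}

# Round to 2 decimal places, matching how the values are quoted in the text.
bases = {gene: round(efficiency_to_base(pct), 2) for gene, pct in efficiencies.items()}
print(bases)  # {'GOI': 1.93, 'REF 1': 2.01, 'REF 2': 1.97}
```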

## 2. Select a calibrator sample to determine delta Ct (∆Ct)

The next step is to decide which sample, or group of samples, to use as a calibrator when calculating the **∆Ct** values for all the samples. As mentioned previously, this is the part which confuses a lot of people.

A common way of doing this is to just match the experimental samples and determine the relative gene expression ratios separately for each pair. This is all well and good for experiments that have matched pairs, as is the case in cell culture experiments. However, it is difficult when the two experimental groups differ in sample size (n) and do not have matched pairs.

Another way is to select a sample with the highest or lowest GOI Ct value, reflecting the sample with the lowest or highest relative gene expression value respectively. This way, all the results will be relative to this sample.

I personally average the Ct values of the Control group biological replicates to create a **‘Control average Ct’**. Doing so means that the results are presented relative to the control average Ct values.

Whichever sample, or group of samples, you use as your calibrator is fine so long as this is consistent throughout the analyses and is reported in the results so it is clear. Remember, the results produced at the end are **relative** gene expression values.

With this in mind, we next need to average the Control group Ct values for each gene.

So, for REF 1 this will be the average of **17.18**, **16.96** and **17.11**, which works out as **17.08**. Repeating this for the REF 2 and GOI will give the following results.
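The averaging step can be sketched in Python as follows, using the three REF 1 control Ct values quoted above. The variable names are illustrative only.

```python
from statistics import mean

# REF 1 Ct values for the three Control biological replicates (from the example).
control_ref1_ct = [17.18, 16.96, 17.11]

# The calibrator for REF 1 is the arithmetic mean of the control Ct values.
control_average_ct = mean(control_ref1_ct)
print(round(control_average_ct, 2))  # 17.08
```

The same call would be repeated for the REF 2 and GOI control Ct values to obtain one calibrator per gene.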

## 3. Calculate delta Ct (∆Ct) values

Next, we need to calculate **∆Ct** for all the samples within the different genes. The equation for **∆Ct** can be found below.
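Based on the description in this step, the ∆Ct equation can be written along these lines (the subscript labels are my own shorthand):

$$
\Delta Ct = Ct_{\text{calibrator}} - Ct_{\text{sample}}
$$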

To do this, simply subtract the sample Ct value from the calibrator Ct (in this example, the ‘**Control average Ct**’ value).

So, to calculate the **∆Ct** for the sample ‘**Treated 1**’ for REF 2, you need to do **20.89 – 21.10**, which equals **-0.21**. By repeating this for all the samples, for every gene, we get the results below.
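A small Python sketch of this subtraction, using the REF 2 values for ‘Treated 1’ quoted above, might look like this. The function name `delta_ct` is an assumption made for illustration.

```python
def delta_ct(calibrator_ct: float, sample_ct: float) -> float:
    """∆Ct as used in this article: calibrator Ct minus sample Ct."""
    return calibrator_ct - sample_ct


# REF 2: Control average Ct = 20.89, Treated 1 Ct = 21.10.
ref2_treated1 = delta_ct(20.89, 21.10)
print(round(ref2_treated1, 2))  # -0.21
```

Note the order of subtraction: a sample with a higher Ct than the calibrator gives a negative ∆Ct.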

## 4. Calculate relative quantity (RQ) values

The next step is to create RQ values for each sample, separately for each gene. The equation for calculating RQ is displayed below.
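From the worked values in this step, the RQ equation can be written as:

$$
RQ = E^{\Delta Ct}
$$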

Where E in the equation refers to the base of exponential amplification (i.e. the efficiency of the reaction). Remember, these were calculated for each primer pair used in Step 1 above.

To show you one example, I will calculate the RQ for the Control 1 sample. For the REF 1 gene, I calculated the base of exponential amplification to be **2.01** (i.e. 101% efficiency). So the RQ in this sample will be **2.01^{-0.10}**, which comes to **0.94**. For the REF 2 gene, I calculated the base of exponential amplification to be **1.97** (i.e. 97% efficiency). So the RQ in this sample will be **1.97^{0.24}**, which comes to **1.18**. And for the GOI, I calculated the base of exponential amplification to be **1.93** (i.e. 93% efficiency). So the RQ in this sample will be **1.93^{0.08}**, which comes to **1.05**. I have extended the results to repeat this analysis for all of the samples.
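The RQ step is a single exponentiation, which can be sketched in Python as below. The function name is hypothetical, and the inputs are the rounded E and ∆Ct values quoted for Control 1, so results computed from unrounded spreadsheet values may differ slightly in the last decimal place.

```python
def relative_quantity(base: float, delta_ct: float) -> float:
    """RQ = E raised to the power of ∆Ct."""
    return base ** delta_ct


# Control 1, using the rounded values quoted in the text.
rq_ref2 = relative_quantity(1.97, 0.24)
rq_goi = relative_quantity(1.93, 0.08)

print(round(rq_ref2, 2))  # 1.18
print(round(rq_goi, 2))   # 1.05
```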

## 5. Calculate the geometric mean of the reference genes RQ values

This next step is the part which takes into account multiple reference genes. Specifically, the geometric mean of the reference gene RQ values must be calculated for each sample. To do this in Excel, use the ‘**=GEOMEAN**’ function.

For example, for the ‘**Control 1**’ sample, this will be the geometric mean of **0.94** and **1.18**. In Excel, the formula will be ‘**=GEOMEAN(0.94,1.18)**’. If more reference genes were used in the experiment, their RQ values can be added to the formula too. The geometric mean of the aforementioned calculation gives **1.05**.

I have calculated the geometric means of the two reference genes in the example (‘**REF 1**’ and ‘**REF 2**’) for all the samples below.
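For readers working outside Excel, a rough Python equivalent of ‘=GEOMEAN’ is `statistics.geometric_mean` from the standard library (available from Python 3.8). This sketch uses the two Control 1 reference gene RQ values quoted above.

```python
from statistics import geometric_mean

# RQ values of REF 1 and REF 2 for the Control 1 sample.
ref_rq_control1 = [0.94, 1.18]

# Geometric mean: the n-th root of the product of n values.
geo_mean_control1 = geometric_mean(ref_rq_control1)
print(round(geo_mean_control1, 2))  # 1.05
```

Additional reference genes would simply be extra entries in the list.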

## 6. Calculate relative gene expression values

Finally, we now have all of the components to be able to calculate relative gene expression values. To calculate the relative gene expression values, simply divide the RQ of the GOI by the geometric mean of the RQ values for the reference genes (i.e. that created in the previous step).
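In the notation of the previous steps, this division can be written roughly as follows (GeoMean is my shorthand for the geometric mean across the reference genes):

$$
\text{Relative gene expression} = \frac{RQ_{GOI}}{\operatorname{GeoMean}\left[RQ_{REF}\right]}
$$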

You will notice that this equation is the same as the one at the start of this article, just written in a simplified way.

Taking ‘**Treated 1**’ as an example, the relative gene expression value will be **23.56** divided by **0.68**, which gives **34.81**.
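The final step can be sketched in Python as below. The function name `relative_expression` is hypothetical, and the inputs are the rounded values quoted in this article.

```python
from statistics import geometric_mean


def relative_expression(rq_goi: float, rq_refs: list[float]) -> float:
    """Divide the GOI RQ by the geometric mean of the reference gene RQs."""
    return rq_goi / geometric_mean(rq_refs)


# Control 1: GOI RQ = 1.05, reference RQs = 0.94 (REF 1) and 1.18 (REF 2).
control1 = relative_expression(1.05, [0.94, 1.18])
print(round(control1, 2))  # 1.0

# Treated 1: only the GOI RQ and the geometric mean are quoted in the text,
# so the division is done directly on those two rounded numbers.
treated1 = 23.56 / 0.68
print(round(treated1, 2))  # 34.65
```

With these rounded inputs, Treated 1 comes out at about 34.65 rather than the 34.81 quoted above. My assumption is that the quoted figure was computed from unrounded RQ values, so small differences of this kind are to be expected when working from rounded numbers.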

Might there be a mistake with the primer efficiency calculation?

If 100% efficiency for a pair of primers means the template is increased by a factor of 2, then with 110% efficiency the amount of template would be increased by a factor of 2.2 and not 2.1.

Hi Uri,

Many thanks for your comment.

I understand your confusion. The amplification factor (E) of 2 represents a primer efficiency of 100%. To calculate the amplification factor, the equation 10^(-1/slope) is used, where the slope is that of the line fitted through the serial dilutions of a qPCR series. A slope of -3.1 gives an amplification factor of 2.1 and a primer efficiency of 110%.

I have just created a qPCR primer efficiency online calculator which does this for you. All you have to do is to enter the slope value.

I hope that helps!

Steven

Hi Steven,

How would this work if each group had 1 set of biological samples?

Hi Rob,

Many thanks for your message.

So you have experimental groups with just 1 biological sample in each group? I would not recommend this. More biological samples are required, especially for statistical analyses.

However, even if there was 1 sample in each group, the same process would apply. The calibrator sample can be the sole control sample. Then everything else is compared to that.

Hope that makes sense.

Steven

Many thanks for this topic!

Before this, it was very difficult to find something complete, precise and clear on qPCR analysis with 2 reference genes and the integration of efficiency.

Very good job, thanks a lot.

Hi Cecile,

Many thanks for your comment, I really appreciate it!

All the best with your research.

Steven