A question I often come across from those calculating relative gene expression values in qPCR is how to do the calculation when there is more than one reference (housekeeping) gene.
A video tutorial on qPCR data analysis with multiple reference genes can be found in our Mastering qPCR course.
There are a few ways to work with multiple reference genes. One is to select the single best gene from those tested and use it alone as the reference. Software such as geNorm (available in Biogazelle’s qbase+ program) or NormFinder (a free Excel add-in) can help determine the best reference gene to use.
Assuming the multiple reference genes in question work very well and are not affected by the experimental conditions, it is possible to use them all to determine the relative gene expression levels. This approach was described by Vandesompele and others in 2002 and Hellemans and colleagues in 2007, both published in Genome Biology, which I thoroughly recommend reading.
The equation for using multiple reference genes to calculate the relative gene expression is displayed below.
The first thing I will say is: don’t panic! It is not as confusing as it looks. It is very similar to the Pfaffl equation; the only difference is that the denominator (bottom) of the equation is the geometric mean of the relative quantities (RQ), i.e. the (E_REF)^(∆Ct REF) terms, of all the reference genes used.
The E in the equation refers to the base of exponential amplification. A value of 2, as in the delta-delta Ct method, indicates that after each PCR cycle the amount of product will double. In other words, a value of 2 represents a 100% efficient reaction.
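Written out from the description above, the equation looks like this. This is a reconstruction rather than a copy of the original figure: GeoMean stands for the geometric mean taken across all reference genes, and ∆Ct is the calibrator Ct minus the sample Ct.

```latex
\text{Relative gene expression} =
\frac{\left(E_{\text{GOI}}\right)^{\Delta Ct_{\text{GOI}}}}
     {\operatorname{GeoMean}\!\left[\left(E_{\text{REF}}\right)^{\Delta Ct_{\text{REF}}}\right]}
```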
How to use the equation
I will start with an example of a qPCR experiment, where I have Ct values for control and treated samples. I have performed qPCR using 2 reference genes (REF 1, REF 2) and my gene of interest (GOI). Each group has 3 biological repeats (1, 2 and 3). This could be a theoretical example of a cell culture experiment which has been repeated three times. Each qPCR was run in duplicate (technical repeats) and the average Ct value was calculated; these averages are presented in the Ct column. The example data is presented below.
1. Calculate primer efficiencies
Like the Pfaffl method, the first thing that is required is to determine the primer efficiencies for your GOI and REF genes, in order to calculate the base of exponential amplification value. How to calculate primer efficiencies has been described in detail previously, so please refer to this post before continuing further.
Primer efficiencies are usually expressed as a percentage, for example, 98%. However, this percentage is not entered directly into the equation; it first needs to be converted into the base of exponential amplification.
A converted primer efficiency value of 2 indicates 100% efficiency, which is the value assumed by the delta-delta Ct method. In other words, for every PCR cycle, the amount of DNA will multiply by 2. On the other hand, an efficiency of 90% would give a base of exponential amplification value of 1.90 and an efficiency of 110% would give a value of 2.10.
If you are still unsure, an easy way to convert the primer efficiency percentage is to divide the percentage by 100 and add 1.
For this example, I will pretend I have calculated the primer efficiency of my genes as follows:
- GOI = 1.93 (93%)
- REF 1 = 2.01 (101%)
- REF 2 = 1.97 (97%)
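The conversion can be sketched in a few lines of Python. The function name `efficiency_to_base` is a hypothetical choice for illustration; the percentages are the example values listed above.

```python
def efficiency_to_base(efficiency_percent):
    """Convert a primer efficiency (%) into the base of exponential amplification (E)."""
    return efficiency_percent / 100 + 1

# Example primer efficiencies (%) from the list above
efficiencies = {"GOI": 93, "REF 1": 101, "REF 2": 97}
bases = {gene: efficiency_to_base(pct) for gene, pct in efficiencies.items()}

for gene, base in bases.items():
    print(f"{gene}: E = {base:.2f}")
```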
2. Select a calibrator sample to determine delta Ct (∆Ct)
The next step is to decide which sample, or group of samples, to use as a calibrator when calculating the ∆Ct values for all the samples. As mentioned previously, this is the part which confuses a lot of people.
A common way of doing this is to pair each experimental sample with its matched control and determine the relative gene expression ratio for each pair separately. This is all well and good for experiments that have matched pairs, as is often the case in cell culture experiments. However, it is difficult when the two experimental groups differ in sample size (n) and do not have matched pairs.
Another way is to select a sample with the highest or lowest GOI Ct value, reflecting the sample with the lowest or highest relative gene expression value respectively. This way, all the results will be relative to this sample.
I personally average the Ct values of the Control group biological replicates to create a ‘Control average Ct’. This means that the results are presented relative to the control average Ct values.
Whichever sample, or group of samples, you use as your calibrator is fine so long as this is consistent throughout the analyses and is reported in the results so it is clear. Remember, the results produced at the end are relative gene expression values.
With this in mind, we next need to average the Control group Ct values for each gene.
So, for REF 1 this will be the average of 17.18, 16.96 and 17.11, which works out as 17.08. Repeating this for the REF 2 and GOI will give the following results.
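As a quick check of the arithmetic, here is a minimal Python sketch of the averaging step for REF 1. The variable names are my own; only the three REF 1 Ct values quoted above are used.

```python
from statistics import mean

# REF 1 Ct values for the three Control biological repeats (from the example)
control_ref1_ct = [17.18, 16.96, 17.11]

# The 'Control average Ct' used as the calibrator for REF 1
control_average_ct = mean(control_ref1_ct)
print(f"Control average Ct (REF 1): {control_average_ct:.2f}")
```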
3. Calculate delta Ct (∆Ct) values
Next, we need to calculate ∆Ct for all the samples within the different genes. The equation for ∆Ct can be found below.
To do this, simply subtract the sample Ct values from the calibrator Ct (in this example this will be the ‘Control average Ct’ value).
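Written out as a formula (a reconstruction from the sentence above, with the subscripts as my own labels), this is:

```latex
\Delta Ct = Ct_{\text{calibrator}} - Ct_{\text{sample}}
```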
So, to calculate the ∆Ct for the sample ‘Treated 1’ for REF 2, you need to do 20.89 – 21.10, which equals -0.21. By repeating this for all the samples, for each gene, we get the results below.
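The same subtraction as a small Python sketch. The function name `delta_ct` is hypothetical; the two Ct values are the REF 2 figures from the worked example.

```python
def delta_ct(calibrator_ct, sample_ct):
    """Delta Ct: calibrator Ct minus sample Ct."""
    return calibrator_ct - sample_ct

# REF 2: 'Control average Ct' (calibrator) and the 'Treated 1' sample Ct
treated_1_ref2 = delta_ct(20.89, 21.10)
print(f"Delta Ct (Treated 1, REF 2): {treated_1_ref2:.2f}")
```

A negative ∆Ct means the sample crossed the threshold later than the calibrator.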
4. Calculate relative quantity (RQ) values
The next step is to create RQ values for each sample, separately for each gene. The equation for calculating RQ is displayed below.
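Reconstructed from the worked example that follows, where E is the base of exponential amplification for the gene and ∆Ct is the value from the previous step, the RQ formula is:

```latex
RQ = E^{\Delta Ct}
```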
To show you one example, I will calculate the RQ for the Control 1 sample. For the REF 1 gene, I calculated the base of exponential amplification to be 2.01 (i.e. 101% efficiency), so the RQ in this sample will be 2.01^(-0.10), which comes to about 0.94. For the REF 2 gene, I calculated the base of exponential amplification to be 1.97 (i.e. 97% efficiency), so the RQ in this sample will be 1.97^(0.24), which comes to 1.18. And for the GOI, I calculated the base of exponential amplification to be 1.93 (i.e. 93% efficiency), so the RQ in this sample will be 1.93^(0.08), which comes to 1.05. Small differences in the last decimal place can arise from rounding of the displayed ∆Ct values. I have repeated this analysis for all of the samples.
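Here is a minimal Python sketch of the RQ step for the Control 1 sample. The function and variable names are my own, and the ∆Ct values are the rounded figures quoted above, so the last decimal place may differ slightly from a spreadsheet that keeps the unrounded values.

```python
def relative_quantity(base, delta_ct):
    """RQ: base of exponential amplification raised to the power of delta Ct."""
    return base ** delta_ct

# Control 1: (base of exponential amplification, delta Ct) for each gene
control_1 = {
    "REF 1": (2.01, -0.10),
    "REF 2": (1.97, 0.24),
    "GOI": (1.93, 0.08),
}
rq = {gene: relative_quantity(e, dct) for gene, (e, dct) in control_1.items()}

for gene, value in rq.items():
    print(f"{gene}: RQ = {value:.2f}")
```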
5. Calculate the geometric mean of the reference genes RQ values
This next step is the part which takes into account multiple reference genes. Specifically, the geometric mean of the reference gene RQ values must be calculated for each sample. To do this in Excel, use the ‘=GEOMEAN’ function.
For example, for the ‘Control 1’ sample, this will be the geometric mean of 0.94 and 1.18. In Excel, the formula will be ‘=GEOMEAN(0.94,1.18)’, which gives 1.05. If more reference genes were used in the experiment, their RQ values can be added to the formula as well.
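Outside of Excel, the geometric mean is easy to compute by hand. The sketch below is one way to do it in Python; the helper name `geometric_mean` is my own, and Python 3.8+ also ships `statistics.geometric_mean`, which does the same job.

```python
import math

def geometric_mean(values):
    """Geometric mean of positive values: the n-th root of their product."""
    return math.exp(sum(math.log(v) for v in values) / len(values))

# RQ values of REF 1 and REF 2 for the 'Control 1' sample
control_1_ref_rq = [0.94, 1.18]
control_1_geomean = geometric_mean(control_1_ref_rq)
print(f"Geometric mean (Control 1): {control_1_geomean:.2f}")  # 1.05
```

Working in logs avoids overflow when many values are multiplied together, although with a handful of reference genes either approach works.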
I have calculated the geometric means of the two reference genes in the example (‘REF 1’ and ‘REF 2’) for all the samples below.
6. Calculate relative gene expression values
Finally, we now have all of the components to be able to calculate relative gene expression values. To calculate the relative gene expression values, simply divide the RQ of the GOI by the geometric mean of the RQ values for the reference genes (i.e. that created in the previous step).
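Taking ‘Treated 1’ as an example, the relative gene expression value will be 23.56 divided by 0.68, which gives 34.81. (Dividing the rounded figures shown here gives about 34.65; the small difference presumably comes from rounding of the displayed values.)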
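To tie the steps together, here is a hedged end-to-end sketch in Python. The function name `relative_expression` and its argument layout are hypothetical, and the inputs are the rounded Control 1 values quoted earlier (bases of 1.93, 2.01 and 1.97 with ∆Ct values of 0.08, -0.10 and 0.24), so treat the output as approximate.

```python
import math

def relative_expression(goi_base, goi_delta_ct, references):
    """Relative expression: RQ of the GOI divided by the geometric mean
    of the reference gene RQs.

    references: list of (base, delta_ct) pairs, one per reference gene.
    """
    rq_goi = goi_base ** goi_delta_ct
    rq_refs = [base ** dct for base, dct in references]
    # Geometric mean of the reference RQs, computed via logs
    geomean_refs = math.exp(sum(math.log(rq) for rq in rq_refs) / len(rq_refs))
    return rq_goi / geomean_refs

# Control 1: the GOI plus two reference genes (REF 1, REF 2)
control_1_expression = relative_expression(
    goi_base=1.93,
    goi_delta_ct=0.08,
    references=[(2.01, -0.10), (1.97, 0.24)],
)
print(f"Relative expression (Control 1): {control_1_expression:.2f}")
```

Because Control 1 is one of the samples that make up the calibrator, its value comes out close to 1. Adding a third reference gene only means appending another pair to `references`.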