Description Usage Arguments Details Value Note Author(s) References See Also Examples
This function averages the expression of several reference genes before calculation of gene expression ratios by ratiocalc
or ratiobatch
. The method is similar to that described in Vandesompele et al. (2002), but uses arithmetic averaging of threshold cycles/efficiencies and not geometric averaging of relative expression values. This is equivalent, as discussed in 'Details' and as shown in 'Examples'. An essential extension of this method is, that if replicates for the reference genes are supplied, the threshold cycles and efficiencies are subjected to error propagation prior to ratio calculation. The propagated error is then included in the calculation of the gene expression ratios, as advocated in Nordgard et al. (2006).
1 2 3 4 |
data |
multiple qPCR data generated by |
group |
a character vector defining the replicates (if any) of control/treatment samples and reference genes/genes-of-interest. See 'Details' |
which.eff |
efficiency calculated by which method. Defaults to sigmoidal fit. Can also be a value such as 1.8, as shown in 'Examples'. See |
type.eff |
using individual or averaged efficiencies for the replicates of a reference gene. See 'Details'. |
which.cp |
type of threshold cycles to be used for the analysis. Defaults to cpD2. See |
verbose |
logical. If |
... |
parameters to be supplied to |
As in ratiobatch
, the samples are to be defined as a character vector in the style of "g1s1", "g1c1", "r1s1" and "r1c1" etc. If refmean
is used as a standalone function or switched on in ratiobatch
using refmean = TRUE
, different reference genes per control/treatment samples are averaged when supplied either as single runs or as replicates.
Examples (omitting genes-of-interest in control/treatment samples):
2 reference genes, 2 replicates each:
c("r1s1", "r1s1, "r2s1", "r2s1", "r1c1", "r1c1, "r2c1", "r2c1", ...).
3 reference genes, no replicates:
c("r1s1", "r2s1, "r3s1", "r1c1", "r2c1, "r3c1", ...)
Averaging of multiple reference genes is accomplished the following way:
Given i reference genes with j replicates in a sample, all replicates r_{ij} are used for calculating mean μ_{r_i} and standard deviation σ_{r_i} of the threshold cycles and efficiencies. The overall (grand) mean μ_r and propagated error σ_r is calculated using propagate
with first-order Taylor expansion including covariance: σ_r = F_{r_i}C_{r_i}F_{r_i}^T. Finally, a vector of length L = n(r_{ij}) containing equidistant numbers X = (x_1, x_2, x_3, … x_L) with mean μ_r and standard deviation σ_r is generated for a new overall reference gene r_1. This is done using the internal function qpcR:::makeStat
which calculates a shifted (μ_r) and scaled (σ_r) Z-transformation on a vector x_1 … x_L:
Z_i = μ_r + \frac{(x_i - \bar{X})}{σ_X} \cdot σ_r
The new Z_i threshold cycle and efficiency values replace all values of r_{ij} in data
. When using ratiobatch
, this modified data is then used for the ratio calculation, again using propagate
to calculate errors for ratios using the Z_i values as mentioned above.
By using logarithmic identities (http://en.wikipedia.org/wiki/Logarithmic_identities), it can be shown that the geometric mean can be transformed to the arithmetic mean by logarithmation (assuming constant E):
≤ft( ∏_{i=1}^n E^{x_i} \right)^{\frac{1}{n}} = \frac{1}{n} \cdot \log_E ≤ft( ∏_{i=1}^n E^{x_i} \right) = \frac{1}{n} ∑_{i=1}^n x_i
Hence, arithmetic averaging of the threshold cycles BEFORE ratio calculation is the same as doing geometric averaging on relative quantities AFTER ratio calculation. This is demonstrated in 'Examples' and also mentioned in the geNorm manual (http://www.gene-quantification.com/geNorm_manual.pdf) in Q8 (page 12).
When setting type.eff = "individual"
(default), all efficiencies from replicates of a reference gene in a control/treatment sample E(r_{ij}) are used for calculating mean μ_{E(r_i)} and standard deviation σ_{E(r_i)}, the latter being used for calculating a propagated error for all reference gene efficiencies σ_{E(r)}. If type.eff = "mean.single"
, all E(r_{ij}) values from the replicates are set to the same value μ_{E(r_i)}, that is, there is no variation assumed between the different E(r_{ij}). In this case, σ_{E(r_i)} = 0, so that no error of the replicates is propagated to σ_{E(r)}. This results in smaller overall errors of the output, but it can be debated if this is a realistic approach, hence both settings were implemented.
which.eff
can be supplied with an efficiency value such as 1.8
, which is then used as the efficiency for all reference runs E(r_{ij}).
The same dataset data
which was supplied to the function, but with modified threshold cycle/efficiency values in which the values are created per sample in a way, that they have the mean of all averaged reference genes and the same standard deviation as obtained by error propagation. See 'Details' for a more thorough explanation. Furthermore, a modified label vector "NAME_mod"
is written to the global environment (if "NAME"
was supplied for group
) in which the different reference gene labels are aggregated, i.e. c("r1c1", "r2c1", "r3c1") will become c("r1c1", "r1c1", "r1c1"). This new label vector is also attached as an attribute to the output data and can be obtained by attr(RES1, "group")
.
The function checks stringently if the same number of different reference genes are used for control and treatment samples, although the number of replicates may differ.
Example:
GROUP <- c("r1c1", "r1c1", "r2c1", "r2c1", "r1s1", "r2s1") will work (2 reference genes in control/treatment samples), but GROUP <- c("r1c1", "r2c1", "r3c1", "r1s1", "r1s1", "r1s2", "r1s2", "r2s1", "r2s1") will not work (3 reference genes in controls, only 2 in treatment samples).
Also, when no or only one reference genes are detected, the original data is not averaged and returned unchanged.
Andrej-Nikolai Spiess
Accurate normalization of real-time quantitative RT-PCR data by geometric averaging of multiple internal control genes.
Vandesompele J, De Preter K, Pattyn F, Poppe B, Van Roy N, De Paepe A, Speleman F.
Genome Biol (2002), 3: research0034-research0034.11.
Error propagation in relative real-time reverse transcription polymerase chain reaction quantification models: the balance between accuracy and precision.
Nordgard O, Kvaloy JT, Farmen RK, Heikkil? R.
Anal Biochem (2006), 356: 182-193.
In ratiobatch
, reference gene averaging can be done automatically by setting refmean = TRUE
. See there.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 | ## Not run:
## Replacing the reference gene values by
## averaged ones in the original data.
## => RES1 is new dataset.
## => GROUP1_mod in global environment is
## new labeling vector.
DAT1 <- pcrbatch(reps, fluo = 2:19, model = l5)
GROUP1 <- c("r1c1", "r1c1", "r2c1", "r2c1", "g1c1", "g1c1",
"r1s1", "r1s1", "r1s2", "r1s2", "r2s1", "r2s1",
"r2s2", "r2s2", "g1s1", "g1s1", "g1s2", "g1s2")
RES1 <- refmean(DAT1, GROUP1, which.eff = "sig", which.cp = "cpD2")
## Using three reference genes without replicates
## and then 'ratiobatch'.
## This can also be called in 'ratiobatch' directly
## with parameter 'refmean = TRUE'. See there.
## In this example, already averaged dataset and
## new labeling vector are supplied to 'ratiobatch',
## so one has to set 'refmean = FALSE'.
DAT2 <- pcrbatch(reps, fluo = 2:9, model = l5)
GROUP2 <- c("r1c1", "r2c1", "r3c1", "g1c1", "r1s1", "r2s1", "r3s1", "g1s1" )
RES2 <- refmean(DAT2, GROUP2, which.eff = "sig", which.cp = "cpD2")
ratiobatch(RES2, GROUP2_mod, refmean = FALSE)
## Comparison between 'refmean' ct-value arithmetic averaging
## and 'geNorm' relative quantities geometric averaging
## using data from the geNorm manual (2008), page 6.
## We will use HK1-HK3 as in the manual (no replicates).
## First we create a 'pcrbatch' dataset and then
## override the ct values with those of the manual and all
## efficiencies with E = 2. Sample A is considered as control sample.
DAT3 <- pcrbatch(reps, fluo = 2:17, model = l5)
DAT3[8, -1] <- c(32.10, 27.00, 34.90, 23.00,
33.30, 28.40, 36.10, 24.20,
31.00, 27.50, 34.00, 26.35,
30.50, 28.20, 33.00, 25.45)
DAT3[1, -1] <- 2
GROUP3 <- c("r1c1", "r2c1", "r3c1", "g1c1",
"r1s1", "r2s1", "r3s1", "g1s1",
"r1s2", "r2s2", "r3s2", "g1s2",
"r1s3", "r2s3", "r3s3", "g1s3")
RES3 <- refmean(DAT3, GROUP3, which.eff = "sig", which.cp = "cpD2")
ratiobatch(RES3, GROUP3_mod, which.cp = "cpD2",
which.eff = "sig", refmean = FALSE)
## Results:
## r1c1:g1c1:r1s1:g1s1 refmean 1.0497
## geNorm 1.0472 (2.351/2.245)
## r1c1:g1c1:r1s2:g1s2 refmean 0.0693
## geNorm 0.0695 (0.156/2.245)
## r1c1:g1c1:r1s3:g1s3 refmean 0.1081
## geNorm 0.1074 (0.241/2.245)
## Slight differences are due to rounding.
## End(Not run)
|
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.