freqpcr | R Documentation |
The function estimates the population allele frequency using the dataset of Cq values measured over length(N)
bulk samples, each of which has a sample size of N[i]
as the number of individuals included. N[i]
can be 1, meaning that every individual is analyzed separately. For the ith sample, the four Cq values were measured as housek0[i]
, target0[i]
, housek1[i]
, and target1[i]
. The function can estimate up to five parameters simultaneously when the Cq sets are available for more than two (bulk) samples: P
, K
, targetScale
, sdMeasure
, and EPCR
.
Since v0.3.2, user can also use an experimental ‘continuous model’ by specifying A
instead of N
. That is, each sample DNA is directly extracted from the environment and the sample allele ratio y
follows y ~ Beta(apk, a(1-p)k)
instead of y ~ Beta(mk, (n-m)k), m ~ Binomial(n, p)
, where p
and k
are the population allele frequency and the gamma shape parameter of the individual DNA yield, respectively. Each element of A
, a
is a scaling factor of relative DNA contents between the samples. The continuous model is likely when each sample directly comes from the environment e.g., water sampling in an eDNA analysis or cell culture in a petri dish.
Since v0.4.0, freqpcr()
also works without specifying housek0
and target0
i.e., it can estimate population allele frequency from ΔCq values instead of ΔΔCq. In this setting, the sizes of targetScale
and sdMeasure
should be fixed in addition to EPCR
and zeroAmount
.
freqpcr( N, A, housek0, target0, housek1, target1, P = NULL, K = NULL, targetScale = NULL, sdMeasure = NULL, EPCR = 0.99, XInit0 = c(P = NULL, K = NULL, targetScale = NULL, sdMeasure = NULL, EPCR = NULL), zeroAmount = NULL, beta = TRUE, diploid = FALSE, pvalue = 0.05, gradtol = 1e-04, steptol = 1e-09, iterlim = 100, maxtime = 600, print.level = 1, ... )
N |
Sample sizes as a numeric vector. |
A |
Use instead of |
housek0 |
A numeric vector. In RED-ΔΔCq method, |
target0 |
In RED-ΔΔCq method, |
housek1 |
The Cq values of the test sample measured on the housekeeping gene after the restriction enzyme digestion (in RED-ΔΔCq method), or the test sample amplified on the housekeeping gene (in general ΔΔCq analyses). |
target1 |
For each test sample with unknown allele-ratio, |
P |
Scalar. Population allele frequency from which the test samples are derived. Default is |
K |
Scalar. The gamma shape parameter of the individual DNA yield. Default is |
targetScale |
(δ_{T}) Scalar. The relative template DNA amount of the target locus to the houskeeping locus. If known, given as a positive numeric. |
sdMeasure |
(σ_{c}) Scalar. The measurement error (standard deviation) on each Cq value following Normal(0, σ_{c}^2). If known, given as a positive numeric. |
EPCR |
(η) Scalar. Amplification efficiency per PCR cycle. If known, given as a positive numeric. When |
XInit0 |
Optionally the initial value for the parameter optimization can be specified, but it is strongly recommended to keep the argument as is. Unlike |
zeroAmount |
(In RED-ΔΔCq method) residue rate of restriction enzyme digestion, or (in general ΔΔCq analyses) small portion of the off-target allele on the target locus of the test sample, which will be amplified in the PCR. It needs to be always specified by the user as a number between 0 and 1, usually near 0. |
beta |
Whether to use the beta distribution to approximate the sample allele ratio instead of specifying individual gamma distribution for each of the allelic DNA amounts? Default is |
diploid |
Is the target organism diploidy? Default is |
pvalue |
The two-sided confidence interval is calculated at the last iteration at given significance level. Default is 0.05, which returns the 95% Wald's CI (2.5 to 97.5 percentile) based on the Hessian matrix. |
gradtol, steptol, iterlim |
Control parameters passed to |
maxtime |
A positive scalar to set the maximum calculation time in seconds to abort the optimizer (and return error). The total calculation time largely depends on |
print.level |
|
... |
Additional arguments passed to the function. |
Object of the S4 class CqFreq. The slot report
is a matrix and each row contains the estimated parameter value with 100*(1-pvalue)% confidence interval. The following parameters are returned:
P
, the population allele frequency from which the test samples are derived.
K
, the gamma shape parameter of the individual DNA yield.
targetScale
(δ_{T}), the relative template DNA amount of the target to the houskeeping loci.
EPCR
(η), the amplification efficiency per PCR cycle.
sdMeasure
or "Cq measurement error" (σ_{c}).
Estimation is conducted only for parameters where the values are not specified or specified explicitly as NULL
. If one feeds a value e.g. K=1
or sdMeasure=0.24
, it is then treated as fixed parameter. Notwithstanding, EPCR
is estimated only when EPCR = NULL
is specified explicitly.
You must verify the size of EPCR
and zeroAmount
beforehand because they are not estimable from the experiments with unknown allele ratios. Although targetScale
and sdMeasure
are estimable within freqpcr()
, it is better to feed the known values, especially when you have only a few bulk samples (length(N) <= 3). Fixing targetScale
and sdMeasure
is also strongly recommended when housek0
and target0
are absent (ΔCq method). The functions knownqpcr()
or knownqpcr_unpaired()
, depending on your data format, provide the procedure to estimate the sizes of the experimental parameters using the DNA solutions of known allele mixing ratios.
For the unknown parameters, XInit0
optionally specifies initial values for the optimization using nlm()
though the use of internal default is strongly recommended. The specification as a fixed parameter has higher priority than the specification in XInit0
. Every user-specified parameter values, fixed or unknown, must be given in linear scale (e.g. between 0 and 1 for the allele frequency); they are transformed internally to log or logit.
Other estimation procedures:
knownqpcr_unpaired()
,
knownqpcr()
,
sim_dummy()
# Dummy Cq dataset with six bulk samples, each of which comprises of eight haploids. EPCR <- 0.95; zeroAmount <- 0.0016; # True values for the two must be known. P <- 0.75 dmy_cq <- make_dummy( rand.seed=1, P=P, K=2, ntrap=6, npertrap=8, scaleDNA=1e-07, targetScale=1.5, baseChange=0.3, EPCR=EPCR, zeroAmount=zeroAmount, sdMeasure=0.3, diploid=FALSE ) print(dmy_cq) # Estimation with freqpcr, where P, K, targetScale, and baseChange are marked unknown. result <- freqpcr( N=dmy_cq@N, housek0=dmy_cq@housek0, target0=dmy_cq@target0, housek1=dmy_cq@housek1, target1=dmy_cq@target1, EPCR=EPCR, zeroAmount=zeroAmount, beta=TRUE, print.level=0 ) print(result) # Estimation with freqpcr, assuming diploidy. result <- freqpcr( N=dmy_cq@N, housek0=dmy_cq@housek0, target0=dmy_cq@target0, housek1=dmy_cq@housek1, target1=dmy_cq@target1, EPCR=EPCR, zeroAmount=zeroAmount, beta=TRUE, diploid=TRUE ) # Estimation where you have knowledge on the size of K. result <- freqpcr( N=dmy_cq@N, housek0=dmy_cq@housek0, target0=dmy_cq@target0, housek1=dmy_cq@housek1, target1=dmy_cq@target1, K=2, EPCR=EPCR, zeroAmount=zeroAmount, beta=TRUE, print.level=2 ) # (>= v0.3.2) # Provided the model is continuous (specify A instead of N). result <- freqpcr( A=dmy_cq@N, housek0=dmy_cq@housek0, target0=dmy_cq@target0, housek1=dmy_cq@housek1, target1=dmy_cq@target1, K=2, EPCR=EPCR, zeroAmount=zeroAmount, beta=TRUE, print.level=1 ) # (>= v0.4.0) # If the dataset lacks control samples (housek0 and target0 are absent). # Fixing the sizes of targetScale and sdMeasure is strongly recommended. result <- freqpcr( N=dmy_cq@N, housek1=dmy_cq@housek1, target1=dmy_cq@target1, K=2, EPCR=EPCR, zeroAmount=zeroAmount, targetScale=1.5, sdMeasure=0.3, beta=TRUE, print.level=1 )
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.