gwpcrpois.est: Compatibility wrapper of 'gwpcrpois.est'

View source: R/estimate.R

Description

Estimates the parameters efficiency and lambda0 from a vector of observed read counts per molecular family, or (depending on the estimation method) the mean and variance of these observations. Supports arbitrary detection thresholds and initial molecule counts, but estimation may be considerably faster in the (unrealistic) case threshold=0 than in the general one.

Usage

gwpcrpois.mom(
  mean,
  var,
  threshold = 1,
  molecules = 1,
  ctrl = list(),
  nonconvergence.is.error = FALSE
)

gwpcrpois.mle(c, threshold = 1, molecules = 1)

gwpcrpois.est(
  x = NULL,
  mean = NULL,
  var = NULL,
  n.umis = NULL,
  method = "mom",
  must.converge = TRUE,
  threshold = 1,
  molecules = 1,
  loss = expression(p0),
  ctrl = list()
)
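
As a rough illustration of the two calling conventions in the signatures above, the following sketch prepares both kinds of input from a small made-up vector of read counts. The data and the names counts, detected, m and v are invented for this example, and the calls to gwpcrpois.est are only attempted if the gwpcR package happens to be installed.

```r
# Hypothetical read counts per molecular family (made-up data).
counts <- c(0, 1, 1, 2, 3, 3, 4, 5, 5, 6, 8, 9, 12, 15)
threshold <- 2

# Keep only the unambiguously detected families (count >= threshold).
detected <- counts[counts >= threshold]

# Summary statistics computed over the detected families only.
m <- mean(detected)
v <- var(detected)

if (requireNamespace("gwpcR", quietly = TRUE)) {
  # Variant 1: pass the filtered observation vector.
  fit.x <- gwpcR::gwpcrpois.est(x = detected, threshold = threshold,
                                method = "mom", must.converge = FALSE)
  # Variant 2: pass mean and var instead (only possible for method 'mom').
  fit.mv <- gwpcR::gwpcrpois.est(mean = m, var = v, threshold = threshold,
                                 method = "mom", must.converge = FALSE)
  print(fit.x$efficiency)
  print(fit.x$lambda0)
}
```

Setting must.converge = FALSE here is a choice made for the toy data, so that a failure to converge produces a warning rather than an error.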

Arguments

mean

average number of observations per molecular family, computed over the unambiguously detected families, i.e. over those families which were observed at least threshold times. This parameter and specifying an observation vector through parameter x are mutually exclusive, and specifying mean and var instead of the full observation vector is only possible for estimation method 'mom'.

var

variance of the number of observations per molecular family, also computed over the unambiguously detected families, i.e. over those families which were observed at least threshold times. This parameter and specifying an observation vector through parameter x are mutually exclusive, and specifying mean and var instead of the full observation vector is only possible for estimation method 'mom'.

threshold

minimal number of observations a molecular family must have to count as unambiguously detected. Setting this to a value v >= 0 conditions the distribution on c >= v, i.e. every value of c less than v gets assigned probability zero.

molecules

initial copy number

ctrl

a list of settings controlling the estimation procedure. Different estimation methods recognize different ctrl settings; unrecognized settings are ignored without warning. See Details for the settings relevant to each estimation method.

nonconvergence.is.error

synonym for must.converge

c

number of observations of a particular molecular family

x

a numeric vector containing the observed read counts per molecular family, after removal of families below the detection threshold. The vector must thus contain only whole numbers not smaller than threshold. This parameter and the combination of parameters mean and var are mutually exclusive.

n.umis

the number of observed UMIs, used in the estimation of n.tot (i.e. the total number of molecules/UMIs in the original sample). See also the discussion of the parameter loss, and the result value n.tot.

method

the estimation method to use, either 'mle' for maximum likelihood estimation or 'mom' for method of moments. See Details.

must.converge

if set to TRUE, an error is reported if the parameter estimation fails to converge. If FALSE, a warning is reported instead.

loss

an expression specifying how the loss is computed, i.e. the fraction of all molecules (or UMIs) that were not observed or were removed by the read count threshold. In the simple case of each read count observation representing a separate molecule, the default value p0 is correct: the lost molecules are then simply those whose read count lies below the specified threshold. In more complex scenarios, e.g. if a single molecule produces a separate read count for each strand, which are then either both rejected or both accepted, the additional rejection cases must be accounted for by a custom loss expression.
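
To make the idea of a loss expression more concrete, the sketch below evaluates the default expression(p0) and a hypothetical two-strand variant for a made-up value of p0. The two-strand formula is an assumption made for illustration only: it supposes that the two strands pass the threshold independently and that a molecule is kept only if both pass. How the package evaluates the expression internally is not shown on this page; eval is used here merely to show that such an expression is ordinary R code in terms of p0.

```r
# Default loss: a molecule is lost exactly when its read count
# falls below the threshold.
loss.default <- expression(p0)

# Hypothetical two-strand loss (an assumption for this example):
# strands are independent and a molecule survives only if both pass.
loss.twostrand <- expression(1 - (1 - p0)^2)

p0 <- 0.2  # made-up probability of a read count below the threshold

loss.a <- eval(loss.default, list(p0 = p0))
loss.b <- eval(loss.twostrand, list(p0 = p0))
print(c(default = loss.a, twostrand = loss.b))
```

A custom expression of this kind would be passed through the loss argument in place of the default.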

Details

The two available estimation methods, method of moments (method='mom') and maximum likelihood (method='mle'), have different properties and accept different ctrl parameters:

method of moments (mom):

For the (unrealistic) uncensored case, i.e. threshold=0, the specified mean is the method-of-moments estimate for lambda0, and a closed formula is used to compute efficiency from mean and var. The ctrl argument is not used in this case.
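
The closed formula itself is not reproduced on this page. As a sketch of what such a formula can look like, assume that a read count is Poisson distributed with rate lambda0 times a random factor of mean one, and that this factor has variance (1 - efficiency) / ((1 + efficiency) * molecules), which is the textbook limit for a Galton-Watson process in which each molecule is duplicated with probability efficiency per cycle. Under that assumption var = mean + mean^2 * (1 - efficiency) / ((1 + efficiency) * molecules), and this can be solved for efficiency. The helper name mom.uncensored is hypothetical, and the package's own formula may differ in detail.

```r
# Sketch of uncensored method-of-moments estimates under the
# assumption stated above (not taken from the package source).
mom.uncensored <- function(mean, var, molecules = 1) {
  # Excess variance over a plain Poisson, rescaled so that it
  # estimates (1 - efficiency) / (1 + efficiency).
  s <- molecules * (var - mean) / mean^2
  list(lambda0 = mean, efficiency = (1 - s) / (1 + s))
}

# Made-up moments chosen so that the efficiency is 0.5.
est <- mom.uncensored(mean = 10, var = 10 + 100 / 3)
print(est)
```

The sketch does not clamp the result, so moments that are inconsistent with the assumed model can yield an efficiency outside the interval from 0 to 1.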

In case of a censored distribution, the sample mean is not a consistent estimator for lambda0 because the expectation of the censored distribution is in general larger than lambda0. The sample variance similarly deviates from the variance of the uncensored distribution.

An iterative approach is used to find method-of-moments estimates in this case. Initial estimates are computed as if threshold were zero. From these the probability pdetect (i.e. 1-p0) of detecting a particular family is found, and used to correct for the biases in the sample mean and variance. Then the parameter estimates are updated. This process continues until it either converges or reaches the maximum allowed number of iterations. Both termination criteria can be controlled via the ctrl parameter, which is a list that can contain the following components:

maxit

Maximum number of iterations. Defaults to 150.

rel.tol

Relative convergence tolerance. Applied to efficiency, lambda0 and the detection probability pdetect. Defaults to 1e-4.

abs.tol

Absolute convergence tolerance. Only used as the tolerance around zero, where the relative tolerance becomes meaningless. Defaults to 1e-4.

trace

Output estimates after each round
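
The interplay between a relative and an absolute tolerance described above can be sketched as follows. The function converged and its argument names are hypothetical and written only for this illustration; the package's actual stopping rule may be formulated differently.

```r
# Sketch of a combined convergence test: a component counts as
# converged if its change is small relative to its magnitude, or
# small in absolute terms (which matters for values near zero).
converged <- function(old, new, rel.tol = 1e-4, abs.tol = 1e-4) {
  delta <- abs(new - old)
  scale <- pmax(abs(old), abs(new))
  all(delta <= rel.tol * scale | delta <= abs.tol)
}

# Made-up successive estimates of (efficiency, lambda0, pdetect).
prev <- c(0.80000, 12.0000, 0.95000)
curr <- c(0.80001, 12.0005, 0.95002)
print(converged(prev, curr))
```

With only a relative criterion, a quantity that is exactly zero in one iteration and tiny in the next could never be declared converged, which is why the absolute tolerance is applied around zero.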

maximum likelihood (mle):

The parameters are estimated by maximizing the log-likelihood using optim. The ctrl settings are passed through to optim, except for fnscale and parscale, which are overwritten. The method of moments (method 'mom') estimate is used as the starting point during likelihood optimization.
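
The general pattern of maximizing a log-likelihood with optim, starting from a moment-based guess, can be illustrated on a much simpler stand-in model. The sketch below fits a Poisson distribution that is observed only for counts of at least threshold; this is not the gwpcrpois distribution, and the data are made up. Setting fnscale = -1 makes optim maximize instead of minimize.

```r
# Stand-in model: Poisson counts that are only observed when they
# reach the threshold. NOT the gwpcrpois model, just an analogue.
x <- c(1, 1, 2, 2, 3, 1, 4, 2, 1, 3)  # made-up counts, all >= threshold
threshold <- 1

loglik <- function(lambda) {
  # log P(count >= threshold), used to condition on detection.
  log.pdetect <- ppois(threshold - 1, lambda,
                       lower.tail = FALSE, log.p = TRUE)
  sum(dpois(x, lambda, log = TRUE) - log.pdetect)
}

# Start from the sample mean, i.e. the estimate one would use if
# the threshold were ignored, then maximize the log-likelihood.
fit <- optim(par = mean(x), fn = loglik, method = "L-BFGS-B",
             lower = 1e-6, upper = max(x),
             control = list(fnscale = -1))
print(fit$par)
print(fit$convergence)
```

As in the censored method-of-moments case, the fitted rate comes out smaller than the sample mean, because conditioning on detection inflates the observed average.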

Value

A list containing the values

convergence

flag indicating whether the estimation converged. 0 indicates convergence.

efficiency

parameter estimate for efficiency (see gwpcrpois)

lambda0

parameter estimate for lambda0 (see gwpcrpois)

p0

probability of observing a read count less than the specified threshold

loss

the estimated loss according to the specified loss expression, i.e. the fraction of molecules not observed or filtered out

n.tot

the estimated total number of molecules in the sample, i.e. n.umis / (1 - loss)

n.obs

The length of the observation vector x used for parameter estimation, or NA if mean and var were specified directly.

n.umis

The number of observed molecules/UMIs specified in the call to gwpcrpois.est

threshold

detection threshold specified in the call to gwpcrpois.est

molecules

initial molecule count specified in the call to gwpcrpois.est
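
As a small worked example of how the n.tot entry relates to the number of observed UMIs and the loss, using made-up numbers:

```r
n.umis <- 900  # made-up number of observed UMIs
loss <- 0.1    # made-up estimated loss (fraction of molecules lost)

# Scale the observed count up by the fraction that was retained.
n.tot <- n.umis / (1 - loss)
print(n.tot)
```

In words, if 10% of the molecules are estimated to be lost, the 900 observed UMIs are taken to represent 90% of the original sample.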

Functions

See Also

gwpcrpois


Cibiv/gwpcR documentation built on Aug. 31, 2021, 1:20 p.m.