estimatePriorDfRobust: Assess the Goodness of Fit of Mean-Variance Curves in a...


estimatePriorDfRobust    R Documentation

Assess the Goodness of Fit of Mean-Variance Curves in a Robust Manner

Description

Given a set of bioCond objects, each of which has been associated with a mean-variance curve, estimatePriorDfRobust derives a common number of prior degrees of freedom that assesses the overall goodness of fit of the mean-variance curves, and adjusts the variance ratio factor of each bioCond accordingly. Compared with estimatePriorDf, the parameter estimation methods underlying estimatePriorDfRobust are robust to outliers.

Usage

estimatePriorDfRobust(
  conds,
  occupy.only = TRUE,
  p_low = 0.01,
  p_up = 0.1,
  d0_low = 0.001,
  d0_up = 1e+06,
  eps = d0_low,
  nw = gauss.quad(128, kind = "legendre"),
  return.d0 = FALSE,
  no.rep.rv = NULL,
  .call = TRUE
)

Arguments

conds

A list of bioCond objects, each of which has a fit.info field describing its mean-variance curve (see also fitMeanVarCurve).

occupy.only

A logical scalar. If it is TRUE (default), only occupied intervals are used to estimate the number of prior degrees of freedom and adjust the variance ratio factors. Otherwise, all intervals are used.

p_low, p_up

Lower and upper proportions of extreme values to be Winsorized (see "References"). Each of them must be strictly larger than 0, and their sum must be strictly smaller than 1.
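As a rough illustration of what Winsorizing with p_low and p_up means, the sketch below clamps the lowest and highest fractions of a plain numeric vector to the corresponding sample quantiles. The helper winsorize and the simulated data are hypothetical and are not part of MAnorm2; the package applies the idea to its own statistics internally, and its exact procedure may differ.

```r
# Generic Winsorization sketch (hypothetical helper, not MAnorm2 code).
winsorize <- function(x, p_low = 0.01, p_up = 0.1) {
  # Same constraints as documented for p_low and p_up.
  stopifnot(p_low > 0, p_up > 0, p_low + p_up < 1)
  # Quantiles that bound the retained central part of the data.
  q <- quantile(x, probs = c(p_low, 1 - p_up), names = FALSE)
  # Values below q[1] are raised to q[1]; values above q[2] are lowered to q[2].
  pmin(pmax(x, q[1]), q[2])
}

set.seed(1)
x <- c(rnorm(98), 50, -50)  # simulated data with two gross outliers
w <- winsorize(x)
range(x)  # dominated by the outliers
range(w)  # outliers pulled in to the quantile bounds
```

The default p_up is larger than p_low, so more of the upper tail is clamped than of the lower tail.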

d0_low, d0_up

Positive reals specifying the lower and upper bounds of estimated d0 (i.e., number of prior degrees of freedom). Inf is not allowed.

During the estimation process, if d0 is sure to be less than or equal to d0_low, it will be considered as 0, and if it is sure to be larger than or equal to d0_up, it will be considered as positive infinity.

eps

The required numeric precision for estimating d0.

nw

A list containing nodes and weights variables for calculating the definite integral of a function f over the interval [-1, 1], which is approximated by sum(nw$weights * f(nw$nodes)). By default, a set of Gauss-Legendre nodes along with the corresponding weights calculated by gauss.quad is used.
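The following sketch shows how such an nw list approximates an integral over [-1, 1]. It assumes the statmod package, which provides gauss.quad, is installed; the integrand f and the choice of 16 nodes are arbitrary illustrations.

```r
library(statmod)  # provides gauss.quad

# Gauss-Legendre nodes and weights on [-1, 1]; 16 nodes is an arbitrary choice.
nw <- gauss.quad(16, kind = "legendre")

# Illustrative integrand; its exact integral over [-1, 1] is 2/3.
f <- function(x) x^2

approx <- sum(nw$weights * f(nw$nodes))
exact <- 2 / 3
abs(approx - exact)  # approximation error

# A list of this form could then be supplied as the nw argument, e.g.
# estimatePriorDfRobust(conds, nw = nw)
```

Fewer nodes make each evaluation cheaper at the cost of a coarser approximation; the default uses 128 nodes.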

return.d0

A logical scalar. If set to TRUE, the function simply returns the estimated d0.

no.rep.rv

A positive real specifying the variance ratio factor of those bioConds without replicate samples, if any. By default, it's set to the geometric mean of variance ratio factors of the other bioConds.
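To make the default concrete, the geometric mean of a set of positive variance ratio factors can be computed as below. The values in rv are made up for illustration and do not come from any real data set.

```r
# Hypothetical variance ratio factors of the bioConds that have replicates.
rv <- c(GM12891 = 0.8, GM12892 = 1.25)

# Geometric mean, computed on the log scale for numerical stability.
no_rep_rv <- exp(mean(log(rv)))
no_rep_rv
```

Because the geometric mean averages on the log scale, a factor of 0.8 and a factor of 1.25 balance each other exactly.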

.call

Users can ignore this argument.

Details

The core function of estimatePriorDfRobust is very similar to that of estimatePriorDf, except that the former estimates the number of prior degrees of freedom and variance ratio factors in a robust manner (see also "References").

Unlike estimatePriorDf, estimatePriorDfRobust must be called explicitly if you intend to perform robust parameter estimation after associating a mean-variance curve with a set of bioCond objects (via fitMeanVarCurve, for example; see "Examples" below).

Value

By default, estimatePriorDfRobust returns the argument list of bioCond objects, with the estimated number of prior degrees of freedom substituted for the "df.prior" component of each of them. In addition, their "ratio.var" components are adjusted accordingly, and an attribute named "no.rep.rv" is added to the list if it has been used as the variance ratio factor of the bioConds without replicate samples. A special case arises when the estimated number of prior degrees of freedom is 0: estimatePriorDfRobust then leaves the existing variance ratio factors unchanged, and you may want to use setPriorDfVarRatio to specify them explicitly.

If return.d0 is set to TRUE, estimatePriorDfRobust simply returns the estimated number of prior degrees of freedom.

References

Tukey, J.W., The future of data analysis. The annals of mathematical statistics, 1962. 33(1): p. 1-67.

Phipson, B., et al., Robust Hyperparameter Estimation Protects against Hypervariable Genes and Improves Power to Detect Differential Expression. Annals of Applied Statistics, 2016. 10(2): p. 946-963.

See Also

bioCond for creating a bioCond object; fitMeanVarCurve for fitting a mean-variance curve and using a fit.info field to characterize it; estimatePriorDf for the ordinary (non-robust) version of estimatePriorDfRobust; setPriorDfRobust for setting the number of prior degrees of freedom and accordingly adjusting the variance ratio factors of a set of bioConds in a robust manner.

Examples

data(H3K27Ac, package = "MAnorm2")
attr(H3K27Ac, "metaInfo")

## Estimate parameters regarding the associated mean-variance curve in a
## robust manner. Here we treat each cell line (i.e., individual) as a
## biological condition.

# Perform MA normalization and construct bioConds to represent cell lines.
norm <- normalize(H3K27Ac, 4, 9)
norm <- normalize(norm, 5:6, 10:11)
norm <- normalize(norm, 7:8, 12:13)
conds <- list(GM12890 = bioCond(norm[4], norm[9], name = "GM12890"),
              GM12891 = bioCond(norm[5:6], norm[10:11], name = "GM12891"),
              GM12892 = bioCond(norm[7:8], norm[12:13], name = "GM12892"))
autosome <- !(H3K27Ac$chrom %in% c("chrX", "chrY"))
conds <- normBioCond(conds, common.peak.regions = autosome)

# Fit a mean-variance curve by using the parametric method.
conds <- fitMeanVarCurve(conds, method = "parametric", occupy.only = TRUE)

# Estimate the associated number of prior degrees of freedom and variance
# ratio factors in a robust manner.
conds2 <- estimatePriorDfRobust(conds, occupy.only = TRUE)

# In this case, there is little difference in estimation results between the
# ordinary routine and the robust one.
sapply(conds, function(x) c(x$fit.info$df.prior, x$fit.info$ratio.var))
sapply(conds2, function(x) c(x$fit.info$df.prior, x$fit.info$ratio.var))


MAnorm2 documentation built on Oct. 29, 2022, 1:12 a.m.