Frequently asked questions

knitr::opts_chunk$set(tidy=FALSE, cache=TRUE,
                      dev="png",
                      package.startup.message = FALSE,
                      message=FALSE, error=FALSE, warning=TRUE)
options(width=100)

Current GitHub issues

See the GitHub page for up-to-date responses to users' questions.

Warnings

An intercept (i.e. mean term) must be specified in order for the results to be statistically valid. Otherwise, the variance percentages will be substantially overestimated.
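As a minimal sketch of this warning, base R's terms() can report whether a formula still contains its intercept. The variable names Age and Individual are hypothetical, and this check is only an illustration, not something variancePartition is known to run in this form.

```r
# Hypothetical formulas; Age and Individual are illustrative column names.
form_ok  <- ~ Age + (1 | Individual)      # intercept included implicitly
form_bad <- ~ 0 + Age + (1 | Individual)  # intercept removed by "0 +"

# terms() records the intercept as 1 (present) or 0 (absent).
has_intercept <- function(form) {
  attr(terms(form), "intercept") == 1
}

has_intercept(form_ok)
has_intercept(form_bad)
```

A formula for which has_intercept() returns FALSE is the case the warning describes.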

If a linear mixed model is used, all categorical variables must be modeled as random effects. Alternatively, a fixed effect model can be used by modeling all variables as fixed effects.
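One way to follow this rule consistently is to build the formula from the sample metadata. The sketch below assumes a hypothetical data frame info and a hypothetical helper mixed_formula(); it treats numeric columns as fixed effects and factor or character columns as random intercepts.

```r
# Hypothetical sample metadata; column names are illustrative only.
info <- data.frame(
  Age        = c(31, 45, 52, 38),
  Individual = factor(c("P1", "P1", "P2", "P2")),
  Tissue     = factor(c("A", "B", "A", "B"))
)

# Build a mixed-model formula: numeric columns become fixed effects,
# factor or character columns become random intercepts (1 | variable).
mixed_formula <- function(data) {
  is_cat <- vapply(data, function(x) is.factor(x) || is.character(x), logical(1))
  terms_txt <- ifelse(is_cat, paste0("(1 | ", names(data), ")"), names(data))
  as.formula(paste("~", paste(terms_txt, collapse = " + ")))
}

form <- mixed_formula(info)
form
```

For a pure fixed effect model, the same variables would instead be joined without the (1 | ...) wrapper.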

Only one varying coefficient term can be specified. For example, the formula ~(Tissue+0|Individual) + (Batch+0|Individual) contains two varying coefficient terms, and the results from this analysis are not easily interpretable. Only a formula with a single such term, like ~(Tissue+0|Individual), is allowed.
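A formula can be screened for this before fitting. The helper count_varying_terms() below is hypothetical and uses only base R; it counts bar terms whose left-hand side is something other than a plain random intercept.

```r
# Count varying coefficient terms such as (Tissue + 0 | Individual).
# Random intercepts such as (1 | Batch) are not counted.
count_varying_terms <- function(form) {
  labels <- attr(terms(form), "term.labels")
  bars   <- labels[grepl("|", labels, fixed = TRUE)]
  # Left-hand side of each bar term, e.g. "Tissue + 0" or "1".
  lhs <- trimws(vapply(strsplit(bars, "|", fixed = TRUE), `[`, character(1), 1))
  sum(lhs != "1")
}

one_term  <- count_varying_terms(~ (Tissue + 0 | Individual) + (1 | Batch))
two_terms <- count_varying_terms(~ (Tissue + 0 | Individual) + (Batch + 0 | Individual))
c(one_term, two_terms)
```

A count greater than one corresponds to the situation described in the warning.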

Errors

Including variables that are highly correlated can produce misleading results (see Section "Detecting problems caused by collinearity of variables"). In this case, parameter estimates from this model are not meaningful. Dropping one or more of the covariates will fix this problem.
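As a rough illustration of the problem, pairwise correlations between numeric covariates can be inspected with base R before fitting. The data frame info, its columns, and the 0.99 cutoff are all assumptions made for this sketch; the section cited above describes the package's own approach.

```r
# Hypothetical metadata where Height and Weight are almost perfectly correlated.
info <- data.frame(
  Age    = c(23, 35, 41, 29, 52, 47, 33, 60),
  Height = c(150, 155, 160, 165, 170, 175, 180, 185)
)
info$Weight <- 0.9 * info$Height - 80 + c(0.2, -0.1, 0.1, -0.2, 0.1, 0.0, -0.1, 0.2)

# Pairwise absolute correlations between the numeric covariates.
cor_mat <- abs(cor(info))
diag(cor_mat) <- 0

# Report each pair above the cutoff once (0.99 is an illustrative value).
high <- which(cor_mat > 0.99 & upper.tri(cor_mat), arr.ind = TRUE)
data.frame(var1 = rownames(cor_mat)[high[, "row"]],
           var2 = colnames(cor_mat)[high[, "col"]])
```

Dropping one variable from each reported pair is the fix suggested above.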

This error arises when using a varying coefficient model that examines the effect of one variable inside subsets of the data defined by another: ~(A+0|B). See Section "Variation within multiple subsets of the data". There must be enough observations of each level of variable B with each level of variable A. Consider an example with samples from multiple tissues from a set of individuals, where we are interested in the variation across individuals within each tissue using the formula: ~(Tissue+0|Individual). This analysis will only work if there are multiple samples from the same individual in at least one tissue. If all tissues only have one sample per individual, the analysis will fail and variancePartition will give this error.
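Whether a design meets this requirement can be checked by cross-tabulating the two variables. This is a minimal sketch with a hypothetical metadata data frame info; the column names Individual and Tissue follow the example above.

```r
# Hypothetical design: P1 has two samples in tissue A, all others have one.
info <- data.frame(
  Individual = c("P1", "P1", "P1", "P2", "P2", "P3", "P3"),
  Tissue     = c("A",  "A",  "B",  "A",  "B",  "A",  "B")
)

# Number of samples per Individual (rows) and Tissue (columns).
tab <- table(info$Individual, info$Tissue)
tab

# TRUE if at least one individual has replicate samples within some tissue.
has_replicates <- any(tab > 1)
has_replicates
```

If has_replicates is FALSE, every individual has at most one sample per tissue, which is the failing case described above.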

When analyzing the variation of one variable inside another (see Section "Variation within multiple subsets of the data"), the formula must be specified as (Tissue+0|Individual). This error occurs when the formula contains (Tissue|Individual) instead.
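The difference between the two forms is whether the left-hand side of the bar keeps an intercept. The hypothetical helper below illustrates this with base R's terms(); it is a sketch for inspecting a term by hand, not a check the package is known to expose.

```r
# Check whether the left-hand side of a bar term drops its intercept.
# lhs is the text before "|", e.g. "Tissue + 0" or "Tissue".
lhs_drops_intercept <- function(lhs) {
  attr(terms(as.formula(paste("~", lhs))), "intercept") == 0
}

ok  <- lhs_drops_intercept("Tissue + 0")  # as in (Tissue + 0 | Individual)
bad <- lhs_drops_intercept("Tissue")      # as in (Tissue | Individual)
c(ok, bad)
```

Only the first form, where the intercept is dropped, matches the required specification.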

This error occurs when fitVarPartModel uses too many threads and takes up too much memory. The easiest solution is to use fitExtractVarPartModel instead. Occasionally there is an issue in the parallel backend that is out of my control. Using fewer threads or restarting R will solve the problem.

Errors: Problems removing samples with NA/NaN/Inf values

variancePartition fits a regression model for each gene and drops samples that have NA/NaN/Inf values in each model fit. This is generally seamless but can cause an issue when a variable specified in the formula no longer varies within the subset of samples that are retained. Consider an example with variables for sex and age where age is NA for all male samples. Dropping samples with invalid values for variables included in the formula will retain only female samples. This will cause variancePartition to throw an error because there is now no variation in sex in the retained subset of the data. This can be resolved by removing either age or sex from the formula.
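The sex and age example can be reproduced on the metadata alone to find which variable stops varying. This sketch assumes a hypothetical data frame info and uses complete.cases(), which catches NA and NaN; Inf values would need a separate is.finite() check.

```r
# Hypothetical metadata: Age is missing for every male sample.
info <- data.frame(
  Sex = c("F", "F", "F", "M", "M", "M"),
  Age = c(34, 51, 29, NA, NA, NA)
)

# Keep only samples with no NA/NaN in the variables used in the formula.
vars <- c("Sex", "Age")
kept <- info[complete.cases(info[, vars]), vars]

# Count distinct values per variable among the retained samples.
n_levels <- vapply(kept, function(x) length(unique(x)), integer(1))
n_levels

# Variables that no longer vary in the retained subset.
constant_vars <- names(n_levels)[n_levels < 2]
constant_vars
```

Here only female samples are retained, so Sex is reported as constant.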

This situation is indicated by the following errors:

Errors with BiocParallel multithreading backend




variancePartition documentation built on Nov. 8, 2020, 5:18 p.m.