View source: R/createFormula.R
createFormula | R Documentation |
Create model formula and corresponding data frame of variables for model fitting
createFormula(experiment_info, cols_fixed = NULL, cols_random = NULL)
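experiment_info | Data frame containing all factors and covariates of interest (also previously provided to prepareData). |
cols_fixed | Argument specifying columns of experiment_info to include as fixed effect terms in the model formula. |
cols_random | Argument specifying columns of experiment_info to include as random intercept terms in the model formula. |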
Creates a model formula and corresponding data frame of variables specifying the models to be fitted. (Alternatively, createDesignMatrix can be used to generate a design matrix instead of a model formula.)
The output is a list containing the model formula and corresponding data frame of variables (one column per formula term). These can then be provided to differential testing functions that require a model formula, together with the main data object and contrast matrix.
The experiment_info input (which was also previously provided to prepareData) should be a data frame containing all factors and covariates of interest. For example, depending on the experimental design, this may include the following columns:
group IDs (e.g. groups for differential testing)
block IDs (e.g. patient IDs in a paired design; these may be included as either fixed effects or random effects)
batch IDs (batch effects)
continuous covariates
sample IDs (e.g. to include random intercept terms for each sample, to account for overdispersion typically seen in high-dimensional cytometry data; this is known as an 'observation-level random effect' (OLRE); see Nowicka et al., 2017, F1000Research for more details)
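As an illustration of the column types listed above, the following sketch builds an experiment_info data frame that contains one column of each kind. The column names batch_id and age, and all values, are hypothetical and chosen only for this example.

```r
# Hypothetical paired design: 8 samples, 2 groups, 4 patients, 2 batches.
# Column names 'batch_id' and 'age' are assumptions for illustration.
experiment_info <- data.frame(
  sample_id = factor(paste0("sample", 1:8)),               # sample IDs (for OLRE terms)
  group_id = factor(rep(paste0("group", 1:2), each = 4)),  # group IDs
  patient_id = factor(rep(paste0("patient", 1:4), 2)),     # block IDs (paired design)
  batch_id = factor(rep(paste0("batch", 1:2), 4)),         # batch IDs
  age = c(34, 51, 47, 29, 34, 51, 47, 29),                 # continuous covariate
  stringsAsFactors = FALSE
)

# Inspect column types: IDs are factors, the covariate is numeric
str(experiment_info)
```

Each patient appears once in each group, which is what makes patient_id usable as a block ID here.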
The arguments cols_fixed and cols_random specify the columns in experiment_info to include as fixed effect terms and random intercept terms, respectively. These can be provided as character vectors of column names, numeric vectors of column indices, or logical vectors. The names for each formula term are taken from the column names of experiment_info.
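The three ways of specifying columns can be compared with ordinary base R data frame subsetting. The sketch below does not call createFormula; it assumes only that column selection behaves like standard `[` indexing on a data frame, and shows that names, indices, and a logical vector can all pick out the same columns.

```r
experiment_info <- data.frame(
  sample_id = factor(paste0("sample", 1:8)),
  group_id = factor(rep(paste0("group", 1:2), each = 4)),
  patient_id = factor(rep(paste0("patient", 1:4), 2))
)

# Three equivalent ways to select the 'sample_id' and 'patient_id' columns
by_name <- experiment_info[, c("sample_id", "patient_id"), drop = FALSE]
by_index <- experiment_info[, c(1, 3), drop = FALSE]
by_logical <- experiment_info[, c(TRUE, FALSE, TRUE), drop = FALSE]

# All three selections contain the same columns
identical(by_name, by_index)
identical(by_name, by_logical)

# Term names would be taken from these column names
colnames(by_name)
```

Column names are usually the safest choice, since indices and logical vectors silently select different columns if the column order of experiment_info changes.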
Note that for some methods, random effect terms (e.g. for block IDs) must be provided directly to the differential testing function instead (testDA_voom and testDS_limma).
If there are no random effect terms, it will usually be simpler to use a design matrix instead of a model formula; see createDesignMatrix.
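For a design with fixed effects only, the design matrix idea can be sketched with base R's model.matrix(). This is an illustration of the concept and not necessarily how createDesignMatrix is implemented or what its exact output looks like.

```r
experiment_info <- data.frame(
  sample_id = factor(paste0("sample", 1:8)),
  group_id = factor(rep(paste0("group", 1:2), each = 4))
)

# Design matrix with an intercept and one indicator column for the group effect
design <- model.matrix(~ group_id, data = experiment_info)

dim(design)       # one row per sample, one column per model coefficient
colnames(design)
```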
Returns a list with three elements:
formula: model formula
data: data frame of variables corresponding to the model formula
random_terms: TRUE if model formula contains any random effect terms
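To make the three elements concrete, the sketch below assembles an analogous list by hand in base R. It assumes lme4-style `(1 | term)` syntax for random intercept terms and a placeholder response variable named `y`; both are assumptions for illustration, and the actual output of createFormula may differ in detail.

```r
experiment_info <- data.frame(
  sample_id = factor(paste0("sample", 1:8)),
  group_id = factor(rep(paste0("group", 1:2), each = 4)),
  patient_id = factor(rep(paste0("patient", 1:4), 2))
)

cols_fixed <- "group_id"
cols_random <- c("sample_id", "patient_id")

# Right-hand side: fixed terms as-is, random terms wrapped as random intercepts
# (assumed lme4-style syntax)
rhs_random <- paste0("(1 | ", cols_random, ")")
rhs <- paste(c(cols_fixed, rhs_random), collapse = " + ")

sketch <- list(
  formula = as.formula(paste("y ~", rhs)),  # 'y' is a placeholder response
  data = experiment_info[, c(cols_fixed, cols_random), drop = FALSE],
  random_terms = length(cols_random) > 0
)

sketch$formula
colnames(sketch$data)   # one column per formula term
sketch$random_terms
```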
# For a complete workflow example demonstrating each step in the 'diffcyt' pipeline,
# see the package vignette.
# Example: model formula
experiment_info <- data.frame(
sample_id = factor(paste0("sample", 1:8)),
group_id = factor(rep(paste0("group", 1:2), each = 4)),
patient_id = factor(rep(paste0("patient", 1:4), 2)),
stringsAsFactors = FALSE
)
createFormula(experiment_info, cols_fixed = "group_id", cols_random = c("sample_id", "patient_id"))