Perform robust differential abundance analysis directly on peptide abundances from a label-free quantitative proteomics experiment. Alternatively, first robustly summarize the peptide abundances to protein abundances and then perform robust differential abundance analysis on the summarized protein abundances.
msqrobsum(data, formulas, group_vars = "protein", contrasts = NULL,
  mode = c("msqrobsum", "msqrob", "sum"), robust_lmer_iter = "auto",
  squeeze_variance = TRUE, p_adjust_method = c("BH", p.adjust.methods),
  keep_model = FALSE, rlm_args = list(maxit = 20L),
  lmer_args = list(control = lmerControl(calc.derivs = FALSE)),
  parallel_args = list(strategy = "multisession"),
  type_df = "traceHat", squeeze_covariate = FALSE, fit_fun = do_mm)
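The vector-valued defaults in the signature above (for example `mode` and `p_adjust_method`) follow a common R idiom in which the first element is the default and the rest are the allowed values. The sketch below shows how such defaults are conventionally resolved with `match.arg`. The helper `resolve_defaults` is hypothetical and only mimics the signature; whether msqrobsum resolves its arguments this way internally is an assumption.

```r
# Hypothetical helper that mimics the vector defaults of the signature.
# It is NOT part of the package; it only illustrates the match.arg idiom.
resolve_defaults <- function(mode = c("msqrobsum", "msqrob", "sum"),
                             p_adjust_method = c("BH", p.adjust.methods)) {
  # match.arg returns the first choice when the argument is left at its default
  mode <- match.arg(mode)
  p_adjust_method <- match.arg(p_adjust_method)
  list(mode = mode, p_adjust_method = p_adjust_method)
}

defaults <- resolve_defaults()
only_sum <- resolve_defaults(mode = "sum")
print(defaults)
print(only_sum)
```

Under this idiom, leaving `mode` unset selects the first listed value, and passing a value outside the listed choices raises an error.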
data |
MSnSet object or data frame with at least the following 3 columns:
|
formulas |
Vector of formulas. These are the MSqRob model specifications. Each is a two-sided linear “lme4” formula object describing both the fixed-effects and random-effects parts of the model, with the response on the left of a ~ operator and the terms, separated by + operators, on the right. Random-effects terms are distinguished by vertical bars (|) separating expressions for design matrices from grouping factors (e.g. (1 | treatment)). See the “lme4” package for more details. When multiple formulas are specified, they are tried in order: the first formula is fitted first; if that fit fails, the second is tried, and so on. |
group_vars |
Character vector of variable names. The variables used to group the data (e.g. protein id). A model will be fitted for each group. |
contrasts |
Numeric matrix with contrasts, or a character string with a variable name. When a variable name is specified, it should correspond to a categorical variable (e.g. treatment) that is specified in the model as a random effect. Every possible contrast will then be calculated. A contrast matrix should likewise only involve categorical parameters specified as random effects. This is because the reference level in the model can change between groups (e.g. proteins) due to missing category levels. |
mode |
Character. "msqrobsum": summarization and MSqRob analysis are performed on the data. "msqrob": only MSqRob analysis is performed on the data. "sum": only summarization is performed on the data. |
robust_lmer_iter |
Integer or "auto". Number of iterations used for robust estimation in MSqRob (M-estimation with Huber weights). When set to "auto", defaults to 1 if mode = "msqrobsum" and 20 if mode = "msqrob". |
squeeze_variance |
Logical. If “TRUE”, the residual standard deviations of all models are squeezed towards a common value. |
p_adjust_method |
Character. Correction method for multiple testing. Defaults to "BH". See “stats::p.adjust” for more information on all available methods. |
keep_model |
Logical. “TRUE” if you want to keep all lme4 models in the output (memory heavy). Defaults to “FALSE”. |
rlm_args |
Named list. All parameters to be passed to the 'rlm' function used in the summarization step. When an empty list is given, the default parameters of 'rlm' are used. See “MASS::rlm” for more information on all parameters and default settings. |
lmer_args |
Named list. All parameters to be passed to the 'lmer' function used in the MSqRob analysis. When an empty list is given, the default parameters of 'lmer' are used. See “lme4::lmer” for more information on all parameters and default settings. |
parallel_args |
Named list. All parameters to be passed to the 'plan' function from the 'future' package, which allows for parallelization. Set strategy = "multisession" to allow parallelization using all available cores (default). Set the “workers” parameter to an integer to choose the number of cores to be used. Set strategy = "sequential" to disable parallelization. See “future::plan” for more information on all available parallelization strategies and other parameters with their default settings. |
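As an illustration of the contrasts argument described above, the following base-R sketch builds a numeric matrix holding every pairwise contrast between the levels of a categorical variable. The level names, the variable name `condition`, and the row-naming scheme (variable name pasted to level) are assumptions made for illustration; check the package's own documentation for the exact row and column names it expects.

```r
# Hypothetical condition levels; replace with the levels in your own data.
lvls <- c("a", "b", "c")

# All unordered pairs of levels, one column per pair.
pairs <- combn(lvls, 2)

# One row per level, one column per pairwise contrast (second minus first).
contrast_matrix <- apply(pairs, 2, function(p) {
  (lvls == p[2]) - (lvls == p[1])
})

# Assumption: rows are named as variable name followed by level name.
dimnames(contrast_matrix) <- list(
  paste0("condition", lvls),
  paste(pairs[2, ], pairs[1, ], sep = "-")
)

print(contrast_matrix)
```

Each column sums to zero, so every contrast compares exactly two levels and does not depend on which level happens to be the reference in a given group.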
A data frame. The following columns are present:
## Robust summarization from peptide intensities to protein summaries on
## the built-in data set with peptide intensities from 100 proteins.
## For only 100 proteins we will not benefit from parallelization because
## robust summarization is a fairly fast routine.
results1 <- msqrobsum( data = peptide_intensities
, mode = 'sum'
, group_vars = 'protein'
, parallel_args = list(strategy = 'sequential'))
## MSqRobSum analysis
## There are 20 samples belonging to 5 different conditions.
## Differential expression is tested between all conditions.
form = expression ~ (1|condition)
results2 <- msqrobsum(data = peptide_intensities
, formulas = form
, mode = 'msqrobsum'
, group_vars = 'protein'
, contrasts = 'condition'
, parallel_args = list(strategy = 'sequential'))
## MSqRob analysis
## There are 20 samples belonging to 5 different conditions.
## There is no prior summarization from peptide to protein intensities,
## so the model has to take into account the sample and feature (peptide) effects.
## If the first formula cannot be fitted, the second one is tried.
form = c(expression ~ (1|condition) + (1|sample) + (1|feature), expression ~ (1|condition))
## Differential expression is tested between all conditions.
## Fitting the full MSqRob models takes longer than fitting the simplified models in MSqRobSum.
## Parallelization is therefore recommended,
## especially if you have big data sets with many samples and thousands of proteins,
## e.g. if you have 2 available processing cores.
results3 <- msqrobsum(data = peptide_intensities
, formulas = form
, mode = 'msqrob'
, group_vars = 'protein'
, contrasts = 'condition'
, parallel_args = list(strategy = 'multisession', workers = 2))
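The MSqRob example above passes two formulas so that a simpler model can be used when the full model cannot be fitted. The sketch below illustrates that "try each formula in order" pattern in base R. The helper `fit_first_working` is hypothetical, `stats::lm` is used only as a stand-in for the mixed-model fit, and the column `missing_batch` is deliberately absent so that the first formula fails; none of this is the package's actual implementation.

```r
# Hypothetical helper: try each formula in order and return the first
# fit that succeeds. lm() is a stand-in for the real model-fitting function.
fit_first_working <- function(formulas, data, fit_fun = stats::lm) {
  for (f in formulas) {
    fit <- tryCatch(fit_fun(f, data = data), error = function(e) NULL)
    if (!is.null(fit)) {
      return(list(formula = f, fit = fit))
    }
  }
  NULL  # no formula could be fitted
}

# Toy data, invented for illustration only.
toy <- data.frame(
  expression = c(1.0, 1.2, 2.1, 2.3),
  condition  = factor(c("a", "a", "b", "b"))
)

# The first formula refers to a column that does not exist, so fitting it
# raises an error and the second formula is used instead.
formulas <- c(expression ~ condition + missing_batch, expression ~ condition)
chosen <- fit_first_working(formulas, toy)
print(chosen$formula)
```

Ordering the formulas from most to least complex means each group gets the richest model its data can support.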