groupComparison: Whole plot testing
In Vitek-Lab/MSstats: Protein Significance Analysis in DDA, SRM and DIA for Label-free or Label-based Proteomics Experiments

groupComparison

R Documentation

Whole plot testing

Description

Whole plot testing

Usage

groupComparison(
  contrast.matrix,
  data,
  save_fitted_models = TRUE,
  log_base = 2,
  use_log_file = TRUE,
  append = FALSE,
  verbose = TRUE,
  log_file_path = NULL,
  numberOfCores = 1
)

Arguments

`contrast.matrix`	matrix specifying the comparisons between conditions of interest.
`data`	the data set to analyze: the output of the dataProcess function.
`save_fitted_models`	logical, if TRUE, fitted models will be added to the output.
`log_base`	base of the logarithm used in dataProcess.
`use_log_file`	logical. If TRUE, information about data processing will be saved to a file.
`append`	logical. If TRUE, information about data processing will be added to an existing log file.
`verbose`	logical. If TRUE, information about data processing will be printed to the console.
`log_file_path`	character. Path to a file to which information about data processing will be saved. If not provided, such a file will be created automatically. If 'append = TRUE', has to be a valid path to a file.
`numberOfCores`	Number of cores for parallel processing. When > 1, a logfile named 'MSstats_groupComparison_log_progress.log' is created to track progress. Only works for Linux & Mac OS. Default is 1.
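
Because numberOfCores is documented as working only on Linux and Mac OS, it can help to pick its value programmatically. The following is a minimal sketch in base R; the helper choose_cores and its reserve argument are hypothetical names introduced here and are not part of MSstats.

```r
# Hypothetical helper (not part of MSstats): choose a value for numberOfCores.
choose_cores <- function(reserve = 1) {
  # Parallel processing is documented as Linux/Mac OS only, so stay serial on Windows.
  if (.Platform$OS.type == "windows") return(1L)
  available <- parallel::detectCores()
  # detectCores() may return NA when the core count cannot be determined.
  if (is.na(available)) return(1L)
  # Leave `reserve` cores free for the rest of the system, but never go below 1.
  max(1L, available - as.integer(reserve))
}

n_cores <- choose_cores()
n_cores
```

The result could then be passed as numberOfCores = n_cores in the call to groupComparison.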

Details

contrast.matrix : the comparison of interest. Based on the levels of the conditions, assign 1 or -1 to the conditions being compared and 0 to all others. The levels of the conditions are sorted alphabetically; the command levels(QuantData$FeatureLevelData$GROUP_ORIGINAL) shows their actual order.

The underlying model fitting functions are lm for the fixed effects model and lmer for the mixed effects model. The input of this function is the quantitative data returned by dataProcess.
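
The coding rule above (1 and -1 for the compared conditions, 0 otherwise) can be sketched in base R for several comparisons at once. The condition names and the helper make_contrast are hypothetical and only illustrate the layout of the matrix; in practice the column names should match the levels of your own data.

```r
# Hypothetical condition names, sorted alphabetically as described above.
conditions <- sort(c("Control", "DiseaseA", "DiseaseB"))

# Hypothetical helper (not part of MSstats): build one contrast row with
# 1 for the numerator condition, -1 for the denominator, and 0 elsewhere.
make_contrast <- function(numerator, denominator, levels) {
  stopifnot(numerator %in% levels, denominator %in% levels)
  row <- setNames(numeric(length(levels)), levels)
  row[numerator] <- 1
  row[denominator] <- -1
  row
}

# One row per comparison; row names become the comparison labels.
contrast <- rbind(
  "DiseaseA-Control"  = make_contrast("DiseaseA", "Control", conditions),
  "DiseaseB-Control"  = make_contrast("DiseaseB", "Control", conditions),
  "DiseaseB-DiseaseA" = make_contrast("DiseaseB", "DiseaseA", conditions)
)
contrast
```

Each row sums to zero, and the column order follows the sorted condition levels, which is the order the description above refers to.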

Value

A list with the following components:

ComparisonResult

A 'data.frame' containing the results of the statistical testing for each protein. The columns include:

Protein

The name of the protein for which the comparison is made.

Label

The label of the comparison, typically derived from the 'contrast.matrix'.

log2FC

The log2 fold change between the conditions being compared. The base of the logarithm is specified by the 'log_base' parameter.

'log2FC = Inf' or '-Inf': This occurs when one condition has entirely missing measurements for a protein, resulting in an undefined ratio.
'log2FC' is a numeric value but all other columns are 'NA': This occurs when there is only one sample per condition. Fold change can be estimated, but variance cannot be estimated, so no statistical testing is possible.

SE

The standard error of the log2 fold change estimate. May be 'NA' when variance cannot be estimated (e.g., when only one sample per group).

Tvalue

The t-statistic value for the comparison. May be 'NA' when variance cannot be estimated (e.g., when only one sample per group).

DF

The degrees of freedom associated with the t-statistic. A value of 0 indicates that, although variance could be estimated, the total number of observations is too small to support hypothesis testing.

pvalue

The p-value for the statistical test of the comparison. Available only when the degrees of freedom are greater than 0.

adj.pvalue

The adjusted p-value using the Benjamini-Hochberg method for controlling the false discovery rate.

issue

Any issues encountered during the comparison. NA indicates no issues. "oneConditionMissing" occurs when data for one of the conditions being compared is entirely missing for a particular protein.

MissingPercentage

The percentage of missing features for a given protein across all runs. This column is included only if missing values were imputed.

ImputationPercentage

The percentage of features that were imputed for a given protein across all runs. This column is included only if missing values were imputed.
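
The special cases described for these columns (infinite fold changes, fold changes without a test, and adjusted p-values) can be handled as in the sketch below. It uses a small invented table in place of a real ComparisonResult; the values, the status column, and the 5% and |log2FC| > 1 cutoffs are assumptions for illustration only. The Benjamini-Hochberg step uses base R's p.adjust on the toy p-values and does not claim to reproduce how groupComparison groups p-values internally.

```r
# Toy stand-in for a ComparisonResult table (all values invented for illustration).
res <- data.frame(
  Protein = c("P1", "P2", "P3", "P4", "P5"),
  Label   = "T7-T1",
  log2FC  = c(1.8, -0.2, Inf, 0.9, -1.1),
  SE      = c(0.30, 0.25, NA, NA, 0.40),
  DF      = c(18, 18, NA, NA, 18),
  pvalue  = c(0.0004, 0.43, NA, NA, 0.013),
  issue   = c(NA, NA, "oneConditionMissing", NA, NA),
  stringsAsFactors = FALSE
)

# Benjamini-Hochberg adjustment; NA p-values are ignored and stay NA.
res$adj.pvalue <- p.adjust(res$pvalue, method = "BH")

# Classify each row by the situations described for log2FC.
res$status <- ifelse(
  is.infinite(res$log2FC), "one condition missing",
  ifelse(is.na(res$pvalue), "fold change only, no test", "tested")
)

# Proteins passing the (hypothetical) adjusted p-value and fold change cutoffs.
hits <- subset(res, status == "tested" & adj.pvalue < 0.05 & abs(log2FC) > 1)
hits[, c("Protein", "log2FC", "adj.pvalue")]
```

Rows with an infinite log2FC or a missing p-value are kept in the table but excluded from the list of hits, so they can be reviewed separately.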

ModelQC

A 'data.frame' containing quality control data used to fit models for group comparison. The columns include:

RUN: Identifier for the specific MS run.
Protein: Identifier for the protein.
ABUNDANCE: Summarized intensity for the protein in a given run.
originalRUN: Original run identifier before any processing.
GROUP: Experimental group identifier.
SUBJECT: Subject identifier within the experimental group.
TotalGroupMeasurements: Total number of feature measurements for the protein in the given group.
NumMeasuredFeatures: Number of features measured for the protein in the given run.
MissingPercentage: Percentage of missing feature values for the protein in the given run.
more50missing: Logical indicator of whether more than 50 percent of the feature values are missing for the protein in the given run.
NumImputedFeature: Number of features for which values were imputed due to missing or censored data for the protein in the given run.
residuals: Contains the differences between the observed values and the values predicted by the fitted model.
fitted: The predicted values obtained from the model for a protein measurement for a given run in the dataset.
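
To make the residuals and fitted columns concrete, the sketch below fits a fixed effects model with lm to invented abundances for a single protein. The data and the formula ABUNDANCE ~ GROUP are assumptions of this example; the model that groupComparison actually fits depends on the experimental design.

```r
# Toy protein-level data for one protein: summarized log2 abundance per run.
# All values are invented for illustration.
toy <- data.frame(
  GROUP     = factor(rep(c("1", "7"), each = 3)),
  ABUNDANCE = c(20.1, 19.8, 20.4, 21.9, 22.3, 21.7)
)

# Fixed effects model with GROUP as the only term (an assumption of this sketch).
fit <- lm(ABUNDANCE ~ GROUP, data = toy)

# Fitted values are the model predictions; residuals are observed minus fitted.
toy$fitted    <- unname(fitted(fit))
toy$residuals <- unname(residuals(fit))

# Check that observed = fitted + residual for every run.
all.equal(toy$ABUNDANCE, toy$fitted + toy$residuals)

# Estimated difference between group 7 and group 1 on the log2 scale,
# with its standard error, t value, and p-value.
coef(summary(fit))["GROUP7", ]
```

With one term and two groups, the fitted value for each run is simply its group mean, and the GROUP7 coefficient is the difference between the two group means.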

FittedModel

A list of fitted models for each protein. This is included only if 'save_fitted_models' is set to TRUE. Each element of the list corresponds to a protein and contains the fitted model object.

Examples

# Consider quantitative data (i.e. QuantData) from a yeast study with ten time points of interest,
# three biological replicates, and no technical replicates.
# It is a time-course experiment, and we test for differential abundance
# between time points 1 and 7 in a set of targeted proteins.
# In this label-based SRM experiment, MSstats uses the fitted model with expanded scope of
# biological replication.
library(MSstats)
QuantData <- dataProcess(SRMRawData, use_log_file = FALSE)
head(QuantData$FeatureLevelData)
levels(QuantData$ProteinLevelData$GROUP)
# One contrast row: -1 for time point 1, 1 for time point 7, 0 elsewhere.
comparison <- matrix(c(-1, 0, 0, 0, 0, 0, 1, 0, 0, 0), nrow = 1)
row.names(comparison) <- "T7-T1"
# Name the columns by condition, in numeric order of the time points.
groups <- levels(QuantData$ProteinLevelData$GROUP)
colnames(comparison) <- groups[order(as.numeric(groups))]
# Tests for differentially abundant proteins with models:
# label-based SRM experiment with expanded scope of biological replication.
testResultOneComparison <- groupComparison(contrast.matrix = comparison, data = QuantData,
                                           use_log_file = FALSE)
# table for result
testResultOneComparison$ComparisonResult

Vitek-Lab/MSstats documentation built on April 14, 2025, 1:43 p.m.
