groupComparison: Whole plot testing

View source: R/groupComparison.R

groupComparison    R Documentation

Whole plot testing

Description

Whole plot testing

Usage

groupComparison(
  contrast.matrix,
  data,
  save_fitted_models = TRUE,
  log_base = 2,
  use_log_file = TRUE,
  append = FALSE,
  verbose = TRUE,
  log_file_path = NULL,
  numberOfCores = 1
)

Arguments

contrast.matrix

matrix defining the comparisons between conditions of interest. See Details.

data

data set produced by the dataProcess function.

save_fitted_models

logical, if TRUE, fitted models will be added to the output.

log_base

base of the logarithm used in dataProcess.

use_log_file

logical. If TRUE, information about data processing will be saved to a file.

append

logical. If TRUE, information about data processing will be added to an existing log file.

verbose

logical. If TRUE, information about data processing will be printed to the console.

log_file_path

character. Path to a file to which information about data processing will be saved. If not provided, such a file will be created automatically. If 'append = TRUE', this has to be a valid path to an existing log file.

numberOfCores

Number of cores for parallel processing. When > 1, a log file named 'MSstats_groupComparison_log_progress.log' is created to track progress. Parallel processing only works on Linux and Mac OS. Default is 1.
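Because parallel processing is described above as working only on Linux and Mac OS, one way to pick a value for numberOfCores is to check the platform first. The sketch below uses only base R and the parallel package. The helper name choose_cores and its max_cores cap are hypothetical and are not part of MSstats.

```r
# Hypothetical helper (not part of MSstats): choose a value for numberOfCores.
choose_cores <- function(max_cores = 4L) {
  # Parallel processing is documented as Linux / Mac OS only,
  # so fall back to a single core on other platforms.
  if (.Platform$OS.type != "unix") {
    return(1L)
  }
  available <- parallel::detectCores()
  # detectCores() may return NA when the count cannot be determined.
  if (is.na(available)) {
    return(1L)
  }
  # Leave one core free and respect the caller's cap.
  as.integer(max(1L, min(max_cores, available - 1L)))
}

n_cores <- choose_cores()
n_cores
```

The result could then be passed to the function as numberOfCores = n_cores.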

Details

contrast.matrix: the comparisons of interest. Each row defines one comparison and each column corresponds to one level of the conditions. Assign 1 or -1 to the conditions being compared and 0 to all others. The levels of the conditions are sorted alphabetically; the command levels(QuantData$FeatureLevelData$GROUP_ORIGINAL) shows their actual order. The underlying model fitting functions are lm for the fixed effects model and lmer for the mixed effects model. The input of this function is the quantitative data returned by the dataProcess function.
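As an illustration of the 1 / -1 / 0 coding described above, the following sketch builds a contrast matrix with several comparisons using base R only. The condition names and the helper make_contrast are hypothetical; with real data, the column names would have to match the condition levels of the processed data set.

```r
# Hypothetical condition levels, sorted alphabetically.
conditions <- sort(c("Control", "TreatmentA", "TreatmentB"))

# Hypothetical helper (not part of MSstats): one contrast row with
# 1 for the numerator condition, -1 for the denominator, 0 elsewhere.
make_contrast <- function(numerator, denominator, levels) {
  stopifnot(numerator %in% levels, denominator %in% levels)
  row <- setNames(numeric(length(levels)), levels)
  row[numerator] <- 1
  row[denominator] <- -1
  row
}

# Each row is one comparison; row names become the comparison labels.
contrast <- rbind(
  "TreatmentA-Control"    = make_contrast("TreatmentA", "Control", conditions),
  "TreatmentB-Control"    = make_contrast("TreatmentB", "Control", conditions),
  "TreatmentB-TreatmentA" = make_contrast("TreatmentB", "TreatmentA", conditions)
)
contrast
```

Naming the columns explicitly, rather than relying on position alone, makes it easier to spot a mismatch between the intended comparison and the actual order of the levels.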

Value

A list with the following components:

ComparisonResult

A 'data.frame' containing the results of the statistical testing for each protein. The columns include:

Protein

The name of the protein for which the comparison is made.

Label

The label of the comparison, typically derived from the 'contrast.matrix'.

log2FC

The log fold change between the conditions being compared. The base of the logarithm is specified by the 'log_base' parameter (2 by default).

SE

The standard error of the log fold change estimate.

Tvalue

The t-statistic value for the comparison.

DF

The degrees of freedom associated with the t-statistic.

pvalue

The p-value for the statistical test of the comparison.

adj.pvalue

The adjusted p-value using the Benjamini-Hochberg method for controlling the false discovery rate.

issue

Any issues encountered during the comparison. NA indicates no issues. "oneConditionMissing" occurs when data for one of the conditions being compared is entirely missing for a particular protein.

MissingPercentage

The percentage of missing features for a given protein across all runs. This column is included only if missing values were imputed.

ImputationPercentage

The percentage of features that were imputed for a given protein across all runs. This column is included only if missing values were imputed.
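To make the relationship between the log2FC, SE, Tvalue, DF, pvalue and adj.pvalue columns concrete, here is a minimal sketch in base R. It assumes the usual two-sided t-test and a Benjamini-Hochberg adjustment across the tested proteins; the numbers are made up, and the exact internal computation in MSstats may differ in detail.

```r
# Made-up estimates for three hypothetical proteins (not MSstats output).
result <- data.frame(
  Protein = c("P1", "P2", "P3"),
  log2FC  = c(1.20, -0.15, 0.80),
  SE      = c(0.30, 0.25, 0.40),
  DF      = c(10, 10, 10),
  stringsAsFactors = FALSE
)

# t-statistic: the estimate divided by its standard error.
result$Tvalue <- result$log2FC / result$SE

# Two-sided p-value from the t distribution with DF degrees of freedom.
result$pvalue <- 2 * pt(-abs(result$Tvalue), df = result$DF)

# Benjamini-Hochberg adjustment to control the false discovery rate.
result$adj.pvalue <- p.adjust(result$pvalue, method = "BH")

result
```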

ModelQC

A 'data.frame' containing quality control data used to fit models for group comparison. The columns include:

RUN

Identifier for the specific MS run.

Protein

Identifier for the protein.

ABUNDANCE

Summarized intensity for the protein in a given run.

originalRUN

Original run identifier before any processing.

GROUP

Experimental group identifier.

SUBJECT

Subject identifier within the experimental group.

TotalGroupMeasurements

Total number of feature measurements for the protein in the given group.

NumMeasuredFeatures

Number of features measured for the protein in the given run.

MissingPercentage

Percentage of missing feature values for the protein in the given run.

more50missing

Logical indicator of whether more than 50 percent of the feature values are missing for the protein in the given run.

NumImputedFeature

Number of features for which values were imputed due to missing or censored data for the protein in the given run.

residuals

The differences between the observed values and the values predicted by the fitted model.

fitted

The value predicted by the fitted model for the protein in the given run.

FittedModel

A list of fitted models for each protein. This is included only if 'save_fitted_models' is set to TRUE. Each element of the list corresponds to a protein and contains the fitted model object.
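The fitted and residuals columns of ModelQC, and the model objects in FittedModel, can be understood through a plain lm fit. The sketch below fits a fixed effects model to made-up abundances for a single hypothetical protein. It only illustrates the idea and is not the model specification MSstats uses internally.

```r
# Made-up protein-level abundances for one hypothetical protein.
protein_data <- data.frame(
  GROUP     = factor(rep(c("A", "B"), each = 3)),
  ABUNDANCE = c(10.1, 9.8, 10.3, 11.9, 12.2, 12.0)
)

# Fixed effects model with the experimental group as the only predictor.
fit <- lm(ABUNDANCE ~ GROUP, data = protein_data)

# Predicted value per run, and observed minus predicted.
protein_data$fitted    <- unname(fitted(fit))
protein_data$residuals <- unname(residuals(fit))

# Each observed value equals its fitted value plus its residual.
all.equal(protein_data$ABUNDANCE,
          protein_data$fitted + protein_data$residuals)

# The group coefficient is the estimated difference between group means.
coef(fit)
```

Plotting residuals against fitted values (for example with plot(fit, which = 1)) is a common way to check model assumptions for a protein of interest.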

Examples

# Consider quantitative data (i.e. QuantData) from a yeast study with ten time points of interest,
# three biological replicates, and no technical replicates.
# It is a time-course experiment, and we test for differential abundance
# between time points 1 and 7 in a set of targeted proteins.
# In this label-based SRM experiment, MSstats uses the fitted model with expanded scope of
# biological replication.
QuantData <- dataProcess(SRMRawData, use_log_file = FALSE)
head(QuantData$FeatureLevelData)
levels(QuantData$ProteinLevelData$GROUP)
comparison <- matrix(c(-1, 0, 0, 0, 0, 0, 1, 0, 0, 0), nrow = 1)
row.names(comparison) <- "T7-T1"
groups <- levels(QuantData$ProteinLevelData$GROUP)
colnames(comparison) <- groups[order(as.numeric(groups))]
# Tests for differentially abundant proteins with models:
# label-based SRM experiment with expanded scope of biological replication.
testResultOneComparison <- groupComparison(contrast.matrix=comparison, data=QuantData,
                                           use_log_file = FALSE)
# table for result
testResultOneComparison$ComparisonResult
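
A typical follow-up to the result table is to keep only proteins without issues that pass a significance and effect-size threshold. The sketch below uses a small stand-in data.frame with made-up values in place of testResultOneComparison$ComparisonResult, so it runs without MSstats. The cutoffs (0.05 and 1) are illustrative assumptions, not recommendations from the package.

```r
# Stand-in for testResultOneComparison$ComparisonResult (made-up values).
comparison_result <- data.frame(
  Protein    = c("P1", "P2", "P3", "P4"),
  Label      = "T7-T1",
  log2FC     = c(1.8, -0.2, -1.5, Inf),
  adj.pvalue = c(0.001, 0.60, 0.03, NA),
  issue      = c(NA, NA, NA, "oneConditionMissing"),
  stringsAsFactors = FALSE
)

# Keep proteins with no reported issue, adjusted p-value below 0.05,
# and an absolute log2 fold change above 1 (illustrative cutoffs).
significant <- subset(
  comparison_result,
  is.na(issue) & adj.pvalue < 0.05 & abs(log2FC) > 1
)

# Sort by adjusted p-value, most significant first.
significant[order(significant$adj.pvalue), ]
```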


MeenaChoi/MSstats documentation built on Nov. 10, 2024, 2:43 p.m.