calculate_sample_corr_distr: Calculates correlation for all pairs of the samples in data...

View source: R/correlation-based_diagnostics.R

calculate_sample_corr_distr    R Documentation

Description

Calculates the correlation for every pair of samples in the data matrix and labels each pair as replicated, same_batch, or unrelated in the output columns (see "Value").

Usage

calculate_sample_corr_distr(data_matrix, sample_annotation,
  repeated_samples = NULL, biospecimen_id_col = "EarTag",
  sample_id_col = "FullRunName", batch_col = "MS_batch")

Arguments

data_matrix

matrix of features (in rows) vs. samples (in columns), with feature IDs as row names and file/sample names as column names. See "example_proteome_matrix" for more details (to view its description, use help("example_proteome_matrix")).

sample_annotation

data frame with:

  1. sample_id_col (this can be repeated as row names)

  2. biological covariates

  3. technical covariates (batches etc)

See help("example_sample_annotation").
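
As an illustration of the expected layout of the two inputs, the sketch below builds a toy matrix and a matching annotation table in base R. The sample names, the Diet column and all values are made up for illustration; only the column names FullRunName, EarTag and MS_batch follow the defaults shown under "Usage".

```r
# Toy inputs; all sample names, covariates and values are hypothetical.
set.seed(1)
sample_names <- c("run_01", "run_02", "run_03", "run_04")

# Features in rows, samples in columns, with row and column names set.
toy_matrix <- matrix(
  rnorm(3 * 4), nrow = 3,
  dimnames = list(c("feat_A", "feat_B", "feat_C"), sample_names)
)

toy_annotation <- data.frame(
  FullRunName = sample_names,                                # sample_id_col
  EarTag = c("m1", "m1", "m2", "m3"),                        # biospecimen_id_col
  Diet = c("chow", "chow", "high_fat", "chow"),              # biological covariate
  MS_batch = c("Batch_1", "Batch_2", "Batch_1", "Batch_2"),  # technical covariate
  stringsAsFactors = FALSE
)

# Every column of the matrix should have a matching row in the annotation.
all(colnames(toy_matrix) %in% toy_annotation$FullRunName)
```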

repeated_samples

vector of sample IDs to evaluate; if NULL, all samples are taken into account for plotting.

biospecimen_id_col

column in sample_annotation that defines a unique biological ID, which is usually a combination of conditions or groups. Tip: if no such ID exists but one can be defined from several columns, create a new biospecimen_id column.
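
The tip above could be followed with a few lines of base R, for example. This is a minimal sketch: the Strain and Diet columns and their values are hypothetical and stand in for whichever columns jointly identify a biospecimen in your annotation.

```r
# Hypothetical annotation without a single biospecimen ID column.
toy_annotation <- data.frame(
  FullRunName = c("run_01", "run_02", "run_03", "run_04"),
  Strain = c("B6", "B6", "DBA", "DBA"),
  Diet = c("chow", "chow", "chow", "high_fat"),
  stringsAsFactors = FALSE
)

# Combine the columns that together identify a biospecimen.
toy_annotation$biospecimen_id <- paste(
  toy_annotation$Strain, toy_annotation$Diet, sep = "_"
)
toy_annotation$biospecimen_id
```

The new column name can then be supplied as biospecimen_id_col = "biospecimen_id".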

sample_id_col

name of the column in the sample_annotation table where the file names (the colnames of data_matrix) are found.

batch_col

column in sample_annotation that should be used for batch comparison (or another, non-batch factor to be mapped to color in plots).

Value

data frame with the following columns, which are suggested for use as plot_param when plotting with plot_sample_corr_distribution:

  1. replicate

  2. batch_the_same

  3. batch_replicate

  4. batches

Other columns are:

  1. sample_id_1 & sample_id_2, both generated from sample_id_col variable

  2. correlation - correlation of two corresponding samples

  3. batch_1 & batch_2 or analogous, created in the same way as sample_id_1 & sample_id_2
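
To make the shape of such a table concrete, here is a base R sketch that builds a comparable data frame from toy data. It is not the package's implementation. The toy data, the choice of one row per unordered pair, the ":" separator in batches and the exact label strings (taken from the wording in "Description") are all assumptions made for illustration.

```r
# Toy data; all names and values are hypothetical.
set.seed(42)
ids <- c("run_01", "run_02", "run_03", "run_04")
toy_matrix <- matrix(rnorm(20 * 4), nrow = 20, dimnames = list(NULL, ids))
bio <- c(run_01 = "m1", run_02 = "m1", run_03 = "m2", run_04 = "m3")
batch <- c(run_01 = "Batch_1", run_02 = "Batch_2",
           run_03 = "Batch_1", run_04 = "Batch_2")

# Sample-by-sample correlation matrix (cor() correlates columns).
corr_matrix <- cor(toy_matrix, use = "pairwise.complete.obs")

# One row per unordered pair of distinct samples (an assumption).
pairs <- t(combn(ids, 2))
corr_df <- data.frame(sample_id_1 = pairs[, 1], sample_id_2 = pairs[, 2],
                      stringsAsFactors = FALSE)
# Index the correlation matrix by (row name, column name) pairs.
corr_df$correlation <- corr_matrix[pairs]
corr_df$batch_1 <- unname(batch[corr_df$sample_id_1])
corr_df$batch_2 <- unname(batch[corr_df$sample_id_2])

# Assumed meaning of the label columns.
corr_df$replicate <- unname(bio[corr_df$sample_id_1] == bio[corr_df$sample_id_2])
corr_df$batch_the_same <- corr_df$batch_1 == corr_df$batch_2
corr_df$batches <- paste(corr_df$batch_1, corr_df$batch_2, sep = ":")
corr_df$batch_replicate <- ifelse(
  corr_df$replicate, "replicated",
  ifelse(corr_df$batch_the_same, "same_batch", "unrelated")
)
corr_df
```

With a table of this shape, grouping the correlation column by one of the label columns shows whether replicates correlate better than samples that merely share a batch.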

Examples

library(proBatch)  # attach the package that provides this function
corr_distribution <- calculate_sample_corr_distr(
  data_matrix = example_proteome_matrix,
  sample_annotation = example_sample_annotation,
  batch_col = "MS_batch", biospecimen_id_col = "EarTag")


symbioticMe/proBatch documentation built on April 9, 2023, 11:59 a.m.