calculate_PVCA: Calculate variance distribution by variable


View source: R/proteome_wide_diagnostics.R

Description

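Calculate how the variance in the data is distributed among the specified variables (factors), using Principal Variance Component Analysis (PVCA).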

Usage

calculate_PVCA(
  data_matrix,
  sample_annotation,
  feature_id_col = "peptide_group_label",
  sample_id_col = "FullRunName",
  factors_for_PVCA = c("MS_batch", "digestion_batch", "Diet", "Sex", "Strain"),
  pca_threshold = 0.6,
  variance_threshold = 0.01,
  fill_the_missing = -1
)

Arguments

data_matrix

features (in rows) vs samples (in columns) matrix, with feature IDs in rownames and file/sample names as colnames. See "example_proteome_matrix" for more details (to call the description, use help("example_proteome_matrix"))

sample_annotation

data frame with:

  1. sample_id_col (this can be repeated as row names)

  2. biological covariates

  3. technical covariates (batches etc.)

See help("example_sample_annotation").

feature_id_col

name of the column with feature/gene/peptide/protein ID used in the long format representation df_long. In the wide formatted representation data_matrix this corresponds to the row names.

sample_id_col

name of the column in the sample_annotation table where the filenames (the colnames of data_matrix) are found.
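
The relationship between data_matrix, sample_annotation and sample_id_col can be checked before calling the function. The following is a minimal base R sketch on made-up toy data; the peptide and run names, the batch labels and the helper variables (missing_ids, idx, annotation_aligned) are assumptions introduced for illustration and are not part of proBatch.

```r
# Toy inputs (hypothetical values), shaped as described in the arguments above
set.seed(1)
data_matrix <- matrix(
  rnorm(12), nrow = 3,
  dimnames = list(paste0("pep_", 1:3), paste0("run_", 1:4))
)
sample_annotation <- data.frame(
  FullRunName = paste0("run_", c(2, 1, 4, 3)),  # deliberately in a different order
  MS_batch = c("B1", "B1", "B2", "B2"),
  Diet = c("CD", "HFD", "CD", "HFD"),
  stringsAsFactors = FALSE
)
sample_id_col <- "FullRunName"

# Every column of the matrix should have a row in the annotation
missing_ids <- setdiff(colnames(data_matrix), sample_annotation[[sample_id_col]])
stopifnot(length(missing_ids) == 0)

# Reorder the annotation so that row i describes matrix column i
idx <- match(colnames(data_matrix), sample_annotation[[sample_id_col]])
annotation_aligned <- sample_annotation[idx, ]
```

A check of this kind catches mismatched or misspelled file names early, before they surface as less obvious errors later in the analysis.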

factors_for_PVCA

vector of factor names (columns of sample_annotation) that are used in the PVCA

pca_threshold

the minimum proportion of the total variability that the selected principal components need to explain, given as a fraction between 0 and 1 (e.g. 0.6 for 60%)
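
To illustrate what such a threshold means, the sketch below counts how many principal components are needed before the cumulative explained variability reaches pca_threshold. It uses random toy data and an eigendecomposition of the sample correlation matrix; this is an assumed, simplified illustration of the idea and not necessarily the exact selection rule used inside calculate_PVCA.

```r
set.seed(42)
toy <- matrix(rnorm(200 * 10), nrow = 200)  # 200 features x 10 samples (random toy data)
pca_threshold <- 0.6

# Center each feature (row), then decompose the sample-by-sample correlation matrix
centered <- t(scale(t(toy), center = TRUE, scale = FALSE))
eig <- eigen(cor(centered), symmetric = TRUE)

# Proportion of variability explained by each principal component
explained <- eig$values / sum(eig$values)

# Smallest number of components whose cumulative proportion reaches the threshold
n_pcs <- which(cumsum(explained) >= pca_threshold)[1]
```

A higher threshold retains more components, so more of the variability in the data is passed on to the variance component estimation.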

variance_threshold

the minimum proportion of variance (a fraction between 0 and 1) that a factor needs to explain to be reported separately; factors below this value are lumped together
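
The lumping described above can be sketched in base R as follows. The weights are made-up numbers, and the label "Below threshold" as well as the choice to keep the residual separate are assumptions for illustration; the labels in the actual output of calculate_PVCA may differ.

```r
# Hypothetical variance weights per factor (made-up numbers that sum to 1)
weights <- c(MS_batch = 0.45, digestion_batch = 0.004, Diet = 0.12,
             Sex = 0.006, Strain = 0.22, resid = 0.20)
variance_threshold <- 0.01

# Factors explaining less than the threshold (the residual is kept as is)
small <- weights < variance_threshold & names(weights) != "resid"

# Keep the larger factors and sum the small ones into a single entry
lumped <- c(weights[!small], "Below threshold" = sum(weights[small]))
```

Here digestion_batch and Sex each fall below 0.01, so they are reported as one combined entry while the total still sums to 1.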

fill_the_missing

numeric value determining how missing values should be substituted. If NULL, features with missing values are excluded.
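
The two behaviours described for fill_the_missing can be pictured on a small toy matrix. This is a base R sketch of the idea with hypothetical values, not the package's internal code.

```r
# Toy matrix with missing values (hypothetical intensities)
m <- matrix(c(1.2,  NA, 3.1,
              2.0, 2.5, 2.2,
               NA, 0.7, 1.9),
            nrow = 3, byrow = TRUE,
            dimnames = list(paste0("pep_", 1:3), paste0("run_", 1:3)))

# Numeric value: every missing entry is replaced by that value (here -1)
filled <- m
filled[is.na(filled)] <- -1

# NULL: features (rows) containing any missing value are dropped instead
complete_only <- m[stats::complete.cases(m), , drop = FALSE]
```

Substitution keeps all features but introduces artificial values, whereas exclusion keeps only measured values at the cost of fewer features, so the choice can affect the resulting variance weights.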

Value

data frame of weights of Principal Variance Components

Examples

library(proBatch)

# Use the first 150 features of the example matrix to keep the run short
matrix_test <- example_proteome_matrix[1:150, ]
pvca_df <- calculate_PVCA(
  matrix_test, example_sample_annotation,
  factors_for_PVCA = c("MS_batch", "digestion_batch", "Diet", "Sex", "Strain"),
  pca_threshold = 0.6, variance_threshold = 0.01, fill_the_missing = -1
)

proBatch documentation built on Nov. 8, 2020, 4:55 p.m.