prepare_PVCA_df: prepare the weights of Principal Variance Components


View source: R/proteome_wide_diagnostics.R

Description

Prepares the weights of the Principal Variance Components as a data frame ready for plotting.

Usage

prepare_PVCA_df(
  data_matrix,
  sample_annotation,
  feature_id_col = "peptide_group_label",
  sample_id_col = "FullRunName",
  technical_factors = c("MS_batch", "instrument"),
  biological_factors = c("cell_line", "drug_dose"),
  fill_the_missing = -1,
  pca_threshold = 0.6,
  variance_threshold = 0.01
)

Arguments

data_matrix

features (in rows) vs samples (in columns) matrix, with feature IDs in rownames and file/sample names as colnames. See "example_proteome_matrix" for more details (to call the description, use help("example_proteome_matrix"))
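The expected layout can be illustrated with a small toy matrix. This is only a sketch: the feature and sample names below are hypothetical and are not taken from "example_proteome_matrix".

```r
# Hypothetical toy data: 4 peptides (rows) measured in 3 runs (columns).
set.seed(1)
toy_matrix <- matrix(
  rnorm(12, mean = 20, sd = 2),
  nrow = 4,
  dimnames = list(
    paste0("peptide_", 1:4),  # feature IDs as row names
    paste0("run_", 1:3)       # file/sample names as column names
  )
)

dim(toy_matrix)       # number of features, number of samples
rownames(toy_matrix)  # feature IDs
colnames(toy_matrix)  # sample names
```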

sample_annotation

data frame with:

  1. sample_id_col (this can be repeated as row names)

  2. biological covariates

  3. technical covariates (batches etc.)

See help("example_sample_annotation") for more details.
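A minimal annotation table with this structure might look as follows. The run names and covariate values are made up for illustration; only the column names FullRunName, cell_line and MS_batch come from the defaults shown under Usage.

```r
# Hypothetical sample names; in practice these would be colnames(data_matrix).
run_names <- c("run_1", "run_2", "run_3", "run_4")

toy_annotation <- data.frame(
  FullRunName = run_names,                     # sample_id_col
  cell_line   = c("A", "A", "B", "B"),         # biological covariate
  MS_batch    = c("b1", "b2", "b1", "b2"),     # technical covariate
  stringsAsFactors = FALSE
)

# Optionally repeat the sample IDs as row names.
rownames(toy_annotation) <- toy_annotation$FullRunName

# Sanity check: every sample in the matrix has an annotation row.
all_annotated <- all(run_names %in% toy_annotation[["FullRunName"]])
```

A check of this kind before calling the function can catch mismatches between the matrix columns and the annotation early.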

feature_id_col

name of the column with feature/gene/peptide/protein ID used in the long format representation df_long. In the wide formatted representation data_matrix this corresponds to the row names.

sample_id_col

name of the column in the sample_annotation table where the file names (the colnames of data_matrix) are found.

technical_factors

vector of sample_annotation column names that are technical covariates

biological_factors

vector of sample_annotation column names that are biologically meaningful covariates

fill_the_missing

numeric value with which missing values are substituted. If NULL, features with missing values are excluded.
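The two behaviours described here can be sketched in base R. The helper handle_missing below is hypothetical and only mirrors the description above; it is not the package's internal code.

```r
# Hypothetical helper mirroring the documented behaviour (not proBatch code).
handle_missing <- function(m, fill_the_missing = -1) {
  if (is.null(fill_the_missing)) {
    # NULL: keep only features (rows) without any missing value
    m[stats::complete.cases(m), , drop = FALSE]
  } else {
    # numeric: substitute every missing value with the given number
    m[is.na(m)] <- fill_the_missing
    m
  }
}

# Toy matrix with one missing value (feature f2, sample s1).
toy <- matrix(
  c(1, NA, 3, 4, 5, 6),
  nrow = 3,
  dimnames = list(c("f1", "f2", "f3"), c("s1", "s2"))
)

filled        <- handle_missing(toy, fill_the_missing = -1)
complete_only <- handle_missing(toy, fill_the_missing = NULL)
```

Here filled keeps all three features, while complete_only drops f2.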

pca_threshold

the minimum fraction of the total variability (a value between 0 and 1, e.g. 0.6) that the selected principal components need to explain
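The idea of such a threshold can be illustrated with stats::prcomp on random toy data. This is a sketch of the general principle (keep the smallest number of components whose cumulative explained variance reaches the threshold); how proBatch performs the decomposition internally may differ.

```r
# Hypothetical toy data: 20 features x 10 samples.
set.seed(42)
toy <- matrix(rnorm(200), nrow = 20)

# PCA with samples as observations.
pca <- prcomp(t(toy), center = TRUE, scale. = FALSE)

# Fraction of variance explained by each component, and its running total.
var_explained <- pca$sdev^2 / sum(pca$sdev^2)
cum_var <- cumsum(var_explained)

# Smallest number of components that together explain at least 60%.
pca_threshold <- 0.6
n_pcs <- which(cum_var >= pca_threshold)[1]
```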

variance_threshold

the minimum fraction of variance (weight) that each covariate needs to explain; covariates falling below this threshold are lumped together
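The lumping step can be sketched with a named vector of weights. The weights and the label "below threshold" are invented for illustration; the labels and values produced by the package will differ.

```r
# Hypothetical variance weights per covariate (they sum to 1).
weights <- c(
  MS_batch   = 0.45,
  cell_line  = 0.30,
  instrument = 0.005,
  drug_dose  = 0.004,
  resid      = 0.241
)

variance_threshold <- 0.01

# Covariates explaining less than the threshold are summed into one entry.
small  <- weights < variance_threshold
lumped <- c(weights[!small], "below threshold" = sum(weights[small]))
```

The total weight is unchanged by lumping; only the number of categories shown in a plot is reduced.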

Value

data frame with weights and factors, combined in a way ready for plotting

Examples

matrix_test <- example_proteome_matrix[1:150, ]
pvca_df_res <- prepare_PVCA_df(
  matrix_test, example_sample_annotation,
  technical_factors = c("MS_batch", "digestion_batch"),
  biological_factors = c("Diet", "Sex", "Strain"),
  pca_threshold = .6, variance_threshold = .01, fill_the_missing = -1
)

proBatch documentation built on Nov. 8, 2020, 4:55 p.m.