plot_sample_corr_distribution: Create violin plot of sample correlation distribution


View source: R/correlation-based_diagnostics.R

Description

Useful for visualizing how sample correlations differ within batches, within replicates, and between unrelated samples.

Usage

plot_sample_corr_distribution(
  data_matrix,
  sample_annotation,
  repeated_samples = NULL,
  sample_id_col = "FullRunName",
  batch_col = "MS_batch",
  biospecimen_id_col = "EarTag",
  filename = NULL,
  width = NA,
  height = NA,
  units = c("cm", "in", "mm"),
  plot_title = "Sample correlation distribution",
  plot_param = "batch_replicate",
  theme = "classic"
)

plot_sample_corr_distribution.corrDF(
  corr_distribution,
  filename = NULL,
  width = NA,
  height = NA,
  units = c("cm", "in", "mm"),
  plot_title = "Sample correlation distribution",
  plot_param = "batch_replicate",
  theme = "classic"
)

Arguments

data_matrix

features (in rows) vs samples (in columns) matrix, with feature IDs in rownames and file/sample names as colnames. See "example_proteome_matrix" for more details (to call the description, use help("example_proteome_matrix"))

sample_annotation

data frame with:

  1. sample_id_col (this can be repeated as row names)

  2. biological covariates

  3. technical covariates (batches etc)

See help("example_sample_annotation")

repeated_samples

if NULL, correlation of all samples is plotted

sample_id_col

name of the column in the sample_annotation table where the filenames (colnames of the data_matrix) are found.
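
Before plotting, it can help to confirm that the matrix column names and the sample_id_col values actually agree. The following is a minimal base R sketch on hypothetical toy data (toy_matrix and toy_annotation are invented for illustration and are not part of proBatch):

```r
# Hypothetical toy inputs: 2 features x 3 samples, plus a matching annotation
toy_matrix <- matrix(
  1:6, nrow = 2,
  dimnames = list(c("pep1", "pep2"), c("run1", "run2", "run3"))
)
toy_annotation <- data.frame(
  FullRunName = c("run1", "run2", "run3"),
  MS_batch    = c("B1", "B1", "B2"),
  stringsAsFactors = FALSE
)
sample_id_col <- "FullRunName"

# Samples present in one object but absent from the other
missing_in_annotation <- setdiff(colnames(toy_matrix),
                                 toy_annotation[[sample_id_col]])
missing_in_matrix <- setdiff(toy_annotation[[sample_id_col]],
                             colnames(toy_matrix))

# TRUE only when both objects describe exactly the same set of samples
ids_match <- length(missing_in_annotation) == 0 &&
  length(missing_in_matrix) == 0
```

If ids_match is FALSE, inspecting missing_in_annotation and missing_in_matrix shows which sample names need to be reconciled.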

batch_col

column in sample_annotation that should be used for batch comparison (or other, non-batch factor to be mapped to color in plots).

biospecimen_id_col

column in sample_annotation that identifies the biological sample, which may have been profiled several times as technical replicates. Tip: if no such ID exists but it can be derived from several columns, create a new biospecimen_id column.
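
As an illustration of the tip above, a composite ID can be built by pasting columns together. This is a sketch only: the Strain and AnimalNo columns are hypothetical and stand in for whatever columns jointly identify a biological sample in your annotation.

```r
# Hypothetical annotation in which no single column identifies the biospecimen
annotation <- data.frame(
  FullRunName = c("run1", "run2", "run3", "run4"),
  Strain      = c("S1", "S1", "S2", "S2"),
  AnimalNo    = c(1, 1, 1, 2),
  stringsAsFactors = FALSE
)

# Combine the identifying columns into a single biospecimen ID
annotation$biospecimen_id <- paste(annotation$Strain,
                                   annotation$AnimalNo,
                                   sep = "_")
```

The new column name can then be passed as biospecimen_id_col = "biospecimen_id"; here run1 and run2 share an ID and would be treated as replicates of the same biological sample.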

filename

path where the results are saved. If NULL, the object is returned to the active window; otherwise, the object is saved to the file. Currently only the pdf and png formats are supported.

width

option determining the output image width

height

option determining the output image height

units

units: 'cm', 'in' or 'mm'

plot_title

title of the plot (e.g., processing step + representation level (fragments, transitions, proteins) + purpose (meanplot/corrplot etc))

plot_param

columns defined in the correlation data frame (the output of calculate_sample_corr_distr), specifically:

  1. replicate

  2. batch_the_same

  3. batch_replicate

  4. batches
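
To make the four groupings above concrete, the sketch below labels every pair of samples in a toy data set using base R only. It is not the proBatch implementation: the toy data, the pair_df name, and the exact label strings are assumptions for illustration, and the values actually produced by calculate_sample_corr_distr may be formatted differently.

```r
# Hypothetical toy data: 5 features x 4 samples
set.seed(1)
toy_matrix <- matrix(
  rnorm(20), nrow = 5,
  dimnames = list(paste0("f", 1:5), paste0("s", 1:4))
)
toy_annotation <- data.frame(
  FullRunName = paste0("s", 1:4),
  MS_batch    = c("B1", "B1", "B2", "B2"),
  EarTag      = c("A", "B", "A", "C"),
  stringsAsFactors = FALSE
)

# Sample-by-sample correlation matrix
corr_matrix <- cor(toy_matrix, use = "pairwise.complete.obs")

# Each unordered pair of samples, listed once
pairs <- t(combn(colnames(toy_matrix), 2))
idx_1 <- match(pairs[, 1], toy_annotation$FullRunName)
idx_2 <- match(pairs[, 2], toy_annotation$FullRunName)

# Pair-level flags: same biospecimen? same batch?
replicate      <- toy_annotation$EarTag[idx_1] == toy_annotation$EarTag[idx_2]
batch_the_same <- toy_annotation$MS_batch[idx_1] == toy_annotation$MS_batch[idx_2]

pair_df <- data.frame(
  sample_1        = pairs[, 1],
  sample_2        = pairs[, 2],
  correlation     = corr_matrix[pairs],  # look up by (row name, column name)
  replicate       = replicate,
  batch_the_same  = batch_the_same,
  # Illustrative combined label (the real label format is an assumption)
  batch_replicate = paste(
    ifelse(batch_the_same, "same_batch", "diff_batch"),
    ifelse(replicate, "same_biospecimen", "diff_biospecimen"),
    sep = "_"
  ),
  # Which two batches the pair spans
  batches         = paste(toy_annotation$MS_batch[idx_1],
                          toy_annotation$MS_batch[idx_2], sep = ":"),
  stringsAsFactors = FALSE
)
```

Each column of pair_df corresponds to one way of splitting the correlation distribution; choosing a plot_param amounts to choosing which of these groupings goes on the x axis of the violin plot.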

theme

ggplot theme, classic by default. Can be easily overridden.

corr_distribution

data frame with correlation distribution, as returned by calculate_sample_corr_distr

Value

ggplot type object with violin plot for each plot_param

See Also

calculate_sample_corr_distr, ggplot

Examples

# Plot directly from the data matrix and the sample annotation
sample_corr_distribution_plot <- plot_sample_corr_distribution(
  example_proteome_matrix,
  example_sample_annotation,
  batch_col = "MS_batch",
  biospecimen_id_col = "EarTag",
  plot_param = "batch_replicate"
)

# Alternatively, compute the correlation distribution first, then plot it
corr_distribution <- calculate_sample_corr_distr(
  data_matrix = example_proteome_matrix,
  sample_annotation = example_sample_annotation,
  batch_col = "MS_batch",
  biospecimen_id_col = "EarTag"
)
sample_corr_distribution_plot <- plot_sample_corr_distribution.corrDF(
  corr_distribution,
  plot_param = "batch_replicate"
)

## Not run: 
# Save the plot to a file
sample_corr_distribution_plot <- plot_sample_corr_distribution.corrDF(
  corr_distribution,
  plot_param = "batch_replicate",
  filename = "test_sampleCorr.png",
  width = 28, height = 28, units = "cm"
)

## End(Not run)

proBatch documentation built on Nov. 8, 2020, 4:55 p.m.