plot_sample_mean_or_boxplot: Plot per-sample mean or boxplots for initial assessment

Description

Plot per-sample mean or boxplots (showing median and quantiles). In ordered samples, e.g. consecutive MS runs, order-associated effects are visualised.

Usage

plot_sample_mean(
  data_matrix,
  sample_annotation = NULL,
  sample_id_col = "FullRunName",
  batch_col = "MS_batch",
  color_by_batch = FALSE,
  color_scheme = "brewer",
  order_col = "order",
  vline_color = "grey",
  facet_col = NULL,
  filename = NULL,
  width = NA,
  height = NA,
  units = c("cm", "in", "mm"),
  plot_title = NULL,
  theme = "classic",
  ylimits = NULL
)

plot_boxplot(
  df_long,
  sample_annotation = NULL,
  sample_id_col = "FullRunName",
  measure_col = "Intensity",
  batch_col = "MS_batch",
  color_by_batch = TRUE,
  color_scheme = "brewer",
  order_col = "order",
  facet_col = NULL,
  filename = NULL,
  width = NA,
  height = NA,
  units = c("cm", "in", "mm"),
  plot_title = NULL,
  theme = "classic",
  ylimits = NULL,
  outliers = TRUE
)

Arguments

data_matrix

features (in rows) vs samples (in columns) matrix, with feature IDs in rownames and file/sample names as colnames. See "example_proteome_matrix" for more details (to call the description, use help("example_proteome_matrix"))

sample_annotation

data frame with:

  1. sample_id_col (this can be repeated as row names)

  2. biological covariates

  3. technical covariates (batches etc.)

See help("example_sample_annotation").

sample_id_col

name of the column in the sample_annotation table where the filenames (colnames of the data_matrix) are found.

batch_col

column in sample_annotation that should be used for batch comparison (or other, non-batch factor to be mapped to color in plots).

color_by_batch

(logical) whether to color points and connecting lines by batch factor as defined by batch_col.

color_scheme

named vector, with names corresponding to the unique values of batch_col in sample_annotation. Best created with sample_annotation_to_colors.

order_col

column in sample_annotation that determines sample order. It is used in initial assessment plots (plot_sample_mean_or_boxplot) and feature-level diagnostics (feature_level_diagnostics). Can be 'NULL' if sample order is irrelevant (e.g. in genomic experiments). For more details on order definition/inference, see define_sample_order and date_to_sample_order.

vline_color

color of vertical lines, typically denoting different MS batches in ordered runs; should be NULL for experiments without intrinsic order

facet_col

column in sample_annotation with a batch factor to separate plots into facets; usually secondary to batch_col. Most meaningful for multi-instrument MS experiments (where each instrument has its own order-associated effects, see order_col) or for simultaneous examination of two batch factors (e.g. preparation day and measurement day). For the single-instrument case it should be set to 'NULL'.

filename

path where the results are saved. If NULL, the object is returned to the active window; otherwise, the object is saved into the file. Currently only pdf and png formats are supported.

width

option determining the output image width

height

option determining the output image height

units

units of width and height: 'cm', 'in' or 'mm'

plot_title

title of the plot (e.g., processing step + representation level (fragments, transitions, proteins) + purpose (meanplot/corrplot etc))

theme

ggplot theme, by default classic. Can be easily overridden.

ylimits

range of y-axis to compare two plots side by side, if required.

df_long

data frame where each row is a single feature in a single sample. It minimally has a sample_id_col, a feature_id_col and a measure_col, but usually also an m_score (in OpenSWATH output result file). See help("example_proteome") for more details.

measure_col

if df_long is among the parameters, it is the column with expression/abundance/intensity; otherwise, it is used internally for consistency.

outliers

(logical) whether to keep (TRUE, default) or remove (FALSE) the boxplot outliers
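
The two input shapes described above (data_matrix for plot_sample_mean, df_long for plot_boxplot) can be illustrated with a small base R sketch. All object names, sample names and values below are hypothetical toy data, and the feature ID column name "feature_id" is an assumption chosen for illustration; only "FullRunName" and "Intensity" are taken from the default argument values in Usage.

```r
# Toy wide matrix: features in rows, samples in columns (hypothetical values).
toy_matrix <- matrix(
  c(10.1, 12.0,   # run_01: peptide_A, peptide_B
    11.3, 12.4,   # run_02
     9.8,   NA),  # run_03 (peptide_B missing)
  nrow = 2,
  dimnames = list(c("peptide_A", "peptide_B"),
                  c("run_01", "run_02", "run_03"))
)

# Equivalent long form: one row per feature per sample.
# "FullRunName" and "Intensity" are the defaults of sample_id_col and
# measure_col; "feature_id" is a hypothetical feature ID column name.
toy_long <- data.frame(
  feature_id  = rep(rownames(toy_matrix), times = ncol(toy_matrix)),
  FullRunName = rep(colnames(toy_matrix), each = nrow(toy_matrix)),
  Intensity   = as.vector(toy_matrix),  # column-major, matches the rep() calls
  stringsAsFactors = FALSE
)

toy_long
```

The long table has one row for each cell of the matrix, including the missing measurement.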

Details

Functions for quick visual assessment of trends, either overall or associated with specific covariates (see batch_col and facet_col).
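
As a rough illustration of the quantities these plots summarise, the base R sketch below computes per-sample means and per-sample quartiles on hypothetical toy data. It assumes missing values are ignored; it is not the package's internal code, and the matrix values and names are invented for the example.

```r
# Toy matrix with an upward drift across consecutive runs (hypothetical values).
toy_matrix <- matrix(
  c(10, 12, 14,   # run_01
    11, 13, 15,   # run_02
    12, NA, 16,   # run_03 (one missing value)
    13, 15, 17),  # run_04
  nrow = 3,
  dimnames = list(c("feat_1", "feat_2", "feat_3"),
                  c("run_01", "run_02", "run_03", "run_04"))
)

# Per-sample mean: one value per column, NAs ignored (assumption).
sample_means <- colMeans(toy_matrix, na.rm = TRUE)

# Per-sample quartiles: the median and box edges a boxplot would display.
box_stats <- apply(toy_matrix, 2, quantile,
                   probs = c(0.25, 0.5, 0.75), na.rm = TRUE)

sample_means
box_stats
```

Plotted against run order, a steadily increasing mean like this one is the kind of order-associated effect the functions are meant to reveal.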

Value

ggplot2 class object. Thus, all aesthetics can be overridden
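
Because the return value is a ggplot2 object, further layers can be added with `+`. The sketch below assumes that proBatch and ggplot2 are installed and that the example data sets become available after attaching proBatch; the theme and axis label are arbitrary illustrative choices.

```r
# Run only when both packages are available (assumption: they are installed).
if (requireNamespace("proBatch", quietly = TRUE) &&
    requireNamespace("ggplot2", quietly = TRUE)) {
  library(proBatch)

  # Same call as in the Examples section.
  mean_plot <- plot_sample_mean(example_proteome_matrix,
                                example_sample_annotation,
                                order_col = "order", batch_col = "MS_batch")

  # Override the theme and y-axis label on the returned object.
  custom_plot <- mean_plot +
    ggplot2::theme_bw() +
    ggplot2::labs(y = "Mean intensity")
}
```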

See Also

ggplot, date_to_sample_order

Examples

# Per-sample mean plot in run order (not colored by batch by default)
mean_plot <- plot_sample_mean(example_proteome_matrix,
                              example_sample_annotation,
                              order_col = 'order', batch_col = "MS_batch")

# Color by batch, using a color scheme derived from the sample annotation
color_list <- sample_annotation_to_colors(example_sample_annotation,
                                          factor_columns = c('MS_batch'),
                                          numeric_columns = c('DateTime', 'order'))
plot_sample_mean(example_proteome_matrix, example_sample_annotation,
                 order_col = 'order', batch_col = "MS_batch",
                 color_by_batch = TRUE,
                 color_scheme = color_list[["MS_batch"]])

## Not run: 
# Save the mean plot to a file
mean_plot <- plot_sample_mean(example_proteome_matrix, 
                              example_sample_annotation, 
                              order_col = 'order', batch_col = "MS_batch", 
                              filename = 'test_meanplot.png', 
                              width = 28, height = 18, units = 'cm')

## End(Not run)

# Per-sample boxplots of log-transformed intensities (long-format input)
boxplot <- plot_boxplot(log_transform_df(example_proteome),
                        sample_annotation = example_sample_annotation,
                        batch_col = "MS_batch")

# Boxplots with an explicit color scheme
color_list <- sample_annotation_to_colors(example_sample_annotation,
                                          factor_columns = c('MS_batch'),
                                          numeric_columns = c('DateTime', 'order'))
plot_boxplot(log_transform_df(example_proteome),
             sample_annotation = example_sample_annotation,
             batch_col = "MS_batch",
             color_scheme = color_list[["MS_batch"]])

## Not run: 
# Save the boxplot to a file
boxplot <- plot_boxplot(log_transform_df(example_proteome),
                        sample_annotation = example_sample_annotation,
                        batch_col = "MS_batch",
                        filename = 'test_boxplot.png',
                        width = 14, height = 9, units = 'in')

## End(Not run)

proBatch documentation built on Nov. 8, 2020, 4:55 p.m.