plot_sample_mean_or_boxplot: Plot per-sample mean or boxplots for initial assessment
In proBatch: Tools for Diagnostics and Corrections of Batch Effects in Proteomics


Plot per-sample mean or boxplots (showing median and quantiles). In ordered samples, e.g. consecutive MS runs, order-associated effects are visualised.
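As a rough illustration of the quantity shown in the mean plot, the sketch below computes per-sample means from a toy matrix and arranges them by run order using base R only. The matrix values, run names and the `mean_intensity` column name are made up for this example, and the code is not taken from the package implementation.

```r
# Toy log-intensity matrix: features in rows, samples in columns (made-up values)
data_matrix <- matrix(
  c(10, 12, 11, 13,
    20, 22, 21, 23,
    30, 32, 31, 33),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("pep_1", "pep_2", "pep_3"),
                  c("run_B", "run_D", "run_A", "run_C"))
)

# Toy annotation using the default column names from the usage section
sample_annotation <- data.frame(
  FullRunName = c("run_A", "run_B", "run_C", "run_D"),
  MS_batch = c("Batch_2", "Batch_1", "Batch_2", "Batch_1"),
  order = c(3, 1, 4, 2),
  stringsAsFactors = FALSE
)

# One mean per sample (column), ignoring missing values
sample_means <- colMeans(data_matrix, na.rm = TRUE)
mean_df <- data.frame(
  FullRunName = names(sample_means),
  mean_intensity = unname(sample_means),  # hypothetical column name
  stringsAsFactors = FALSE
)

# Attach the annotation and sort by run order, as on the x-axis of an ordered plot
mean_df <- merge(mean_df, sample_annotation, by = "FullRunName")
mean_df <- mean_df[order(mean_df$order), ]
print(mean_df)
```

A step in `mean_intensity` that coincides with a change in `MS_batch` is the kind of order-associated effect these plots are meant to reveal.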

plot_sample_mean(
  data_matrix,
  sample_annotation = NULL,
  sample_id_col = "FullRunName",
  batch_col = "MS_batch",
  color_by_batch = FALSE,
  color_scheme = "brewer",
  order_col = "order",
  vline_color = "grey",
  facet_col = NULL,
  filename = NULL,
  width = NA,
  height = NA,
  units = c("cm", "in", "mm"),
  plot_title = NULL,
  theme = "classic",
  ylimits = NULL
)

plot_boxplot(
  df_long,
  sample_annotation = NULL,
  sample_id_col = "FullRunName",
  measure_col = "Intensity",
  batch_col = "MS_batch",
  color_by_batch = TRUE,
  color_scheme = "brewer",
  order_col = "order",
  facet_col = NULL,
  filename = NULL,
  width = NA,
  height = NA,
  units = c("cm", "in", "mm"),
  plot_title = NULL,
  theme = "classic",
  ylimits = NULL,
  outliers = TRUE
)

`data_matrix`	features (in rows) vs samples (in columns) matrix, with feature IDs in rownames and file/sample names as colnames. See "example_proteome_matrix" for more details (to call the description, use `help("example_proteome_matrix")`)
`sample_annotation`	data frame with: `sample_id_col` (this can be repeated as row names); biological covariates; technical covariates (batches etc.). See `help("example_sample_annotation")`
`sample_id_col`	name of the column in `sample_annotation` table, where the filenames (colnames of the `data_matrix`) are found.
`batch_col`	column in `sample_annotation` that should be used for batch comparison (or other, non-batch factor to be mapped to color in plots).
`color_by_batch`	(logical) whether to color points and connecting lines by batch factor as defined by `batch_col`.
`color_scheme`	named vector, names corresponding to unique batch values of `batch_col` in `sample_annotation`. Best created with sample_annotation_to_colors
`order_col`	column in `sample_annotation` that determines sample order. It is used in initial assessment plots (plot_sample_mean_or_boxplot) and feature-level diagnostics (feature_level_diagnostics). Can be 'NULL' if sample order is irrelevant (e.g. in genomic experiments). For more details on order definition/inference, see define_sample_order and date_to_sample_order
`vline_color`	color of vertical lines, typically denoting different MS batches in ordered runs; should be `NULL` for experiments without intrinsic order
`facet_col`	column in `sample_annotation` with a batch factor to separate plots into facets; usually 2nd to `batch_col`. Most meaningful for multi-instrument MS experiments (where each instrument has its own order-associated effects, see `order_col`) or for simultaneous examination of two batch factors (e.g. preparation day and measurement day). For the single-instrument case, it should be set to 'NULL'
`filename`	path where the results are saved. If null, the object is returned to the active window; otherwise, the object is saved into the file. Currently only pdf and png formats are supported
`width`	option determining the output image width
`height`	option determining the output image height
`units`	units: 'cm', 'in' or 'mm'
`plot_title`	title of the plot (e.g., processing step + representation level (fragments, transitions, proteins) + purpose (meanplot/corrplot etc))
`theme`	ggplot theme, by default `classic`. Can be easily overridden
`ylimits`	range of y-axis to compare two plots side by side, if required.
`df_long`	data frame where each row is a single feature in a single sample. It minimally has a `sample_id_col`, a `feature_id_col` and a `measure_col`, but usually also an `m_score` (in OpenSWATH output result file). See `help("example_proteome")` for more details.
`measure_col`	if `df_long` is among the parameters, it is the column with expression/abundance/intensity; otherwise, it is used internally for consistency.
`outliers`	keep (default) or remove the boxplot outliers
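
To make the relation between the two input shapes concrete, here is a minimal base R sketch that reshapes a toy `data_matrix` into the long layout described for `df_long`. The feature ID column name `peptide_group_label` and all values are assumptions chosen for illustration; only `FullRunName` and `Intensity` come from the defaults in the usage section.

```r
# Toy matrix: 2 features (rows) x 2 samples (columns), made-up values
data_matrix <- matrix(
  c(10, 20, 11, 21),
  nrow = 2,
  dimnames = list(c("pep_1", "pep_2"), c("run_A", "run_B"))
)

# Long format: one row per feature per sample.
# as.vector() walks the matrix column by column, so features cycle fastest.
df_long <- data.frame(
  peptide_group_label = rep(rownames(data_matrix), times = ncol(data_matrix)),  # assumed column name
  FullRunName = rep(colnames(data_matrix), each = nrow(data_matrix)),
  Intensity = as.vector(data_matrix),
  stringsAsFactors = FALSE
)
print(df_long)

# Per-sample median, i.e. the center line a boxplot would draw for each sample
sample_medians <- tapply(df_long$Intensity, df_long$FullRunName, median)
print(sample_medians)
```

Each column of the matrix becomes a block of rows in the long table, which is the layout a per-sample boxplot groups on.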

Functions for quick visual assessment of trends, either overall or associated with a specific covariate (see `batch_col` and `facet_col`).

ggplot2 class object. Thus, all aesthetics can be overridden
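
Because the return value is a ggplot object, it can be modified with ordinary ggplot2 layers. The sketch below uses a stand-in plot built from toy data rather than real proBatch output, so the data frame, column names and colors are assumptions; the same `+` syntax should apply to the object returned by either function.

```r
library(ggplot2)

# Stand-in for a returned plot object; toy data, not proBatch output
toy <- data.frame(
  order = 1:6,
  mean_intensity = c(10.2, 10.1, 9.9, 10.8, 10.7, 10.6),
  MS_batch = rep(c("Batch_1", "Batch_2"), each = 3)
)
p <- ggplot(toy, aes(x = order, y = mean_intensity, color = MS_batch)) +
  geom_point() +
  theme_classic()

# Scales, labels and theme settings added afterwards replace the earlier ones
p2 <- p +
  scale_color_manual(values = c(Batch_1 = "steelblue", Batch_2 = "firebrick")) +
  labs(title = "Per-sample mean (toy data)", y = "Mean log intensity") +
  theme(legend.position = "bottom")
```

Printing `p2` draws the modified plot; the original object `p` is left unchanged.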

ggplot, date_to_sample_order

library(proBatch)

# Per-sample mean plot with samples arranged by run order
mean_plot <- plot_sample_mean(example_proteome_matrix, example_sample_annotation,
                              order_col = "order", batch_col = "MS_batch")

# Color samples by batch using a color scheme derived from the annotation
color_list <- sample_annotation_to_colors(example_sample_annotation,
                                          factor_columns = c("MS_batch"),
                                          numeric_columns = c("DateTime", "order"))
plot_sample_mean(example_proteome_matrix, example_sample_annotation,
                 order_col = "order", batch_col = "MS_batch", color_by_batch = TRUE,
                 color_scheme = color_list[["MS_batch"]])

## Not run: 
mean_plot <- plot_sample_mean(example_proteome_matrix, 
                              example_sample_annotation, 
                              order_col = 'order', batch_col = "MS_batch", 
                              filename = 'test_meanplot.png', 
                              width = 28, height = 18, units = 'cm')

## End(Not run)

# Per-sample boxplots of log-transformed intensities, colored by batch
boxplot <- plot_boxplot(log_transform_df(example_proteome),
                        sample_annotation = example_sample_annotation,
                        batch_col = "MS_batch")

# Same plot with a color scheme derived from the annotation
color_list <- sample_annotation_to_colors(example_sample_annotation,
                                          factor_columns = c("MS_batch"),
                                          numeric_columns = c("DateTime", "order"))
plot_boxplot(log_transform_df(example_proteome),
             sample_annotation = example_sample_annotation,
             batch_col = "MS_batch", color_scheme = color_list[["MS_batch"]])

## Not run: 
boxplot <- plot_boxplot(log_transform_df(example_proteome),
                        sample_annotation = example_sample_annotation,
                        batch_col = "MS_batch", filename = "test_boxplot.png",
                        width = 14, height = 9, units = "in")

## End(Not run)