prnMDS: MDS plots
In qzhang503/proteoQ: Processing and Informatic Analysis of Mass Spectrometric Data

pepMDS

R Documentation

MDS plots

Description

pepMDS visualizes the multidimensional scaling (MDS) of peptide log2FC.

prnMDS visualizes the multidimensional scaling (MDS) of protein log2FC.

Usage

pepMDS(
  col_select = NULL,
  col_group = NULL,
  col_color = NULL,
  col_fill = NULL,
  col_shape = NULL,
  col_size = NULL,
  col_alpha = NULL,
  color_brewer = NULL,
  fill_brewer = NULL,
  size_manual = NULL,
  shape_manual = NULL,
  alpha_manual = NULL,
  choice = c("cmdscale", "isoMDS"),
  scale_log2r = TRUE,
  complete_cases = FALSE,
  impute_na = FALSE,
  dist_co = log2(1),
  adjEucDist = FALSE,
  method = "euclidean",
  p = 2,
  k = 3,
  dimension = 2,
  folds = 1,
  center_features = TRUE,
  scale_features = TRUE,
  show_ids = TRUE,
  show_ellipses = FALSE,
  df = NULL,
  filepath = NULL,
  filename = NULL,
  theme = NULL,
  ...
)

prnMDS(
  col_select = NULL,
  col_group = NULL,
  col_color = NULL,
  col_fill = NULL,
  col_shape = NULL,
  col_size = NULL,
  col_alpha = NULL,
  color_brewer = NULL,
  fill_brewer = NULL,
  size_manual = NULL,
  shape_manual = NULL,
  alpha_manual = NULL,
  choice = c("cmdscale", "isoMDS"),
  scale_log2r = TRUE,
  complete_cases = FALSE,
  impute_na = FALSE,
  dist_co = log2(1),
  adjEucDist = FALSE,
  method = "euclidean",
  p = 2,
  k = 3,
  dimension = 2,
  folds = 1,
  center_features = TRUE,
  scale_features = TRUE,
  show_ids = TRUE,
  show_ellipses = FALSE,
  df = NULL,
  filepath = NULL,
  filename = NULL,
  theme = NULL,
  ...
)

Arguments

`col_select`	Character string to a column key in `expt_smry.xlsx`. At the `NULL` default, the column key of `Select` in `expt_smry.xlsx` will be used. In the case of no samples being specified under `Select`, the column key of `Sample_ID` will be used. Samples with non-empty entries under the selected column will be used in the indicated analysis.
`col_group`	Character string to a column key in `expt_smry.xlsx`. Samples corresponding to non-empty entries under `col_group` will be used for sample grouping in the indicated analysis. At the NULL default, the column key `Group` will be used. No data annotation by groups will be performed if the fields under the indicated group column are empty.
`col_color`	Character string to a column key in `expt_smry.xlsx`. The values under that column will be used for the `color` aesthetics in plots. At the NULL default, the column key `Color` will be used. At NA, the aesthetic is bypassed (a means to skip the look-up of column `Color` and to handle duplication in aesthetics).
`col_fill`	Character string to a column key in `expt_smry.xlsx`. The values under that column will be used for the `fill` aesthetics in plots. At the NULL default, the column key `Fill` will be used. At NA, the aesthetic is bypassed (a means to skip the look-up of column `Fill` and to handle duplication in aesthetics).
`col_shape`	Character string to a column key in `expt_smry.xlsx`. The values under that column will be used for the `shape` aesthetics in plots. At the NULL default, the column key `Shape` will be used. At NA, the aesthetic is bypassed (a means to skip the look-up of column `Shape` and to handle duplication in aesthetics).
`col_size`	Character string to a column key in `expt_smry.xlsx`. The values under that column will be used for the `size` aesthetics in plots. At the NULL default, the column key `Size` will be used. At NA, the aesthetic is bypassed (a means to skip the look-up of column `Size` and to handle duplication in aesthetics).
`col_alpha`	Character string to a column key in `expt_smry.xlsx`. The values under that column will be used for the `alpha` (transparency) aesthetics in plots. At the NULL default, the column key `Alpha` will be used. At NA, the aesthetic is bypassed (a means to skip the look-up of column `Alpha` and to handle duplication in aesthetics).
`color_brewer`	Character string to the name of a color brewer for use in ggplot2::scale_color_brewer, e.g., `color_brewer = Set1`. At the NULL default, the setting in `ggplot2` will be used.
`fill_brewer`	Character string to the name of a color brewer for use in ggplot2::scale_fill_brewer, e.g., `fill_brewer = Spectral`. At the NULL default, the setting in `ggplot2` will be used.
`size_manual`	Numeric vector to the scale of sizes for use in ggplot2::scale_size_manual, e.g., `size_manual = c(8, 12)`. At the NULL default, the setting in `ggplot2` will be used.
`shape_manual`	Numeric vector to the scale of shape IDs for use in ggplot2::scale_shape_manual, e.g., `shape_manual = c(5, 15)`. At the NULL default, the setting in `ggplot2` will be used.
`alpha_manual`	Numeric vector to the scale of transparency of objects for use in ggplot2::scale_alpha_manual, e.g., `alpha_manual = c(.5, .9)`. At the NULL default, the setting in `ggplot2` will be used.
`choice`	Character string; the MDS method in `c("cmdscale", "isoMDS")`. The default is "cmdscale".
`scale_log2r`	Logical; if TRUE, adjusts `log2FC` to the same scale of standard deviation across all samples. The default is TRUE. At `scale_log2r = NA`, the raw `log2FC` without normalization will be used.
`complete_cases`	Logical; if TRUE, only cases that are complete with no missing values will be used. The default is FALSE.
`impute_na`	Logical; if TRUE, data with the imputation of missing values will be used. The default is FALSE.
`dist_co`	Numeric; the cut-off in the absolute distance measured by `d = abs(x_i - x_j)`. Data pairs, `x_i` and `x_j`, with corresponding `d` smaller than `dist_co` will be excluded from distance calculations by `dist`. The default is no distance cut-off at `dist_co = log2(1)`.
`adjEucDist`	Logical; if TRUE, adjusts the inter-plex `Euclidean` distance by `1/sqrt(2)` at `method = "euclidean"`. The option `adjEucDist = TRUE` may be suitable when `reference samples` from each TMT plex undergo approximately the same sample handling process as the samples of interest, for instance, when `reference samples` were split at the level of protein lysates. Typically, `adjEucDist = FALSE` if `reference samples` were split near the end of a sample handling process, for instance, at the stages immediately before or after TMT labeling. See also the MDS section of the online README for a brief rationale.
`method`	Character string; the distance measure in one of c("euclidean", "maximum", "manhattan", "canberra", "binary") for `dist`. The default method is "euclidean".
`p`	Numeric; The power of the Minkowski distance in `dist`. The default is 2.
`k`	Numeric; The desired dimension for the solution passed to `cmdscale`. The default is 3.
`dimension`	Numeric; The desired dimension for pairwise visualization. The default is 2.
`folds`	Not currently used. Integer; the degree of folding data into subsets. The default is one without data folding.
`center_features`	Logical; if TRUE, centers log2FC at zero by features (proteins or peptides). The default is TRUE. Note the difference from data alignment with `method_align` in `standPrn` or `standPep`, where log2FC are aligned by observations (samples).
`scale_features`	Logical; if TRUE, adjusts log2FC to the same scale of variance by features (protein or peptide entries). The default is TRUE. Note the difference from data scaling with `scale_log2r`, where log2FC are scaled by observations (samples).
`show_ids`	Logical; if TRUE, shows the sample IDs in `MDS/PCA` plots. The default is TRUE.
`show_ellipses`	Logical; if TRUE, shows the ellipses by sample groups according to `col_group`. The default is FALSE.
`df`	The name of a primary data file. By default, it will be determined automatically after matching the types of data and analysis with an `id` among `c("pep_seq", "pep_seq_mod", "prot_acc", "gene")`. A primary file contains normalized peptide or protein data and is among `c("Peptide.txt", "Peptide_pVal.txt", "Peptide_impNA_pVal.txt", "Protein.txt", "Protein_pVal.txt", "protein_impNA_pVal.txt")`. For analyses that require the fields of significance p-values, the `df` will be one of `c("Peptide_pVal.txt", "Peptide_impNA_pVal.txt", "Protein_pVal.txt", "protein_impNA_pVal.txt")`.
`filepath`	A file path to output results. By default, it will be determined automatically by the name of the calling function and the value of `id` in the `call`.
`filename`	A representative file name for outputs. By default, the name(s) will be determined automatically. For text files, a typical file extension is `.txt`. Image files are typically saved via `ggsave` or `pheatmap`, where the image type is determined by the extension of the file name.
`theme`	A ggplot2 theme, e.g., theme_bw(), or a custom theme. At the NULL default, a system theme will be applied.
`...`	`filter_`: Variable argument statements for the row filtration against data in a primary file linked to `df`. See also `normPSM` for the format of `filter_` statements. Additional parameters for `ggsave`: `width`, the width of plot; `height`, the height of plot; `...`
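
The descriptions of `dist_co` and `adjEucDist` can be read as the following minimal sketch. Both helpers, `dist_with_cutoff` and `adjust_interplex`, are hypothetical names introduced here for illustration and are not part of proteoQ; the sketch follows the wording of the argument descriptions literally and does not claim to reproduce the package's internal code (for instance, it ignores any rescaling that `dist` applies when values are excluded).

```r
# Hypothetical helper (not a proteoQ function): Euclidean distance between
# two samples, keeping only features with abs(x_i - x_j) >= dist_co.
dist_with_cutoff <- function(x_i, x_j, dist_co = log2(1)) {
  d <- abs(x_i - x_j)
  keep <- !is.na(d) & d >= dist_co
  sqrt(sum(d[keep]^2))
}

# Hypothetical helper (not a proteoQ function): scale the distances between
# samples from different TMT plexes by 1/sqrt(2); intra-plex distances are
# left unchanged.
adjust_interplex <- function(dmat, plex) {
  inter <- outer(plex, plex, FUN = "!=")
  dmat[inter] <- dmat[inter] / sqrt(2)
  dmat
}

# Toy log2FC values for two samples over four features
x_i <- c(0.1, 2.5, -3.0, 0.4)
x_j <- c(0.0, 0.2,  0.5, 0.3)

d_all   <- dist_with_cutoff(x_i, x_j)                     # no cut-off
d_large <- dist_with_cutoff(x_i, x_j, dist_co = log2(4))  # only |diff| >= 2

# Toy distance matrix for three samples; `a` and `b` share a plex
dmat <- as.matrix(dist(rbind(a = c(0, 0), b = c(3, 4), c = c(6, 8))))
dmat_adj <- adjust_interplex(dmat, plex = c(1, 1, 2))
```

With `dist_co = log2(4)`, only the two features that differ by at least two log2 units contribute to the distance, which is the idea behind using a cut-off to locate samples that differ in large fold changes.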

Details

A Euclidean distance matrix of log2FC is returned by dist, followed by a metric (cmdscale) or non-metric (isoMDS) MDS. The default is metric MDS with Euclidean distances as the input dissimilarities. Note that center_features alone will not affect the Euclidean distances from dist: centering subtracts the same value from every sample within a feature, which leaves the differences between samples unchanged. It is passed to scale together with scale_features.
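
The sequence described above can be sketched with base R and `MASS` on synthetic data. The matrix `x` and all object names below are assumptions made for illustration; this is a sketch of the general `scale`, `dist`, `cmdscale`/`isoMDS` sequence, not the code that pepMDS or prnMDS run internally.

```r
# Synthetic stand-in for a log2FC matrix: rows are features, columns are samples
set.seed(1)
x <- matrix(rnorm(200 * 6), nrow = 200,
            dimnames = list(NULL, paste0("S", 1:6)))

# Center and scale by features; scale() works on columns, hence the transposes
x_cs <- t(scale(t(x), center = TRUE, scale = TRUE))

# Euclidean distances between samples; dist() works on rows, hence t()
d <- dist(t(x_cs), method = "euclidean")

# Metric MDS with a three-dimensional solution
mds_metric <- cmdscale(d, k = 3)

# Non-metric MDS (MASS ships with standard R distributions)
mds_nonmetric <- MASS::isoMDS(d, k = 3, trace = FALSE)$points

# Centering features alone leaves the sample-to-sample distances unchanged
x_c <- t(scale(t(x), center = TRUE, scale = FALSE))
same_dist <- isTRUE(all.equal(as.vector(dist(t(x))), as.vector(dist(t(x_c)))))
```

Each row of `mds_metric` or `mds_nonmetric` holds the coordinates of one sample; plotting the first two columns against each other corresponds to a pairwise view at `dimension = 2`.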

Value

MDS plots.

Examples


# ===================================
# MDS
# ===================================

## !!!require the brief working example in `?load_expts`

# global option
scale_log2r <- TRUE

## peptides
# all samples
pepMDS(
  col_select = Select, 
  filter_peps_by = exprs(pep_n_psm >= 10),
  show_ids = FALSE, 
  filename = "peps_rowfil.png",
)

# selected samples
pepMDS(
  col_select = BI, 
  col_shape = Shape,   
  col_color = Alpha, 
  filter_peps_by = exprs(pep_n_psm >= 10),
  show_ids = FALSE, 
  filename = "peps_rowfil_colsel.png",
)

# column `Alpha` will be used at the default of
# `col_alpha = NULL`;
# To bypass the aesthetics under column `Alpha`, 
# use `col_alpha = NA`
# (the same applies to other aesthetics, and PCA and LDA)
pepMDS(
  col_select = Select, 
  col_alpha = NA, 
  filter_peps_by = exprs(pep_n_psm >= 10),
  show_ids = FALSE, 
  filename = "peps_rowfil_no_alpha.png",
)


## proteins
prnMDS(
  col_color = Color,
  col_shape = Shape,
  show_ids = FALSE,
  filter_peps_by = exprs(prot_n_pep >= 5),
  filename = "prns_rowfil.png",
)

# custom palette
prnMDS(
  col_shape = Shape,
  color_brewer = Set1,
  show_ids = FALSE,
  filename = "my_palette.png",
)

## additional row filtration by pVals (proteins, impute_na = FALSE)
# if not yet, run prerequisite significance tests at `impute_na = FALSE`
pepSig(
  impute_na = FALSE, 
  W2_bat = ~ Term["(W2.BI.TMT2-W2.BI.TMT1)", 
                  "(W2.JHU.TMT2-W2.JHU.TMT1)", 
                  "(W2.PNNL.TMT2-W2.PNNL.TMT1)"],
  W2_loc = ~ Term_2["W2.BI-W2.JHU", 
                    "W2.BI-W2.PNNL", 
                    "W2.JHU-W2.PNNL"],
  W16_vs_W2 = ~ Term_3["W16-W2"], 
)

prnSig(impute_na = FALSE)

# (`W16_vs_W2.pVal (W16-W2)` now a column key)
prnMDS(
  col_color = Color,
  col_shape = Shape,
  show_ids = FALSE,
  filter_peps_by = exprs(prot_n_pep >= 5),
  filter_by = exprs(`W16_vs_W2.pVal (W16-W2)` <= 1e-6), 
  filename = pvalcutoff.png, 
)

# analogous peptides
pepMDS(
  col_color = Color,
  col_shape = Shape,
  show_ids = FALSE,
  filter_peps_by = exprs(prot_n_pep >= 5),
  filter_by = exprs(`W16_vs_W2.pVal (W16-W2)` <= 1e-6), 
  filename = pvalcutoff.png, 
)

## additional row filtration by pVals (proteins, impute_na = TRUE)
# if not yet, run prerequisite NA imputation
pepImp(m = 2, maxit = 2)
prnImp(m = 5, maxit = 5)

# if not yet, run prerequisite significance tests at `impute_na = TRUE`
pepSig(
  impute_na = TRUE, 
  W2_bat = ~ Term["(W2.BI.TMT2-W2.BI.TMT1)", 
                  "(W2.JHU.TMT2-W2.JHU.TMT1)", 
                  "(W2.PNNL.TMT2-W2.PNNL.TMT1)"],
  W2_loc = ~ Term_2["W2.BI-W2.JHU", 
                    "W2.BI-W2.PNNL", 
                    "W2.JHU-W2.PNNL"],
  W16_vs_W2 = ~ Term_3["W16-W2"], 
)

prnSig(impute_na = TRUE)

prnMDS(
  impute_na = TRUE,
  col_color = Color,
  col_shape = Shape,
  show_ids = FALSE,
  filter_peps_by = exprs(prot_n_pep >= 5),
  filter_by = exprs(`W16_vs_W2.pVal (W16-W2)` <= 1e-6), 
  filename = filpvals_impna.png, 
)

# analogous peptides
pepMDS(
  impute_na = TRUE,
  col_color = Color,
  col_shape = Shape,
  show_ids = FALSE,
  filter_peps_by = exprs(prot_n_pep >= 5),
  filter_by = exprs(`W16_vs_W2.pVal (W16-W2)` <= 1e-6), 
  filename = filpvals_impna.png,
)

## show ellipses
prnMDS(
  show_ellipses = TRUE,
  col_group = Shape, 
  show_ids = FALSE,
  filename = ellipses_by_whims.png,
)

prnMDS(
  show_ellipses = TRUE,
  col_group = Color, 
  show_ids = FALSE,
  filename = ellipses_by_labs.png,
)

## a higher dimension
pepMDS(
  show_ids = FALSE,
  k = 5, 
  dimension = 3,
  filename = d3.pdf,
)

prnMDS(
  show_ids = TRUE,
  k = 4, 
  dimension = 3,
  filename = d3.png,
)

# show ellipses
# (column `expt_smry.xlsx::Color` codes `labs`.)
prnMDS(
  show_ids = FALSE,
  show_ellipses = TRUE,
  col_group = Color, 
  k = 4, 
  dimension = 3,
  filename = d3_labs.png,
)

# (column `expt_smry.xlsx::Shape` codes `WHIMs`.)
prnMDS(
  show_ids = FALSE,
  show_ellipses = TRUE,
  col_group = Shape, 
  k = 4, 
  dimension = 3,
  filename = d3_whims.png,
)


# toy example of finding the sample(s) that are 
# most different in large fold changes
prnMDS(
  show_ids = TRUE, 
  dist_co = log2(4),
  filename = where_are_the_large_diffs.png,
)


## custom theme
library(ggplot2)
my_mds_theme <- theme_bw() + theme(
  axis.text.x  = element_text(angle=0, vjust=0.5, size=16),
  axis.text.y  = element_text(angle=0, vjust=0.5, size=16),
  axis.title.x = element_text(colour="black", size=18),
  axis.title.y = element_text(colour="black", size=18),
  plot.title = element_text(face="bold", colour="black", size=20, hjust=0.5, vjust=0.5),
  
  panel.grid.major.x = element_blank(),
  panel.grid.minor.x = element_blank(),
  panel.grid.major.y = element_blank(),
  panel.grid.minor.y = element_blank(),
  
  legend.key = element_rect(colour = NA, fill = 'transparent'),
  legend.background = element_rect(colour = NA,  fill = "transparent"),
  legend.title = element_blank(),
  legend.text = element_text(colour="black", size=14),
  legend.text.align = 0,
  legend.box = NULL
)

pepMDS(
  impute_na = FALSE,
  col_color = Color,
  col_shape = Shape,
  show_ids = FALSE,
  filter_peps_by = exprs(prot_n_pep >= 5),
  filter_by = exprs(`W16_vs_W2.pVal (W16-W2)` <= 1e-6), 
  theme = my_mds_theme,
  filename = my_theme.png,
)

## direct uses of ggplot2
library(ggplot2)
res <- prnMDS(filename = foo.png)

p_fil <- ggplot(res, aes(x = Coordinate.1, y = Coordinate.2)) +
  geom_point(aes(colour = Color, shape = Shape, alpha = Alpha), size = 4, stroke = 0.02) + 
  scale_alpha_manual(values = c(.5, .9)) + 
  stat_ellipse(aes(fill = Shape), geom = "polygon", alpha = .4) + 
  guides(fill = "none") + 
  labs(title = "", x = "Coordinate 1", y = "Coordinate 2") +
  coord_fixed() 

ggsave(file.path(dat_dir, "Protein/MDS/my_ggplot2_fil.png"), plot = p_fil)

## Not run: 
prnMDS(
  col_color = "column_key_not_existed",
  col_shape = "another_missing_column_key"
)  

## End(Not run)

qzhang503/proteoQ documentation built on April 13, 2025, 8:31 a.m.