prnLDA: LDA plots
In qzhang503/proteoQ: Processing and Informatic Analysis of Mass Spectrometric Data

pepLDA

R Documentation

LDA plots

Description

pepLDA visualizes the linear discriminant analysis (LDA) of peptide log2FC.

prnLDA visualizes the linear discriminant analysis (LDA) of protein log2FC.

Usage

pepLDA(
  col_select = NULL,
  col_group = NULL,
  col_color = NULL,
  col_fill = NULL,
  col_shape = NULL,
  col_size = NULL,
  col_alpha = NULL,
  color_brewer = NULL,
  fill_brewer = NULL,
  size_manual = NULL,
  shape_manual = NULL,
  alpha_manual = NULL,
  choice = c("lda"),
  scale_log2r = TRUE,
  complete_cases = FALSE,
  impute_na = FALSE,
  center_features = TRUE,
  scale_features = TRUE,
  show_ids = TRUE,
  show_ellipses = FALSE,
  type = c("obs", "feats"),
  method = c("moment", "mle", "mve"),
  dimension = 2,
  folds = 1,
  df = NULL,
  filepath = NULL,
  filename = NULL,
  theme = NULL,
  formula = NULL,
  data = NULL,
  x = NULL,
  grouping = NULL,
  prior = NULL,
  subset = NULL,
  CV = NULL,
  na.action = NULL,
  nu = NULL,
  ...
)

prnLDA(
  col_select = NULL,
  col_group = NULL,
  col_color = NULL,
  col_fill = NULL,
  col_shape = NULL,
  col_size = NULL,
  col_alpha = NULL,
  color_brewer = NULL,
  fill_brewer = NULL,
  size_manual = NULL,
  shape_manual = NULL,
  alpha_manual = NULL,
  choice = c("lda"),
  scale_log2r = TRUE,
  complete_cases = FALSE,
  impute_na = FALSE,
  center_features = TRUE,
  scale_features = TRUE,
  show_ids = TRUE,
  show_ellipses = FALSE,
  type = c("obs", "feats"),
  method = c("moment", "mle", "mve"),
  dimension = 2,
  folds = 1,
  df = NULL,
  filepath = NULL,
  filename = NULL,
  theme = NULL,
  formula = NULL,
  data = NULL,
  x = NULL,
  grouping = NULL,
  prior = NULL,
  subset = NULL,
  CV = NULL,
  na.action = NULL,
  nu = NULL,
  ...
)

Arguments

`col_select`	Character string to a column key in `expt_smry.xlsx`. At the `NULL` default, the column key of `Select` in `expt_smry.xlsx` will be used. In the case of no samples being specified under `Select`, the column key of `Sample_ID` will be used. The non-empty entries under the selected column will be used in the indicated analysis.
`col_group`	Character string to a column key in `expt_smry.xlsx`. Samples corresponding to non-empty entries under `col_group` will be used for sample grouping in the indicated analysis. At the NULL default, the column key `Group` will be used. No data annotation by groups will be performed if the fields under the indicated group column are empty.
`col_color`	Character string to a column key in `expt_smry.xlsx`. The values under the column will be used for the `color` aesthetics in plots. At the NULL default, the column key `Color` will be used. If NA, the aesthetics will be bypassed (a means to skip the look-up of column `Color` and to handle duplication in aesthetics).
`col_fill`	Character string to a column key in `expt_smry.xlsx`. The values under the column will be used for the `fill` aesthetics in plots. At the NULL default, the column key `Fill` will be used. If NA, the aesthetics will be bypassed (a means to skip the look-up of column `Fill` and to handle duplication in aesthetics).
`col_shape`	Character string to a column key in `expt_smry.xlsx`. The values under the column will be used for the `shape` aesthetics in plots. At the NULL default, the column key `Shape` will be used. If NA, the aesthetics will be bypassed (a means to skip the look-up of column `Shape` and to handle duplication in aesthetics).
`col_size`	Character string to a column key in `expt_smry.xlsx`. The values under the column will be used for the `size` aesthetics in plots. At the NULL default, the column key `Size` will be used. If NA, the aesthetics will be bypassed (a means to skip the look-up of column `Size` and to handle duplication in aesthetics).
`col_alpha`	Character string to a column key in `expt_smry.xlsx`. The values under the column will be used for the `alpha` (transparency) aesthetics in plots. At the NULL default, the column key `Alpha` will be used. If NA, the aesthetics will be bypassed (a means to skip the look-up of column `Alpha` and to handle duplication in aesthetics).
`color_brewer`	Character string to the name of a color brewer for use in ggplot2::scale_color_brewer, e.g., `color_brewer = Set1`. At the NULL default, the setting in `ggplot2` will be used.
`fill_brewer`	Character string to the name of a color brewer for use in ggplot2::scale_fill_brewer, e.g., `fill_brewer = Spectral`. At the NULL default, the setting in `ggplot2` will be used.
`size_manual`	Numeric vector to the scale of sizes for use in ggplot2::scale_size_manual, e.g., `size_manual = c(8, 12)`. At the NULL default, the setting in `ggplot2` will be used.
`shape_manual`	Numeric vector to the scale of shape IDs for use in ggplot2::scale_shape_manual, e.g., `shape_manual = c(5, 15)`. At the NULL default, the setting in `ggplot2` will be used.
`alpha_manual`	Numeric vector to the scale of transparency of objects for use in ggplot2::scale_alpha_manual, e.g., `alpha_manual = c(.5, .9)`. At the NULL default, the setting in `ggplot2` will be used.
`choice`	Character string; the LDA method in one of `c("lda")`. The default is "lda".
`scale_log2r`	Logical; if TRUE, adjusts `log2FC` to the same scale of standard deviation across all samples. The default is TRUE. At `scale_log2r = NA`, the raw `log2FC` without normalization will be used.
`complete_cases`	Logical; if TRUE, only cases that are complete with no missing values will be used. The default is FALSE.
`impute_na`	Logical; if TRUE, data with the imputation of missing values will be used. The default is FALSE.
`center_features`	Logical; if TRUE, centers log2FC at zero by features (proteins or peptides). The default is TRUE. Note the difference from data alignment with `method_align` in `standPrn` or `standPep`, where log2FC are aligned by observations (samples).
`scale_features`	Logical; if TRUE, scales log2FC to the same variance by features (protein or peptide entries). The default is TRUE. Note the difference from data scaling with `scale_log2r`, where log2FC are scaled by observations (samples).
`show_ids`	Logical; if TRUE, shows the sample IDs in LDA plots. The default is TRUE.
`show_ellipses`	Logical; if TRUE, shows the ellipses by sample groups according to `col_group`. The default is FALSE.
`type`	Character string indicating the type of analysis by either observations or features. At the `type = obs` default, observations (samples) are in rows and features (peptides or proteins) in columns, and the linear discriminants are plotted by observations. The alternative `type = feats`, with features in rows and observations in columns, is not available for LDA (see Examples).
`method`	Dummy argument to avoid incurring the corresponding argument in dist by partial argument matches.
`dimension`	Numeric; The desired dimension for pairwise visualization. The default is 2.
`folds`	Not currently used. Integer; the degree of folding data into subsets. The default is one without data folding.
`df`	The name of a primary data file. By default, it will be determined automatically after matching the types of data and analysis with an `id` among `c("pep_seq", "pep_seq_mod", "prot_acc", "gene")`. A primary file contains normalized peptide or protein data and is among `c("Peptide.txt", "Peptide_pVal.txt", "Peptide_impNA_pVal.txt", "Protein.txt", "Protein_pVal.txt", "protein_impNA_pVal.txt")`. For analyses that require the fields of significance p-values, the `df` will be one of `c("Peptide_pVal.txt", "Peptide_impNA_pVal.txt", "Protein_pVal.txt", "protein_impNA_pVal.txt")`.
`filepath`	A file path to output results. By default, it will be determined automatically by the name of the calling function and the value of `id` in the `call`.
`filename`	A representative file name for outputs. By default, the name(s) will be determined automatically. For text files, a typical file extension is `.txt`. Image files are typically saved via `ggsave` or `pheatmap`, where the image type is determined by the extension of the file name.
`theme`	A ggplot2 theme, e.g., theme_bw(), or a custom theme. At the NULL default, a system theme will be applied.
`formula`	Dummy argument to avoid incurring the corresponding argument in a pre-existing function by partial argument matches.
`data`	Dummy argument to avoid incurring the corresponding argument in a pre-existing function by partial argument matches.
`x`	Dummy argument to avoid incurring the corresponding argument in a pre-existing function by partial argument matches.
`grouping`	Dummy argument to avoid incurring the corresponding argument in a pre-existing function by partial argument matches.
`prior`	Dummy argument to avoid incurring the corresponding argument in a pre-existing function by partial argument matches.
`subset`	Dummy argument to avoid incurring the corresponding argument in a pre-existing function by partial argument matches.
`CV`	Dummy argument to avoid incurring the corresponding argument in a pre-existing function by partial argument matches.
`na.action`	Dummy argument to avoid incurring the corresponding argument in a pre-existing function by partial argument matches.
`nu`	Dummy argument to avoid incurring the corresponding argument in a pre-existing function by partial argument matches.
`...`	`filter_`: Variable argument statements for the row filtration against data in a primary file linked to `df`. See also `normPSM` for the format of `filter_` statements. `arrange_`: Variable argument statements for the row ordering against data in a primary file linked to `df`. See also `prnHM` for the format of `arrange_` statements. Additional parameters for `ggsave`: `width`, the width of plot; `height`, the height of plot; `...`

Details

The utility is a wrapper of lda.
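
As a rough illustration of what the wrapped call amounts to, the sketch below runs an LDA on a toy log2FC matrix after centering and scaling by features. It assumes that `lda` refers to `MASS::lda` (suggested by the dummy arguments `grouping`, `prior`, `CV` and `nu`, which match that function's signature); the toy data, group labels and object names are hypothetical and do not reflect proteoQ internals.

```r
# Illustrative sketch only; not the proteoQ implementation.
library(MASS)

set.seed(1)

# Hypothetical toy data: 5 features (rows) by 12 samples (columns), 3 groups.
grp <- factor(rep(c("A", "B", "C"), each = 4))
log2fc <- matrix(
  rnorm(5 * 12), nrow = 5,
  dimnames = list(paste0("prot_", 1:5), paste0("s", 1:12))
)
# Shift two features so that the groups differ.
log2fc[1, grp == "B"] <- log2fc[1, grp == "B"] + 2
log2fc[2, grp == "C"] <- log2fc[2, grp == "C"] - 2

# Observations in rows, features in columns (cf. `type = obs`);
# scale() then centers and scales each feature
# (cf. `center_features` and `scale_features`).
obs <- scale(t(log2fc), center = TRUE, scale = TRUE)

# Fit the LDA and extract the discriminant scores per sample.
fit <- lda(x = obs, grouping = grp)
scores <- predict(fit)$x

head(scores)
```

With three groups there are at most two discriminants, so `scores` carries the columns `LD1` and `LD2`, the same column names that the ggplot2 examples below read from `res$x`.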

Value

LDA plots.

Examples


# ===================================
# LDA
# ===================================

## !!!require the brief working example in `?load_expts`

## global option
scale_log2r <- TRUE

# peptides, all samples
# (implicit `col_group = Group`)
pepLDA(
  col_select = Select, 
  filter_peps_by = exprs(pep_n_psm >= 3),
  show_ids = FALSE, 
  filename = "peps_rowfil.png",
)

# peptides, samples under column `BI`
pepLDA(
  col_select = BI, 
  col_shape = Shape,   
  col_color = Alpha, 
  filter_peps_by = exprs(pep_n_psm >= 10),
  show_ids = FALSE, 
  filename = "peps_rowfil_colsel.png",
)

# proteins
prnLDA(
  col_color = Color,
  col_shape = Shape,
  show_ids = FALSE,
  filter_peps_by = exprs(prot_n_pep >= 5),
  filename = "prns_rowfil.png",
)

# subset by mean deviation values
# deviations to means may not be symmetric;
prnLDA(
  col_select = Select, 
  filter_peps_by = exprs(prot_mean_z >= -.25, prot_mean_z <= .3),
  show_ids = FALSE, 
  filename = "subset_by_mean_dev.png",
)

# proteins, custom palette
prnLDA(
  col_shape = Shape,
  color_brewer = Set1,
  show_ids = FALSE,
  filename = "my_palette.png",
)

## additional row filtration by pVals (proteins, impute_na = FALSE)
# if not yet, run prerequisite significance tests at `impute_na = FALSE`
pepSig(
  impute_na = FALSE, 
  W2_bat = ~ Term["(W2.BI.TMT2-W2.BI.TMT1)", 
                  "(W2.JHU.TMT2-W2.JHU.TMT1)", 
                  "(W2.PNNL.TMT2-W2.PNNL.TMT1)"],
  W2_loc = ~ Term_2["W2.BI-W2.JHU", 
                    "W2.BI-W2.PNNL", 
                    "W2.JHU-W2.PNNL"],
  W16_vs_W2 = ~ Term_3["W16-W2"], 
)

prnSig(impute_na = FALSE)

# (`W16_vs_W2.pVal (W16-W2)` now a column key)
prnLDA(
  col_color = Color,
  col_shape = Shape,
  show_ids = FALSE,
  filter_peps_by = exprs(prot_n_pep >= 5),
  filter_by = exprs(`W16_vs_W2.pVal (W16-W2)` <= 1e-6), 
  filename = pvalcutoff.png, 
)

# analogous peptides
pepLDA(
  col_color = Color,
  col_shape = Shape,
  show_ids = FALSE,
  filter_peps_by = exprs(prot_n_pep >= 5),
  filter_by = exprs(`W16_vs_W2.pVal (W16-W2)` <= 1e-6), 
  filename = pvalcutoff.png, 
)

## additional row filtration by pVals (proteins, impute_na = TRUE)
# if not yet, run prerequisite NA imputation
pepImp(m = 2, maxit = 2)
prnImp(m = 5, maxit = 5)

# if not yet, run prerequisite significance tests at `impute_na = TRUE`
pepSig(
  impute_na = TRUE, 
  W2_bat = ~ Term["(W2.BI.TMT2-W2.BI.TMT1)", 
                  "(W2.JHU.TMT2-W2.JHU.TMT1)", 
                  "(W2.PNNL.TMT2-W2.PNNL.TMT1)"],
  W2_loc = ~ Term_2["W2.BI-W2.JHU", 
                    "W2.BI-W2.PNNL", 
                    "W2.JHU-W2.PNNL"],
  W16_vs_W2 = ~ Term_3["W16-W2"], 
)

prnSig(impute_na = TRUE)

prnLDA(
  impute_na = TRUE,
  col_color = Color,
  col_shape = Shape,
  show_ids = FALSE,
  filter_peps_by = exprs(prot_n_pep >= 5),
  filter_by = exprs(`W16_vs_W2.pVal (W16-W2)` <= 1e-6), 
  filename = filpvals_impna.png, 
)

# analogous peptides
pepLDA(
  impute_na = TRUE,
  col_color = Color,
  col_shape = Shape,
  show_ids = FALSE,
  filter_peps_by = exprs(prot_n_pep >= 5),
  filter_by = exprs(`W16_vs_W2.pVal (W16-W2)` <= 1e-6), 
  filename = filpvals_impna.png,
)

## a higher dimension
pepLDA(
  show_ids = FALSE,
  dimension = 3,
  filename = d3.pdf,
)

prnLDA(
  show_ids = TRUE,
  dimension = 3,
  filename = d3.png,
)

# show ellipses
# (column `expt_smry.xlsx::Color` codes `labs`.)
prnLDA(
  show_ids = FALSE,
  show_ellipses = TRUE,
  col_group = Color, 
  dimension = 3,
  filename = d3_labs.png,
)

# (column `expt_smry.xlsx::Shape` codes `WHIMs`.)
prnLDA(
  show_ids = FALSE,
  show_ellipses = TRUE,
  col_group = Shape, 
  dimension = 3,
  filename = d3_whims.png,
)

## custom theme
library(ggplot2)
my_theme <- theme_bw() + theme(
  axis.text.x  = element_text(angle=0, vjust=0.5, size=20),
  axis.text.y  = element_text(angle=0, vjust=0.5, size=20),
  axis.title.x = element_text(colour="black", size=20),
  axis.title.y = element_text(colour="black", size=20),
  plot.title = element_text(face="bold", colour="black", size=20, hjust=0.5, vjust=0.5),
  
  panel.grid.major.x = element_blank(),
  panel.grid.minor.x = element_blank(),
  panel.grid.major.y = element_blank(),
  panel.grid.minor.y = element_blank(),
  
  legend.key = element_rect(colour = NA, fill = 'transparent'),
  legend.background = element_rect(colour = NA,  fill = "transparent"),
  legend.title = element_blank(),
  legend.text = element_text(colour="black", size=14),
  legend.text.align = 0,
  legend.box = NULL
)

pepLDA(
  impute_na = TRUE,
  col_color = Color,
  col_shape = Shape,
  show_ids = FALSE,
  filter_peps_by = exprs(prot_n_pep >= 5),
  filter_by = exprs(`W16_vs_W2.pVal (W16-W2)` <= 1e-6), 
  theme = my_theme, 
  filename = my_theme.png,
)

## direct uses of ggplot2
library(ggplot2)
res <- prnLDA(filename = foo.png)

# names(res)

p <- ggplot(res$x, aes(x = LD1, y = LD2)) +
  geom_point(aes(colour = Color, shape = Shape, alpha = Alpha), size = 4, stroke = 0.02) + 
  stat_ellipse(aes(colour = Shape), linetype = 2) + 
  labs(title = "", 
       x = "LD1", 
       y = "LD2") +
  coord_fixed() + 
  geom_text(aes(label = Sample_ID), color = "gray", size = 1)

ggsave(file.path(dat_dir, "Protein/LDA/my_ggplot2.png"), plot = p)

p_fil <- ggplot(res$x, aes(LD1, LD2)) +
  geom_point(aes(colour = Color, shape = Shape, alpha = Alpha), size = 4, stroke = 0.02) + 
  stat_ellipse(aes(fill = Shape), geom = "polygon", alpha = .4) + 
  labs(title = "", 
       x = "LD1", 
       y = "LD2") +
  coord_fixed() 

ggsave(file.path(dat_dir, "Protein/LDA/my_ggplot2_fil.png"), plot = p_fil)

## Not run: 
pepLDA(
  col_group = sample_ids_other_than_groups,
  col_select = Select, 
  filter_peps_by = exprs(pep_n_psm >= 3),
  show_ids = FALSE, 
  filename = "peps_rowfil.png",
)
  
# by features not available
prnLDA(
  type = feats,
  scale_log2r = TRUE,
  filename = "by_feats.png",
)

## End(Not run)

qzhang503/proteoQ documentation built on April 13, 2025, 8:31 a.m.
