prnHist: Histogram visualization
In qzhang503/proteoQ: Processing and Informatic Analysis of Mass Spectrometrirc Data

pepHist

R Documentation

Histogram visualization

Description

pepHist plots the histograms of peptide log2FC.

prnHist plots the histograms of protein log2FC.

Usage

pepHist(
  col_select = NULL,
  scale_log2r = TRUE,
  complete_cases = FALSE,
  cut_points = c(mean_lint = NA),
  show_curves = TRUE,
  show_vline = TRUE,
  scale_y = TRUE,
  df = NULL,
  filepath = NULL,
  filename = NULL,
  theme = NULL,
  ...
)

prnHist(
  col_select = NULL,
  scale_log2r = TRUE,
  complete_cases = FALSE,
  cut_points = c(mean_lint = NA),
  show_curves = TRUE,
  show_vline = TRUE,
  scale_y = TRUE,
  df = NULL,
  filepath = NULL,
  filename = NULL,
  theme = NULL,
  ...
)

Arguments

`col_select`	Character string to a column key in `expt_smry.xlsx`. At the `NULL` default, the column key of `Select` in `expt_smry.xlsx` will be used. If no samples are specified under `Select`, the column key of `Sample_ID` will be used. The non-empty entries under the selected column will be used in the indicated analysis.
`scale_log2r`	Logical; if TRUE, adjusts `log2FC` to the same scale of standard deviation across all samples. The default is TRUE. At `scale_log2r = NA`, the raw `log2FC` without normalization will be used.
`complete_cases`	Logical; if TRUE, only cases that are complete with no missing values will be used. The default is FALSE.
`cut_points`	A named, numeric vector that defines the cut points (knots) in histograms. The default is `cut_points = c(mean_lint = NA)` where the cut points correspond to the quantile values under column `mean_lint` (mean log10 intensity) of input data. Values of `log2FC` will then be binned from `-Inf` to `Inf` according to the cut points. To disable data binning, set `cut_points = Inf` or `-Inf`. The binning of `log2FC` can also be achieved through a different numeric column, e.g., `cut_points = c(prot_icover = seq(.25, .75, .25))`. See also `mergePep` for data alignment with binning.
`show_curves`	Logical; if TRUE, shows the fitted curves. At the TRUE default, the curve parameters are based on the latest call to `standPep` or `standPrn` with `method_align = MGKernel`. This feature can inform the effects of data filtration on the alignment of `log2FC` profiles. See also `standPep` and `standPrn` for more examples.
`show_vline`	Logical; if TRUE, shows the vertical lines at `x = 0`. The default is TRUE.
`scale_y`	Logical; if TRUE, scales data on the `y-axis`. The default is TRUE.
`df`	The name of a primary data file. By default, it will be determined automatically after matching the types of data and analysis with an `id` among `c("pep_seq", "pep_seq_mod", "prot_acc", "gene")`. A primary file contains normalized peptide or protein data and is among `c("Peptide.txt", "Peptide_pVal.txt", "Peptide_impNA_pVal.txt", "Protein.txt", "Protein_pVal.txt", "protein_impNA_pVal.txt")`. For analyses that require the fields of significance p-values, the `df` will be one of `c("Peptide_pVal.txt", "Peptide_impNA_pVal.txt", "Protein_pVal.txt", "protein_impNA_pVal.txt")`.
`filepath`	A file path to output results. By default, it will be determined automatically by the name of the calling function and the value of `id` in the `call`.
`filename`	A representative file name for outputs. By default, the name(s) will be determined automatically. For text files, a typical file extension is `.txt`. Image files are typically saved via `ggsave` or `pheatmap`, where the image type is determined by the extension of the file name.
`theme`	A ggplot2 theme, e.g., theme_bw(), or a custom theme. At the NULL default, a system theme will be applied.
`...`	`filter_`: Variable argument statements for the row filtration of data against the column keys in `Peptide.txt` for peptides or `Protein.txt` for proteins. Each statement contains a list of logical expression(s). The `lhs` needs to start with `filter_`. The logical condition(s) at the `rhs` need to be enclosed in `exprs` with round parentheses. For example, `pep_len` is a column key in `Peptide.txt`. The statement `filter_peps_at = exprs(pep_len <= 50)` will remove peptide entries with `pep_len > 50`. See also `normPSM`. Additional parameters for plotting with `ggplot2`: `xmin`, the minimum `x` at a log2 scale; the default is -2. `xmax`, the maximum `x` at a log2 scale; the default is +2. `xbreaks`, the breaks in `x`-axis at a log2 scale; the default is 1. `binwidth`, the binwidth of `log2FC`; the default is `(xmax - xmin)/80`. `ncol`, the number of columns; the default is 1. `width`, the width of plot; `height`, the height of plot. `scales`, whether the scales are fixed across panels; the default is "fixed" and the alternative is "free".
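
The row-filtration semantics of a `filter_` statement can be illustrated with a small base R sketch. This is not how proteoQ implements filtration: `quote()` stands in for `exprs()`, and both the data frame `peps` and the helper `filter_rows` are hypothetical names made up for this illustration.

```r
# Hypothetical peptide table; `pep_len` mirrors the column key named above.
peps <- data.frame(
  pep_seq = c("AAK", "LLLR", "GGGGK"),
  pep_len = c(12L, 50L, 63L),
  stringsAsFactors = FALSE
)

# Hypothetical helper: evaluate a quoted logical condition against the
# columns of `df` and keep only the rows where it is TRUE.
filter_rows <- function(df, cond) {
  keep <- eval(cond, envir = df)
  df[!is.na(keep) & keep, , drop = FALSE]
}

# Analogous in spirit to `filter_peps_at = exprs(pep_len <= 50)`:
# rows with pep_len > 50 are removed.
kept <- filter_rows(peps, quote(pep_len <= 50))
print(kept$pep_len)
```

Rows are retained when the condition holds, so the statement reads as "keep peptides with `pep_len <= 50`".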

Details

In the histograms, the log2FC under each TMT channel are color-coded by their contributing reporter-ion or LFQ intensity.
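
The color coding follows from the intensity binning described under `cut_points`. The base R sketch below shows one way such binning could work on synthetic data. The quartile probabilities and the simulated values are assumptions made for illustration, and the column name `Int_index` is borrowed from the ggplot2 example in this page rather than from any documented internals.

```r
set.seed(1)  # synthetic data, for illustration only

dat <- data.frame(
  log2FC    = rnorm(200, mean = 0, sd = 0.5),
  mean_lint = runif(200, min = 3, max = 7)
)

# Quantile-based knots (assumed here to be the quartiles of `mean_lint`),
# padded with -Inf and Inf so that every value falls into a bin.
knots  <- quantile(dat$mean_lint, probs = seq(0.25, 0.75, 0.25), na.rm = TRUE)
breaks <- c(-Inf, unname(knots), Inf)

# Assign each row to an intensity bin; the bin label can then drive the
# fill color of the stacked histogram of log2FC.
dat$Int_index <- cut(dat$mean_lint, breaks = breaks, include.lowest = TRUE)

print(table(dat$Int_index))
```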

Value

Histograms of log2FC; raw histogram data: [...]_raw.txt; fitted data for curves: [...]_fitted.txt

Examples


# ===================================
# Histogram
# ===================================

## !!! requires the brief working example in `?load_expts`

## exemplary `MGKernel` alignment
standPep(
  method_align = MGKernel, 
  n_comp = 3, 
  seed = 749662, 
  maxit = 200, 
  epsilon = 1e-05, 
)

standPrn(
  method_align = MGKernel, 
  n_comp = 2, 
  seed = 749662, 
  maxit = 200, 
  epsilon = 1e-05, 
)

## (1) effects of data scaling
# peptide without log2FC scaling
pepHist(scale_log2r = FALSE)

# with scaling
pepHist(scale_log2r = TRUE)

## (2) sample column selection
# sample IDs indicated under column `Select` in `expt_smry.xlsx`
pepHist(col_select = Select, filename = colsel.png)

# protein data for samples under column `W2` in `expt_smry.xlsx`
prnHist(col_select = W2, filename = w2.png)

## (3) row filtration of data
# exclude oxidized methionine or deamidated asparagine
pepHist(
  # filter_by = exprs(!grepl("[mn]", pep_seq_mod)),
  filter_by = exprs(not_contain_chars_in("mn", pep_seq_mod)),
  filename = "no_mn.png",
)

# phosphopeptide subset (error message if no matches)
pepHist(
  filter_peps = exprs(contain_chars_in("sty", pep_seq_mod)), 
  scale_y = FALSE, 
  filename = phospho.png,
)

# or use `grepl` directly
pepHist(
  filter_by = exprs(grepl("[sty]", pep_seq_mod)),
  filename = same_phospho.png,
)

## (4) between lead and lag
# leading profiles
pepHist(
  filename = lead.png,
)

# lagging profiles at
#   (1) n_psm >= 10
#   (2) and no methionine oxidation or asparagine deamidation
pepHist(
  filter_peps_by_npsm = exprs(pep_n_psm >= 10),
  filter_peps_by_mn = exprs(not_contain_chars_in("mn", pep_seq_mod)),
  filename = lag.png,
)

## (5) Data binning by `prot_icover`
pepHist(
  cut_points = c(prot_icover = NA),
  filename = prot_icover_coded.png,
)

## (6) custom theme
library(ggplot2)
my_histo_theme <- theme_bw() + theme(
  axis.text.x  = element_text(angle=0, vjust=0.5, size=18),
  axis.ticks.x  = element_blank(), # x-axis ticks
  axis.text.y  = element_text(angle=0, vjust=0.5, size=18),
  axis.title.x = element_text(colour="black", size=24),
  axis.title.y = element_text(colour="black", size=24),
  plot.title = element_text(colour="black", size=24, hjust=.5, vjust=.5),
  
  strip.text.x = element_text(size = 18, colour = "black", angle = 0),
  strip.text.y = element_text(size = 18, colour = "black", angle = 90),
  
  panel.grid.major.x = element_blank(),
  panel.grid.minor.x = element_blank(),
  panel.grid.major.y = element_blank(),
  panel.grid.minor.y = element_blank(),
  
  legend.key = element_rect(colour = NA, fill = 'transparent'),
  legend.background = element_rect(colour = NA,  fill = "transparent"),
  legend.title = element_blank(),
  legend.text = element_text(colour="black", size=18),
  legend.text.align = 0,
  legend.box = NULL
)

pepHist(
  theme = my_histo_theme,
  filename = my_theme.png,
)

pepHist(
  col_select = BI_1,
  theme = theme_dark(),
  filename = bi1_dark.png,
)


## (7) direct uses of ggplot2
library(ggplot2)
res <- pepHist(filename = default.png)

# names(res)

p <- ggplot() +
  geom_histogram(data = res$raw, aes(x = value, y = ..count.., fill = Int_index),
                 color = "white", alpha = .8, binwidth = .05, size = .1) +
  scale_fill_brewer(palette = "Spectral", direction = -1) +
  labs(title = "", x = expression("Ratio (" * log[2] * ")"), y = expression("Frequency")) +
  scale_x_continuous(limits = c(-2, 2), breaks = seq(-2, 2, by = 1),
                     labels = as.character(seq(-2, 2, by = 1))) +
  scale_y_continuous(limits = NULL) + 
  facet_wrap(~ Sample_ID, ncol = 5, scales = "fixed") # + 
  # my_histo_theme

p <- p + 
  geom_line(data = res$fitted, mapping = aes(x = x, y = value, colour = variable), size = .2) +
  scale_colour_manual(values = c("gray", "gray", "gray", "black"), name = "Gaussian",
                      breaks = c(c("G1", "G2", "G3"), paste(c("G1", "G2", "G3"), collapse = " + ")),
                      labels = c("G1", "G2", "G3", "G1 + G2 + G3"))

p <- p + geom_vline(xintercept = 0, size = .25, linetype = "dashed")

# pass the plot explicitly rather than relying on last_plot()
ggsave(file.path(dat_dir, "Peptide/Histogram/my_ggplot2.png"), 
       plot = p, width = 22, height = 48, limitsize = FALSE)

## Not run: 
# sample selection
pepHist(
  col_select = "a_column_key_not_in_`expt_smry.xlsx`",
)

# data filtration
pepHist(
  filter_by = exprs(!grepl("[m]", a_column_key_not_in_data_table)),
)

prnHist(
  lhs_not_start_with_filter_ = exprs(n_psm >= 5),
)  

## End(Not run)
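
The legend labels in example (7), `G1`, `G2`, `G3` and `G1 + G2 + G3`, suggest that the composite curve is the sum of weighted Gaussian components. The sketch below illustrates that idea with base R only. The weights, means and standard deviations are hypothetical and are not taken from any `standPep` or `standPrn` fit.

```r
# Hypothetical mixture parameters (weights sum to 1); not from a real fit.
lambda <- c(0.6, 0.3, 0.1)
mu     <- c(0.0, -0.4, 0.5)
sigma  <- c(0.25, 0.40, 0.60)

step <- 0.01
x <- seq(-4, 4, by = step)

# One column per weighted component density, named as in the plot legend.
comps <- sapply(seq_along(lambda), function(k) {
  lambda[k] * dnorm(x, mean = mu[k], sd = sigma[k])
})
colnames(comps) <- c("G1", "G2", "G3")

# The composite curve is the row-wise sum of the weighted components.
total <- rowSums(comps)

# Riemann-sum check: the composite density integrates to roughly 1.
area <- sum(total) * step
print(round(area, 3))
```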

qzhang503/proteoQ documentation built on April 13, 2025, 8:31 a.m.
