normMulGau: Data normalization
In qzhang503/proteoQ: Processing and Informatic Analysis of Mass Spectrometric Data

Description

normMulGau normalizes log2FC values by aligning their distributions across samples.

Usage

normMulGau(
  df,
  method_align = "MC",
  n_comp = NULL,
  seed = NULL,
  range_log2r = c(0, 100),
  range_int = c(0, 100),
  filepath = NULL,
  col_select = NULL,
  cut_points = Inf,
  is_prot_lfq = FALSE,
  ...
)

Arguments

`df`	An input data frame
`method_align`	Character string indicating the method for aligning `log2FC` across samples. `MC`: median-centering; `MGKernel`: the kernel density defined by multiple Gaussian functions (`normalmixEM`). At the `MC` default, the ratio profiles of each sample are aligned so that the median of `log2FC` is zero. At `MGKernel`, the ratio profiles of each sample are aligned so that the `log2FC` at the maximum of the kernel density is zero.
`n_comp`	Integer; the number of Gaussian components to be used with `method_align = MGKernel`. A typical value is 2 or 3. The variable `n_comp` overrides the argument `k` in `normalmixEM`.
`seed`	Integer; a seed for reproducible fitting at `method_align = MGKernel`.
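`range_log2r`	Numeric vector of length two. The argument specifies the percentile range of `log2FC` for use in the scaling normalization of standard deviation across samples. At the `c(0, 100)` default shown under Usage, the full range of values is used; `c(10, 90)`, for example, restricts the range to between the 10th and the 90th percentiles.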
`range_int`	Numeric vector of length two. The argument specifies the percentile range of the `intensity` of reporter ions (including `I000`) for use in the scaling normalization of standard deviation across samples. At the `c(0, 100)` default shown under Usage, the full range of values is used; `c(5, 95)`, for example, restricts the range to between the 5th and the 95th percentiles.
`filepath`	A file path to output results. By default, it will be determined automatically by the name of the calling function and the value of `id` in the `call`.
`col_select`	Character string naming a column key in `expt_smry.xlsx`. At the `NULL` default, the column key `Select` in `expt_smry.xlsx` will be used. If no samples are specified under `Select`, the column key `Sample_ID` will be used. The non-empty entries under the selected column will be used in the indicated analysis.
`cut_points`	A named numeric vector that defines the cut points (knots) in histograms. At the `Inf` default shown under Usage (or with `-Inf`), data binning is disabled. With `cut_points = c(mean_lint = NA)`, the cut points correspond to the quantile values under column `mean_lint` (mean log10 intensity) of the input data. Values of `log2FC` are then binned from `-Inf` to `Inf` according to the cut points. The binning of `log2FC` can also be based on a different numeric column, e.g., `cut_points = c(prot_icover = seq(.25, .75, .25))`. See also `mergePep` for data alignment with binning.
`is_prot_lfq`	Logical; whether the input is protein LFQ data. With LFQ, about half of the protein intensity values can be missing and are imputed with small values. This typically causes bimodality in the protein `log2FC` distributions, which needs special handling, particularly at `method_align = "MC"`.
`...`	`filter_`: Variable argument statements for the row filtration of data against the column keys in `Peptide.txt` for peptides or `Protein.txt` for proteins. Each statement contains a list of logical expression(s). The `lhs` needs to start with `filter_`. The logical condition(s) at the `rhs` need to be enclosed in `exprs` with round parentheses. For example, `pep_len` is a column key in `Peptide.txt`. The statement `filter_peps_at = exprs(pep_len <= 50)` will remove peptide entries with `pep_len > 50`. See also `normPSM`. Additional parameters for plotting with `ggplot2`: `xmin`, the minimum `x` at a log2 scale; the default is -2. `xmax`, the maximum `x` at a log2 scale; the default is +2. `xbreaks`, the breaks on the `x`-axis at a log2 scale; the default is 1. `binwidth`, the bin width of `log2FC`; the default is `(xmax - xmin)/80`. `ncol`, the number of columns; the default is 1. `width`, the width of the plot. `height`, the height of the plot. `scales`, whether the scales are fixed across panels; the default is "fixed" and the alternative is "free".
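
The percentile windows (`range_log2r`, `range_int`) and the quantile-based `cut_points` described above can be pictured with a small base R sketch. This is only an illustration under stated assumptions: the toy vectors and the helper `sd_in_range` are hypothetical, and the documentation does not state how proteoQ carries out these steps internally.

```r
# Hypothetical toy values for one sample (not taken from proteoQ).
log2fc    <- c(-3.0, -0.4, -0.2, -0.1, 0.0, 0.1, 0.2, 0.3, 0.5, 4.0)
mean_lint <- c(4.1, 4.8, 5.0, 5.3, 5.5, 5.9, 6.2, 6.4, 6.9, 7.5)

# Hypothetical helper: standard deviation of the values that fall inside
# a percentile window, e.g. c(10, 90) keeps the 10th to 90th percentiles.
sd_in_range <- function(x, range_pct = c(0, 100)) {
  lim <- stats::quantile(x, probs = range_pct / 100, na.rm = TRUE)
  sd(x[x >= lim[1] & x <= lim[2]], na.rm = TRUE)
}

sd_full    <- sd_in_range(log2fc, c(0, 100))  # all values, extremes included
sd_trimmed <- sd_in_range(log2fc, c(10, 90))  # extremes excluded

# Assign each row to an intensity bin delimited by -Inf, the quantile
# cut points of mean_lint, and Inf.
cuts <- stats::quantile(mean_lint, probs = c(0.25, 0.50, 0.75))
bins <- cut(mean_lint, breaks = c(-Inf, cuts, Inf))
table(bins)
```

With a window narrower than `c(0, 100)`, a few extreme ratios no longer dominate the standard deviation estimate, which is presumably why such windows are offered for the scaling step.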

Details

When executed with mergePep or linkPep2Prn, method_align is always MC. As a result, peptide or protein data are first median-centered.
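
As a minimal sketch of what median-centering means here, the following base R code subtracts each sample's median `log2FC` from its own column. The data frame and the helper `median_center` are hypothetical and are not part of proteoQ.

```r
# Hypothetical toy data: log2FC for two samples (columns), five features (rows).
log2fc <- data.frame(
  sample_a = c(0.5, 1.0, 1.5, 2.0, 2.5),
  sample_b = c(-1.2, -0.7, -0.2, 0.3, NA)
)

# Hypothetical helper: subtract the column median from every value,
# ignoring missing values when the median is computed.
median_center <- function(df) {
  as.data.frame(lapply(df, function(x) x - median(x, na.rm = TRUE)))
}

centered <- median_center(log2fc)

# Each column of `centered` now has a median of zero, up to rounding.
vapply(centered, median, numeric(1), na.rm = TRUE)
```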

It is then up to standPep or standPrn for alternative choices in method_align, col_select etc.
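
To illustrate how aligning at a density maximum differs from median-centering, the sketch below shifts simulated `log2FC` values so that the peak of a kernel density estimate sits at zero. It deliberately substitutes base R `density()` for the multi-Gaussian fit (`normalmixEM`) named under `method_align`, so it is a simplified stand-in and not the package's implementation. The simulated data and the helper `density_mode` are assumptions.

```r
set.seed(1)  # arbitrary seed so that the simulated values are reproducible

# Hypothetical log2FC values: a dominant population near 0.4 and a
# smaller one near -1.5.
x <- c(rnorm(800, mean = 0.4, sd = 0.3), rnorm(200, mean = -1.5, sd = 0.5))

# Hypothetical helper: the value at which the estimated density is highest.
density_mode <- function(v) {
  d <- stats::density(v[is.finite(v)])
  d$x[which.max(d$y)]
}

shift     <- density_mode(x)
x_aligned <- x - shift

# After the shift, the density peak sits at approximately zero.
# The median of the shifted values need not be zero, because the
# smaller population still influences it.
density_mode(x_aligned)
median(x_aligned)
```

When one population dominates, the density peak follows that population, while the median is drawn toward the minor one. That difference is the practical reason for choosing between the two alignment methods.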

Value

A data frame.

qzhang503/proteoQ documentation built on April 13, 2025, 8:31 a.m.
