normMulGau | R Documentation |
Data normalization
Description
normMulGau normalizes log2FC.
Usage
normMulGau(
df,
method_align = "MC",
n_comp = NULL,
seed = NULL,
range_log2r = c(0, 100),
range_int = c(0, 100),
filepath = NULL,
col_select = NULL,
cut_points = Inf,
is_prot_lfq = FALSE,
...
)
Arguments
df |
An input data frame
|
method_align |
Character string indicating the method for aligning
log2FC across samples. MC: median-centering; MGKernel:
the kernel density defined by multiple Gaussian functions
(normalmixEM). At the MC default, the ratio
profiles of each sample will be aligned so that the medians of the
log2FC are zero. At MGKernel, the ratio profiles of each
sample will be aligned so that the log2FC at the maxima of the kernel
density are zero.
|
n_comp |
Integer; the number of Gaussian components to be used with
method_align = MGKernel. A typical value is 2 or 3. The variable
n_comp overrides the argument k in
normalmixEM.
|
seed |
Integer; a seed for reproducible fitting at method_align =
MGKernel.
|
range_log2r |
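Numeric vector of length two. The argument specifies the
range of the log2FC, in percentiles, for use in the scaling normalization
of standard deviation across samples. With the default c(0, 100) shown
under Usage, the full range of values is used; c(10, 90), for example,
restricts the range to values between the 10th and the 90th percentiles.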
|
range_int |
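Numeric vector of length two. The argument specifies the
range of the intensity of reporter ions (including I000), in percentiles,
for use in the scaling normalization of standard deviation across samples.
With the default c(0, 100) shown under Usage, the full range of values is
used; c(5, 95), for example, restricts the range to values between the 5th
and the 95th percentiles.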
|
filepath |
A file path to output results. By default, it will be
determined automatically by the name of the calling function and the value
of id in the call.
|
col_select |
Character string to a column key in expt_smry.xlsx.
At the NULL default, the column key of Select in
expt_smry.xlsx will be used. In the case of no samples being
specified under Select, the column key of Sample_ID will be
used. The non-empty entries under the ascribing column will be used in
the indicated analysis.
|
cut_points |
A named, numeric vector that defines the cut points (knots) in
histograms. For example, at cut_points = c(mean_lint = NA), the
cut points correspond to the quantile values under column mean_lint
(mean log10 intensity) of the input data. Values of log2FC will then be
binned from -Inf to Inf according to the cut points. To disable
data binning, set cut_points = Inf or -Inf; Inf is the default shown
under Usage. The binning of log2FC can also be achieved through a
different numeric column, e.g., cut_points = c(prot_icover = seq(.25, .75, .25)).
See also mergePep for data alignment with binning.
|
is_prot_lfq |
Logical; whether the data are protein LFQ data. About half of the
protein intensity values can be missing with LFQ and are imputed with small
values. This typically causes bimodality in protein log2FC distributions
and needs to be handled specially at method_align = "MC".
|
... |
filter_: Variable argument statements for the row
filtration of data against the column keys in Peptide.txt for
peptides or Protein.txt for proteins. Each statement contains a
list of logical expression(s). The lhs needs to start with
filter_. The logical condition(s) at the rhs need to be
enclosed in exprs with round parentheses. For example,
pep_len is a column key in Peptide.txt. The statement
filter_peps_at = exprs(pep_len <= 50) will remove peptide entries
with pep_len > 50. See also normPSM.
Additional parameters for plotting with ggplot2: xmin,
the minimum x at a log2 scale; the default is -2. xmax,
the maximum x at a log2 scale; the default is +2. xbreaks,
the breaks in x-axis at a log2 scale; the default is 1.
binwidth, the binwidth of log2FC; the default is (xmax -
xmin)/80. ncol, the number of columns; the default is 1.
width, the width of plot; height, the height of plot.
scales, should the scales be fixed across panels; the default is
"fixed" and the alternative is "free".
|
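The binning described under cut_points can be illustrated with a minimal base R sketch. It is not the internal code of normMulGau: the toy data frame, the choice of quartiles as knots and the column names bin and log2FC_aligned are assumptions made for illustration only.

```r
# Hypothetical toy data; the column names mean_lint and log2FC mirror
# those mentioned in the argument descriptions.
set.seed(1)
df <- data.frame(
  mean_lint = runif(200, min = 4, max = 8),      # mean log10 intensity
  log2FC    = rnorm(200, mean = 0.3, sd = 0.5)
)

# Assumption: knots are taken at the quartiles of mean_lint.
knots <- quantile(df$mean_lint, probs = c(0.25, 0.5, 0.75), names = FALSE)

# Bin rows from -Inf to Inf according to the knots.
df$bin <- cut(df$mean_lint, breaks = c(-Inf, knots, Inf))

# Median-center log2FC separately within each bin.
df$log2FC_aligned <- df$log2FC - ave(df$log2FC, df$bin, FUN = median)

# Per-bin medians after alignment; each should be zero up to rounding.
bin_medians <- tapply(df$log2FC_aligned, df$bin, median)
```

Setting the breaks to c(-Inf, Inf) in this sketch yields a single bin, which corresponds to the case of no binning.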
Details
When executed with mergePep or linkPep2Prn, the method_align is always MC. As a result, peptide or protein data are at first median-centered.
It is then up to standPep or standPrn for alternative choices in method_align, col_select, etc.
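As a rough illustration of the median-centering step, the following base R sketch subtracts each sample's median log2FC from that sample's values. The matrix log2fc, its sample names and the simulated values are hypothetical, and the code does not reproduce how normMulGau handles its input data frame.

```r
# Hypothetical log2FC matrix: rows are peptides, columns are samples.
set.seed(42)
log2fc <- cbind(
  sample_A = rnorm(100, mean =  0.4, sd = 0.6),
  sample_B = rnorm(100, mean = -0.2, sd = 0.5),
  sample_C = rnorm(100, mean =  0.1, sd = 0.7)
)
log2fc[sample(length(log2fc), 15)] <- NA   # introduce some missing values

# Subtract each sample's median so that every column is centered at zero.
col_medians <- apply(log2fc, 2, median, na.rm = TRUE)
log2fc_mc   <- sweep(log2fc, 2, col_medians, FUN = "-")

# Medians after centering; each should be zero up to rounding.
centered_medians <- apply(log2fc_mc, 2, median, na.rm = TRUE)
```

Missing values are left as NA and are ignored when the medians are computed.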
Value
A data frame.