multistep_normalize: Normalize numeric data in multiple steps
In ParkerICI/kumquat: Identification of stratifying subpopulations in Flow Cytometry Data (based on the Citrus package)

This function normalizes numeric data in a table in multiple steps, based on groups defined by different combinations of categorical variables. The table is assumed to be in molten format (e.g. see reshape2::melt), with variable and value columns identifying observations. The function normalizes the data by the value identified by one categorical variable, then normalizes the result again by the value identified by another, and so on (see Details below).

At each step of the normalization the table is grouped using variable.var, subject.col and all the columns in names(norm.template). After this grouping, every group may contain only one row for the value of the current grouping variable that has been selected as the basis for normalization. In other words, the function will not normalize a vector of values by another vector of values; it only normalizes a vector by a single number. This restriction prevents the result from depending on the ordering of the table.

multistep_normalize(
  tab,
  norm.template,
  subject.col,
  variable.var = "variable",
  value.var = "value",
  remove.normalization.baseline = TRUE
)

`tab`	The input `data.frame`. See the details for assumptions about its structure.
`norm.template`	A named list identifying which categorical variables should be used to group data for normalization. Each value in the list is the value of the corresponding variable that identifies the rows used as reference for normalization at that step. The data will be normalized in the order specified by this list (i.e. data will be normalized according to the first variable, then again according to the second, etc.)
`remove.normalization.baseline`	If `TRUE`, the observations that have been used as baseline for normalization are removed at each normalization step.
`subject.col`	The name of the column that identifies different subjects in `tab`. All normalization operations are done within the subgroups identified by this variable (i.e. data will never be normalized across subsets identified by different values of `subject.col`).

An example should help clarify how this function works. Assume you have a dataset where different variables have been measured for multiple subjects, under different stimulation conditions, and at different timepoints. For each variable you want the data at each timepoint to be normalized by the value in the "unstim" condition. Then you want this data to be further normalized by the value at the "baseline" timepoint. Assume tab is in molten format and has the following columns:

variable: identifies the variable
value: the corresponding value of the variable
timepoint: categorical variable that identifies the timepoint
condition: categorical variable that identifies the condition
subject: categorical variable that identifies data belonging to the same subject (all the normalization is done within subject)
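
A table with this layout could be built as in the minimal sketch below. The variable name "CD4", the subject ID "S1", the "stim" and "week1" levels and all numeric values are made up for illustration; only the column names and the "unstim" and "baseline" levels come from the description above.

```r
# Hypothetical toy data in molten format: one variable, one subject,
# two conditions x two timepoints. All values are invented.
tab <- expand.grid(
  variable  = "CD4",
  subject   = "S1",
  condition = c("unstim", "stim"),
  timepoint = c("baseline", "week1"),
  stringsAsFactors = FALSE
)
tab$value <- c(10, 20, 5, 15)

# One row per (variable, subject, condition, timepoint) combination
print(tab)
```

Each combination of variable, subject, condition and timepoint appears exactly once, which is what lets a pair of categorical values single out one reference row per group.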

To achieve the result described above, you would invoke this function as multistep_normalize(tab, list(condition = "unstim", timepoint = "baseline"), "subject"). Note that the function would fail if you specified only a single variable (either condition or timepoint), because a single variable is not enough to identify a single value for normalization: there are multiple conditions for each timepoint and vice versa.
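
The uniqueness requirement behind this failure can be checked on a table before calling the function. The sketch below is not part of kumquat: `count_baselines` is a hypothetical base-R helper, and the toy table uses invented values. It assumes the grouping described in the Description (the variable column, the subject column and every column named in the template) and counts how many candidate baseline rows each group holds for one step. It does not perform any normalization.

```r
# Hypothetical helper (not part of kumquat): for one normalization step,
# count the candidate baseline rows in each group.
count_baselines <- function(tab, template, step,
                            subject.col, variable.var = "variable") {
  # Rows that would serve as the reference for this step
  ref <- tab[tab[[step]] == template[[step]], , drop = FALSE]
  # Group by variable, subject and every column named in the template
  keys <- unique(c(variable.var, subject.col, names(template)))
  as.vector(table(do.call(paste, c(ref[keys], sep = "\r"))))
}

# Toy table with invented values: 2 conditions x 2 timepoints
tab <- expand.grid(
  variable = "CD4", subject = "S1",
  condition = c("unstim", "stim"),
  timepoint = c("baseline", "week1"),
  stringsAsFactors = FALSE
)
tab$value <- c(10, 20, 5, 15)

# Both variables in the template: one "unstim" row per group
both <- count_baselines(
  tab, list(condition = "unstim", timepoint = "baseline"),
  step = "condition", subject.col = "subject"
)

# Only condition in the template: the two timepoints fall into the
# same group, so that group holds two "unstim" rows
single <- count_baselines(
  tab, list(condition = "unstim"),
  step = "condition", subject.col = "subject"
)
```

Any count greater than one signals that the template does not pin down a single reference value for that step, which is the situation in which the function is documented to fail.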

ParkerICI/kumquat documentation built on Dec. 18, 2021, 6:40 a.m.
