normalize: Reversibly normalize a vector of values

View source: R/normalize.R

Description

Normalizes or standardizes a vector of values, recording the details of the transformation in attributes of the result so that the transformation can be reversed later on.

Usage

normalize(x, norm = "zscore", mu = base::mean(x, na.rm = TRUE),
  sigma = stats::sd(x, na.rm = TRUE), lambda = 0, gamma = 0,
  mu2 = base::mean(base::abs(x), na.rm = TRUE), lower = 0,
  upper = 100, clip = TRUE)

Arguments

x

A vector of values to be transformed.

norm

The type of normalization to apply. See Details for more information. Type names can be abbreviated.

mu

For zscore normalization, the mean. Defaults to the sample mean.

sigma

For zscore normalization, the standard deviation. Defaults to the sample standard deviation.

lambda

For the Box-Cox transform normalization, the exponent of the power transform. Defaults to zero.

gamma

An offset added to the data for the Box-Cox transform normalization. Must be greater than the negative of the minimum value of x, so that the offset data are positive; otherwise NA values will result. Defaults to zero.

mu2

For scale normalization, the mean absolute value that the data are divided by. Defaults to the sample mean of the absolute value of x.

lower

For range normalization, the lower bound of the input range. Defaults to 0.

upper

For range normalization, the upper bound of the input range. Defaults to 100.

clip

If TRUE (default), negative values will be clipped to zero for scale normalization, and values outside the input range will be clipped to the input range for range normalization.
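The clipping behaviour described for clip can be sketched in base R as follows. The helper names clip_range and clip_nonneg are hypothetical and exist only for this illustration; this is an assumption about the arithmetic, not the package's own code.

```r
# Hypothetical helpers illustrating the clipping described for `clip`.
# They are not part of climod.
clip_range <- function(x, lower = 0, upper = 100) {
  # Pull values outside [lower, upper] back to the nearest bound.
  pmin(pmax(x, lower), upper)
}

clip_nonneg <- function(x) {
  # Negative values become zero, as described for scale normalization.
  pmax(x, 0)
}

clip_range(c(-5, 20, 130))    # values below 0 or above 100 are pulled to the bounds
clip_nonneg(c(-0.3, 0, 2.5))  # the negative value is replaced by zero
```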

Details

Six types of normalization are supported, chosen by the value of the parameter norm:

"zscore": subtract the mean (mu) and divide by the standard deviation (sigma). With the default arguments, the result will have mean zero and unit standard deviation. Also known as normalizing residuals, calculating the z-score, centering and scaling, or normalizing the 2nd central moment. If the sample mean and sample variance are used (as in the default arguments), this is technically 'studentizing' the data.

"boxcox": apply the Box-Cox power transformation, which raises the data (offset by gamma) to the power lambda and rescales the result so that the transformation is continuous as lambda goes to 0, where it becomes the log. This transformation can help stabilize the variance of highly skewed data.

"log": take the (natural) log of the data. Equivalent to boxcox with lambda=0 and gamma=0.

"scale": divide by the mean of the absolute value of the data (mu2). This normalization is appropriate for data where zero is a natural limit to the range of values.

"range": scale the data from the provided range [lower,upper] to the range [0,1].

"identity": passes the data through unchanged. This option is sometimes useful for testing and development.

Each normalization by default uses parameters based on the sample statistics of the input, but these parameters can be overridden.
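As a rough guide to the arithmetic behind the types listed above, the following base-R sketch writes out four of them. The function names are hypothetical, the Box-Cox expression is the standard two-parameter form, and clipping is omitted; the package's actual implementation may differ in detail.

```r
# Illustrative base-R versions of the transforms listed above.
# These are assumptions about the arithmetic, not the climod source.
zscore_sketch <- function(x, mu = mean(x, na.rm = TRUE),
                          sigma = sd(x, na.rm = TRUE)) {
  (x - mu) / sigma
}

boxcox_sketch <- function(x, lambda = 0, gamma = 0) {
  shifted <- x + gamma
  # Standard two-parameter Box-Cox; the lambda == 0 case is the log.
  if (lambda == 0) log(shifted) else (shifted^lambda - 1) / lambda
}

scale_sketch <- function(x, mu2 = mean(abs(x), na.rm = TRUE)) {
  x / mu2
}

range_sketch <- function(x, lower = 0, upper = 100) {
  (x - lower) / (upper - lower)
}

x <- c(1, 2, 4, 8)
z <- zscore_sketch(x)
c(mean(z), sd(z))                       # approximately 0 and 1
boxcox_sketch(x, lambda = 1e-6) - log(x)  # small: the power form approaches the log
range_sketch(c(0, 25, 100))             # maps [0, 100] onto [0, 1]
```

Overriding the defaults (for example, passing a mu and sigma estimated from a reference dataset) applies the same formulas with fixed parameters instead of sample statistics.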

Appropriate normalizations for various climate variables used in impacts analysis are as follows:

Value

Both normalize and denormalize return a vector of values. normalize adds attributes to the vector that record the type of normalization and the parameters used; denormalize removes these attributes from its output.
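The idea of recording parameters in attributes so that the transformation can be undone can be sketched in base R for the z-score case. The functions zs_forward and zs_inverse and the attribute names "norm", "mu", and "sigma" are assumptions made for this illustration; the attribute names that normalize actually uses can be inspected with str() on its result.

```r
# Hypothetical sketch: store z-score parameters as attributes, then use
# them to invert the transformation. Attribute names are assumptions.
zs_forward <- function(x) {
  mu <- mean(x, na.rm = TRUE)
  sigma <- sd(x, na.rm = TRUE)
  y <- (x - mu) / sigma
  attr(y, "norm") <- "zscore"
  attr(y, "mu") <- mu
  attr(y, "sigma") <- sigma
  y
}

zs_inverse <- function(y) {
  x <- y * attr(y, "sigma") + attr(y, "mu")
  attributes(x) <- NULL  # drop the bookkeeping, leaving a plain vector
  x
}

x <- c(2.5, 3.1, 4.8, 0.7)
y <- zs_forward(x)
str(y)                       # numeric vector carrying the three attributes
all.equal(zs_inverse(y), x)  # the round trip recovers the input
```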

See Also

denormalize

Examples

# z-score normalization
x <- rnorm(10000, mean=3, sd=2)
y <- normalize(x)
mplot(lapply(list(x,y),density), type="l")
str(y)

# Box-Cox normalization
x <- rgamma(10000, shape=3, rate=4)
y <- normalize(x, "boxcox")
mplot(lapply(list(x,y),density), type="l")
str(y)

sethmcg/climod documentation built on Nov. 19, 2021, 11:12 p.m.