RobustNormalization: RobustNormalization
In DataVisualizations: Visualizations of High-Dimensional Data

RobustNormalization

Description

RobustNormalization as described in [Milligan/Cooper, 1988].

Usage

RobustNormalization(Data, Centered = FALSE, Capped = FALSE,
                    na.rm = TRUE, WithBackTransformation = FALSE,
                    pmin = 0.01, pmax = 0.99)

Arguments

`Data`	[1:n,1:d] data matrix of n cases and d features
`Centered`	If TRUE, the data is centered around zero by the median.
`Capped`	If TRUE, outliers are capped: values above 1 are set to 1 and values below -1 are set to -1.
`na.rm`	If TRUE, infinite values are disregarded.
`WithBackTransformation`	Set to TRUE if a back-transformation is required, e.g., when forecasting with neural networks.
`pmin`	Defines outliers on the lower end of the scale (default 0.01).
`pmax`	Defines outliers on the higher end of the scale (default 0.99).

Details

Normalizes features either to the range -1 to 1 (Centered=TRUE) or to the range 0 to 1 (Centered=FALSE) without changing the distribution of a feature itself. For a more precise description, see [Thrun, 2018, p. 17].
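The idea of rescaling without changing a feature's distribution can be illustrated with a minimal base R sketch. It is not the package's implementation: the helper `robust_scale_sketch` and its arguments `p_low` and `p_high` are hypothetical, and the sketch assumes that `pmin` and `pmax` act as quantile probabilities whose quantiles are mapped to 0 and 1. Only the uncentered case is shown.

```r
# Hypothetical helper, not part of DataVisualizations.
# Assumption: the p_low and p_high quantiles of x are mapped to 0 and 1.
robust_scale_sketch <- function(x, capped = FALSE, p_low = 0.01, p_high = 0.99) {
  q <- stats::quantile(x, probs = c(p_low, p_high), na.rm = TRUE, names = FALSE)
  denom <- q[2] - q[1]
  if (denom == 0) denom <- 1          # guard against constant features
  y <- (x - q[1]) / denom             # affine map: order of the values is kept
  if (capped) {
    y <- pmin(pmax(y, -1), 1)         # cap at -1 and 1, as documented for Capped
  }
  y
}

set.seed(1)
x <- rnorm(1000, mean = 2, sd = 100)
y <- robust_scale_sketch(x, capped = TRUE)
range(y)
```

Because the mapping is affine and increasing, the ordering of the cases and the shape of the histogram are preserved; only the axis changes.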

"[The] scaling of the inputs determines the effective scaling of the weights in the last layer of a MLP with BP neural network, it can have a large effect on the quality of the final solution. At the outset it is best to standardize all inputs to have mean zero and standard deviation 1 [(or at least the range under 1)]. This ensures all inputs are treated equally in the regularization process, and allows one to choose a meaningful range for the random starting weights." [Friedman et al., 2012]

Value

If WithBackTransformation=FALSE: TransformedData[1:n,1:d], i.e., the normalized data matrix of n cases and d features.

If WithBackTransformation=TRUE: a list with

`TransformedData`	[1:n,1:d] normalized data matrix of n cases and d features
`MinX`	[1:d] numerical vector used for manual back-transformation of each feature
`MaxX`	[1:d] numerical vector used for manual back-transformation of each feature
`Denom`	[1:d] numerical vector used for manual back-transformation of each feature
`Center`	[1:d] numerical vector used for manual back-transformation of each feature
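How per-feature vectors of length d are applied in a manual back-transformation can be sketched in base R as follows. This is an illustration under stated assumptions, not the package's code: the forward step is assumed to be (x - MinX) / Denom per feature with no centering, and the names `min_x`, `max_x`, `denom`, `transformed` and `m_back` are hypothetical. In practice, RobustNorm_BackTrafo with the returned list elements should be preferred.

```r
# Toy data matrix with n = 3 cases and d = 2 features.
m <- cbind(c(1, 2, 3), c(2, 6, 4))

# Assumed forward step (not taken from the package source): per-feature
# lower quantile and quantile range, playing the roles of MinX and Denom.
min_x <- apply(m, 2, stats::quantile, probs = 0.01, names = FALSE)
max_x <- apply(m, 2, stats::quantile, probs = 0.99, names = FALSE)
denom <- max_x - min_x

# Subtract and divide column-wise.
transformed <- sweep(sweep(m, 2, min_x, "-"), 2, denom, "/")

# Manual back-transformation: undo the division, then undo the shift.
m_back <- sweep(sweep(transformed, 2, denom, "*"), 2, min_x, "+")

# Largest absolute reconstruction error
max(abs(m - m_back))
```

The `sweep` calls make explicit that each feature (column) has its own entry in the parameter vectors.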

Author(s)

Michael Thrun

References

[Milligan/Cooper, 1988] Milligan, G. W., & Cooper, M. C.: A study of standardization of variables in cluster analysis, Journal of Classification, Vol. 5(2), pp. 181-204. 1988.

[Friedman et al., 2012] Friedman, J., Hastie, T., & Tibshirani, R.: The Elements of Statistical Learning, (Second ed. Vol. 1), Springer Series in Statistics, New York, NY, USA, 2012.

[Thrun, 2018] Thrun, M. C.: Projection Based Clustering through Self-Organization and Swarm Intelligence, doctoral dissertation 2017, Springer, Heidelberg, ISBN: 978-3-658-20539-3, doi:10.1007/978-3-658-20540-9, 2018.

Examples

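library(DataVisualizations)

# Scale a noisy vector and cap the outliers
Scaled = RobustNormalization(rnorm(1000, 2, 100), Capped = TRUE)
hist(Scaled)

# Normalize a small matrix and keep the parameters for back-transformation
m = cbind(c(1, 2, 3), c(2, 6, 4))
List = RobustNormalization(m, Centered = FALSE, Capped = FALSE,
                           na.rm = FALSE, WithBackTransformation = TRUE)
TransformedData = List$TransformedData

# Undo the normalization
mback = RobustNorm_BackTrafo(TransformedData, List$MinX, List$Denom, List$Center)

# Should be (close to) zero if the back-transformation recovers m
sum(m - mback)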

DataVisualizations documentation built on April 3, 2025, 8:24 p.m.