RobustNormalization | R Documentation |
RobustNormalization as described in [Milligan/Cooper, 1988].
RobustNormalization(Data,Centered=FALSE,Capped=FALSE,
na.rm=TRUE,WithBackTransformation=FALSE,
pmin=0.01,pmax=0.99)
Data |
[1:n,1:d] data matrix of n cases and d features |
Centered |
If TRUE, data are centered around zero by the median |
Capped |
If TRUE, outliers are capped: values above 1 or below -1 are set to 1 or -1, respectively. |
na.rm |
If TRUE, infinite values are disregarded |
WithBackTransformation |
If a back-transformation is required, e.g. when forecasting with neural networks, set this parameter to TRUE. |
pmin |
defines outliers on the lower end of the scale |
pmax |
defines outliers on the higher end of the scale |
Normalizes features either to the range -1 to 1 (Centered=TRUE) or 0 to 1 (Centered=FALSE) without changing the distribution of the feature itself. For a more precise description, see [Thrun, 2018, p. 17].
"[The] scaling of the inputs determines the effective scaling of the weights in the last layer of a MLP with BP neural network, it can have a large effect on the quality of the final solution. At the outset it is best to standardize all inputs to have mean zero and standard deviation 1 [(or at least the range under 1)]. This ensures all inputs are treated equally in the regularization process, and allows to choose a meaningful range for the random starting weights." [Friedman et al., 2012]
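The normalization described above can be illustrated with a minimal base R sketch. It is not the package implementation: the helper `robust_scale_sketch` and its arguments `lower_p`, `upper_p` and `capped` are hypothetical, and it assumes that `pmin` and `pmax` act as quantile probabilities whose quantiles replace the minimum and maximum of an ordinary min-max scaling. Capping to the interval 0 to 1 in the uncentered case is likewise an assumption.

```r
# Hypothetical helper, not part of the package: robust min-max scaling of one
# numeric vector. Assumes the lower and upper limits are empirical quantiles.
robust_scale_sketch <- function(x, lower_p = 0.01, upper_p = 0.99, capped = FALSE) {
  # Estimate the limits from finite values only (NA and Inf are ignored here)
  q <- quantile(x[is.finite(x)], probs = c(lower_p, upper_p), names = FALSE)
  denom <- q[2] - q[1]
  if (denom == 0) denom <- 1  # guard against constant features
  scaled <- (x - q[1]) / denom
  if (capped) {
    # Assumed capping for the uncentered case: clip to the interval [0, 1]
    scaled <- pmin(pmax(scaled, 0), 1)
  }
  scaled
}

# Usage: the extreme value 100 no longer dominates the scale
x <- c(1, 2, 3, 4, 100)
round(robust_scale_sketch(x, capped = TRUE), 3)
```

Because the limits come from quantiles rather than from the extreme values, a single outlier has little influence on how the remaining values are spread over the target range.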
If WithBackTransformation=FALSE: TransformedData[1:n,1:d], i.e., the normalized data matrix of n cases and d features.
If WithBackTransformation=TRUE: a list with
TransformedData |
[1:n,1:d] normalized data matrix of n cases and d features |
MinX |
[1:d] numerical vector used for manual back-transformation of each feature |
MaxX |
[1:d] numerical vector used for manual back-transformation of each feature |
Denom |
[1:d] numerical vector used for manual back-transformation of each feature |
Center |
[1:d] numerical vector used for manual back-transformation of each feature |
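How per-feature vectors of length d can be applied column by column may be sketched as follows. This is an illustration under stated assumptions, not the package's formula: it assumes the forward step is `(x - MinX) / Denom` followed by subtracting `Center`, and it uses a plain column-wise minimum and range in place of the limits the function would compute. The names `min_x`, `denom`, `center`, `transformed` and `back` are hypothetical. In practice `RobustNorm_BackTrafo` should be preferred.

```r
# Hypothetical forward step on a small matrix (plain column-wise min-max,
# standing in for the limits computed by the package function)
m <- cbind(c(1, 2, 3), c(2, 6, 4))
min_x  <- apply(m, 2, min)
denom  <- apply(m, 2, max) - min_x
center <- c(0, 0)  # assumed zero because this sketch applies no centering
transformed <- sweep(sweep(m, 2, min_x, "-"), 2, denom, "/")

# Assumed inverse: undo centering, rescale by denom, shift by min_x
back <- sweep(transformed, 2, center, "+")
back <- sweep(back, 2, denom, "*")
back <- sweep(back, 2, min_x, "+")

# Largest absolute deviation between original and reconstructed data
max(abs(m - back))
```

Checking the largest absolute deviation avoids positive and negative errors cancelling each other, which a plain sum of differences would allow.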
Michael Thrun
[Milligan/Cooper, 1988] Milligan, G. W., & Cooper, M. C.: A study of standardization of variables in cluster analysis, Journal of Classification, Vol. 5(2), pp. 181-204. 1988.
[Friedman et al., 2012] Friedman, J., Hastie, T., & Tibshirani, R.: The Elements of Statistical Learning, (Second ed. Vol. 1), Springer Series in Statistics, New York, NY, USA, 2012.
[Thrun, 2018] Thrun, M. C.: Projection Based Clustering through Self-Organization and Swarm Intelligence, doctoral dissertation 2017, Springer, Heidelberg, ISBN: 978-3-658-20539-3, doi: 10.1007/978-3-658-20540-9, 2018.
RobustNorm_BackTrafo
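# Normalize 1000 normally distributed values and cap the outliers
Scaled = RobustNormalization(rnorm(1000, 2, 100), Capped = TRUE)
hist(Scaled)

# Round trip: normalize a small matrix, then back-transform it
m = cbind(c(1, 2, 3), c(2, 6, 4))
List = RobustNormalization(m, Centered = FALSE, Capped = FALSE,
                           na.rm = FALSE, WithBackTransformation = TRUE)
TransformedData = List$TransformedData
mback = RobustNorm_BackTrafo(TransformedData, List$MinX, List$Denom, List$Center)
# Expected to be close to 0 if the round trip succeeds
sum(m - mback)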