winsorize | R Documentation |
Removes extreme outliers using a winsorization transformation
winsorize(
x,
min.value = NULL,
max.value = NULL,
p = c(0.05, 0.95),
na.rm = FALSE
)
x |
A numeric vector |
min.value |
A fixed lower bound; all values lower than this are replaced by this value. The default is the 5th percentile of x. |
max.value |
A fixed upper bound; all values higher than this are replaced by this value. The default is the 95th percentile of x. |
p |
A numeric vector of length 2 giving the lower and upper probabilities used in the quantile function. |
na.rm |
(FALSE/TRUE) Should NAs be omitted? |
Winsorization is the transformation of a distribution by limiting extreme values to reduce the effect of spurious outliers. This is done by shrinking outlying observations to the border of the main part of the distribution.
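The transformation described above can be sketched in base R as a quantile-based clamp. This is a minimal illustration, not the package's implementation: the helper name `winsorize_sketch` is hypothetical, and it assumes the bounds come from `stats::quantile` with its default interpolation type.

```r
# Hypothetical helper for illustration only; not the package's own code.
winsorize_sketch <- function(x, p = c(0.05, 0.95)) {
  # Lower and upper bounds from the empirical quantiles, ignoring NAs
  q <- stats::quantile(x, probs = p, na.rm = TRUE, names = FALSE)
  # Clamp: values below q[1] become q[1], values above q[2] become q[2];
  # NA entries stay NA, so the length of x is preserved
  pmin(pmax(x, q[1]), q[2])
}

set.seed(1234)
x <- rnorm(100)
x[1] <- x[1] * 10          # inflate one observation
y <- winsorize_sketch(x)

range(x)                   # range before clamping
range(y)                   # range after clamping, bounded by the quantiles
```

Observations strictly between the two quantiles are left unchanged; only the tails are pulled in to the bounds.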
A transformed vector the same length as x, unless na.rm is TRUE, in which case its length is the length of x minus the number of NAs.
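The effect of NAs on the length of the result can be illustrated with a small base R sketch. It assumes, as a simplification, that the bounds are computed from the non-missing values and that omitting NAs simply drops them before clamping; the variable names are illustrative only.

```r
x <- c(-50, 1, 2, NA, 3, 4, 60)

# Bounds from the non-missing values (assumed behaviour, for illustration)
bounds <- stats::quantile(x, probs = c(0.05, 0.95), na.rm = TRUE, names = FALSE)

# NAs kept in place: result has the same length as x
kept <- pmin(pmax(x, bounds[1]), bounds[2])
length(kept)      # 7

# NAs dropped first: result is shorter by the number of NAs
dropped <- pmin(pmax(x[!is.na(x)], bounds[1]), bounds[2])
length(dropped)   # 6
```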
Jeffrey S. Evans <jeffrey_evans@tnc.org>
Dixon, W.J. (1960) Simplified Estimation from Censored Normal Samples. Annals of Mathematical Statistics. 31(2):385-391
set.seed(1234)
x <- rnorm(100)
x[1] <- x[1] * 10
winsorize(x)
plot(x, type="l", main="Winsorization transformation")
lines(winsorize(x), col="red", lwd=2)
legend("bottomright", legend=c("Original distribution","With outliers removed"),
lty=c(1,1), col=c("black","red"))
# Behavior with NA value(s)
x[4] <- NA
winsorize(x) # returns x with original NA's
winsorize(x, na.rm=TRUE) # removes NA's