winsorize: Winsorize a numeric vector

Description

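Winsorize a numeric vector: values below the lower cutpoint or above the upper cutpoint are replaced, by default with the cutpoints themselves.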

Usage

winsorize(
  x,
  probs = NULL,
  cutpoints = NULL,
  replace = c(cutpoints[1], cutpoints[2]),
  verbose = TRUE
)

winsorise(
  x,
  probs = NULL,
  cutpoints = NULL,
  replace = c(cutpoints[1], cutpoints[2]),
  verbose = TRUE
)

Arguments

x

A numeric vector of values

probs

A vector of two probabilities (lower, upper) that can be used instead of cutpoints; the corresponding quantiles serve as the cutpoints. Quantiles are computed as the inverse of the empirical distribution function (type = 1).

cutpoints

Cutpoints below and above which values are treated as outliers. Default is (median - five times the interquartile range, median + five times the interquartile range). Compared with the bottom and top percentiles, this takes into account the whole distribution of the vector.

replace

Values that replace the outliers. Defaults to cutpoints. A frequent alternative is NA.

verbose

Boolean. Should the percentage of replaced values be printed?
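
The default cutpoint rule described under cutpoints can be sketched with base R alone. This is an illustration, not the package's own code: the names `q`, `iqr`, `default_cutpoints`, and `clamped` are hypothetical, and using type = 1 quantiles for the median and interquartile range is an assumption (the documentation only states type = 1 for probs).

```r
# Sketch only: reproduces the documented default rule with base R.
# Assumption: the median and IQR are taken from type = 1 quantiles.
v <- c(1:4, 99)

# 25th, 50th and 75th percentiles of v
q <- unname(quantile(v, c(0.25, 0.50, 0.75), type = 1, na.rm = TRUE))
iqr <- q[3] - q[1]

# (median - 5 * IQR, median + 5 * IQR)
default_cutpoints <- c(q[2] - 5 * iqr, q[2] + 5 * iqr)

# Clamp values to the cutpoints, mimicking replace = cutpoints
clamped <- pmin(pmax(v, default_cutpoints[1]), default_cutpoints[2])

print(default_cutpoints)  # -7 13
print(clamped)            # 1 2 3 4 13
```

For this vector the median is 3 and the interquartile range is 2, so only 99 lies outside (-7, 13) and is pulled back to 13.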

Examples

v <- c(1:4, 99)
winsorize(v)
winsorize(v, replace = NA)
winsorize(v, probs = c(0.01, 0.99))
winsorize(v, cutpoints = c(1, 50))

Example output

0.00 % observations replaced at the bottom
20.00 % observations replaced at the top
[1]  1  2  3  4 13
0.00 % observations replaced at the bottom
20.00 % observations replaced at the top
[1]  1  2  3  4 NA
0.00 % observations replaced at the bottom
0.00 % observations replaced at the top
[1]  1  2  3  4 99
0.00 % observations replaced at the bottom
20.00 % observations replaced at the top
[1]  1  2  3  4 50
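
Why probs = c(0.01, 0.99) leaves this vector unchanged can be checked with base R. The sketch below is illustrative only: `probs_cutpoints`, `below`, and `above` are hypothetical names, and the strict comparisons are an assumption inferred from the output above, where cutpoints = c(1, 50) reports no replacement at the bottom even though 1 is in the vector.

```r
# Sketch only: type = 1 quantiles of a five-element vector.
v <- c(1:4, 99)

# With five observations, the 1% and 99% type = 1 quantiles
# are the minimum and the maximum of v.
probs_cutpoints <- unname(quantile(v, c(0.01, 0.99), type = 1))
print(probs_cutpoints)  # 1 99

# Assumption: only values strictly outside the cutpoints count as outliers.
below <- v < probs_cutpoints[1]
above <- v > probs_cutpoints[2]
print(sum(below))  # 0
print(sum(above))  # 0
```

On short vectors, percentile-based cutpoints can coincide with the extremes, so nothing is replaced; the default median and interquartile range rule still flags 99 here.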

statar documentation built on Jan. 13, 2021, 9:33 p.m.