pt_winsor: Winsorization of outliers

View source: R/pt_winsor.R

pt_winsor    R Documentation

Winsorization of outliers

Description

This function helps users manage outliers at the indicator level using winsorization. See Details for the available options.

Usage

pt_winsor(pt, id_var, z_val = 3.99, option = 3, w_var = NULL, delta = NULL)

Arguments

pt

A data frame consisting of the 'id_var' and relevant purchase task variables.

id_var

The name of the unique identifier (ID) as identified in the data frame.

z_val

The absolute z-score value used to define outliers. The default is 3.99, so values whose absolute z-score exceeds 3.99 are treated as outliers and winsorized.

option

The winsorization option, one of c(1,2,3). The default outlier management technique is option 3.

w_var

The name of the variable to winsorize.

delta

The constant used in winsorization options 2 and 3. The delta must be defined by the user, as the optimal value will vary depending on the indicator. For elasticity, a small value of 0.001 is recommended.
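As a rough illustration of how 'z_val' identifies outliers, the sketch below flags values by absolute z-score in base R. It assumes z-scores are computed from the sample mean and sample standard deviation; the internal computation of pt_winsor() is not shown on this page and may differ. The variable names 'intensity', 'z' and 'is_outlier' are illustrative only.

```r
# Intensity values taken from the Examples section of this page
intensity <- c(10,12,15,0,99,11,7,6,12,7,8,10,5,6,10,0,3,
               7,5,0,2,3,5,6,10,15,12,7,0,9,0,6,7,8,4,5)

z_val <- 3.99  # same default threshold as pt_winsor()

# Assumption: standardize with the sample mean and sample standard deviation
z <- (intensity - mean(intensity)) / sd(intensity)

# Flag values whose absolute z-score exceeds the threshold
is_outlier <- abs(z) > z_val

# Values that would be treated as outliers under this assumption
intensity[is_outlier]
```

With these data, only the value 99 exceeds the threshold under this assumption.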

Details

There are 3 winsorization options:

i) Option 1 replaces all outliers with the theoretical value associated with the defined z-score threshold;

ii) Option 2 replaces all outliers with the observed minimum/maximum non-outlying value minus (or plus) a small constant (delta);

iii) Option 3 replaces each outlier with the observed minimum/maximum non-outlying value minus (or plus) a small constant (delta), so that the order of the outlying values is retained.
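The three options can be sketched in base R as follows. This is a minimal illustration under stated assumptions, not the implementation used by pt_winsor(): the helper name 'winsor_sketch' is hypothetical, z-scores are assumed to use the sample mean and standard deviation, and option 3 is assumed to add successive multiples of delta so that larger outliers stay larger.

```r
# Hypothetical helper illustrating the three options; not part of PThelper
winsor_sketch <- function(x, z_val = 3.99, option = 3, delta = 1) {
  m <- mean(x)
  s <- sd(x)
  z <- (x - m) / s
  hi <- z > z_val          # high outliers
  lo <- z < -z_val         # low outliers
  inliers <- x[!hi & !lo]  # non-outlying values
  out <- x
  if (option == 1) {
    # Theoretical value at the z-score threshold
    out[hi] <- m + z_val * s
    out[lo] <- m - z_val * s
  } else if (option == 2) {
    # Same replacement for every outlier on a given side
    out[hi] <- max(inliers) + delta
    out[lo] <- min(inliers) - delta
  } else if (option == 3) {
    # Assumption: step by multiples of delta according to rank among outliers
    out[hi] <- max(inliers) + delta * match(x[hi], sort(unique(x[hi])))
    out[lo] <- min(inliers) - delta *
      match(x[lo], sort(unique(x[lo]), decreasing = TRUE))
  } else {
    stop("option must be one of 1, 2, 3")
  }
  out
}

# Toy data with two high outliers at a lowered threshold of 2
x <- c(1:20, 60, 80)
opt2 <- winsor_sketch(x, z_val = 2, option = 2, delta = 1)
opt3 <- winsor_sketch(x, z_val = 2, option = 3, delta = 1)
tail(opt2, 2)  # both outliers receive the same value
tail(opt3, 2)  # outliers receive distinct values, keeping their order
```

In this toy example the largest non-outlying value is 20, so option 2 maps both outliers to 21, while the option 3 sketch maps 60 to 21 and 80 to 22.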

Value

A list of two data frames: "data", which contains the 'id_var' and the variables in 'pt', including the winsorized value(s); and "wins_table", which lists the value(s) that were winsorized for each 'id_var', with the values before and after winsorization.

Examples

### --- Example Data
pt <- data.frame("ID" = c(1:36),
"Intensity" = c(10,12,15,0,99,11,7,6,12,7,8,10,5,6,10,0,3,
                7,5,0,2,3,5,6,10,15,12,7,0,9,0,6,7,8,4,5))

### --- Function Example
pt2 <- pt_winsor(pt, id_var = "ID", w_var = "Intensity", delta = 1)


PBCAR/PThelper documentation built on May 13, 2024, 3:45 p.m.