boxCoxEncode: boxCoxEncode


View source: R/encodings.R

Description

Creates a Box-Cox encoding object for the selected variables in a dataset.

Usage

boxCoxEncode(dt, vars, lambda = NULL, minNormalize = 0.05,
  capNegPredOutliers = 0)

Arguments

dt

Dataset to create the encoding object on.

vars

Variables you want to include in the encoding.

lambda

You can pass custom lambdas if you want. Not recommended.

minNormalize

Box-Cox is a _risky_ transformation because it fails on any number <= 0. You can reduce this _riskiness_ by adding 'space' between your expected range and 0. minNormalize is the number of standard deviations you want between 0 and the minimum (lower bound) of the distribution. Set it higher to make it less likely that the variable ever sees a future number <= 0. A low value is usually safe if you have lots of data. If you have done some engineering yourself to ensure this never happens, it can be set to 0. All variables are automatically re-scaled. Can be either a scalar or a named list of values, with names equal to vars.
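To make the role of minNormalize concrete, here is a minimal sketch in base R. It is not the package's implementation: shiftAboveZero and boxCox are hypothetical helpers, and the shift rule (move the minimum to minNormalize standard deviations above 0) is an assumption based on the description above.

```r
# Hypothetical helper, not part of helperFuncs (assumed shift rule):
# move x so that its minimum sits `minNormalize` standard deviations above 0.
shiftAboveZero <- function(x, minNormalize = 0.05) {
  x - min(x) + minNormalize * sd(x)
}

# Standard Box-Cox transform; only defined for strictly positive input.
boxCox <- function(y, lambda) {
  if (any(y <= 0)) stop("Box-Cox requires strictly positive values")
  if (lambda == 0) log(y) else (y^lambda - 1) / lambda
}

x <- c(-3, 0, 2, 5, 10)
shifted <- shiftAboveZero(x, minNormalize = 0.05)

# The smallest shifted value is 0.05 standard deviations above 0,
# so the transform is defined for every training value.
encoded <- boxCox(shifted, lambda = 0.5)
```

With minNormalize = 0 the smallest training value would land exactly on 0 under this rule, which is why a small positive margin is the default.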

capNegPredOutliers

If you weren't careful enough with minNormalize and some negative values come through, do you want to cap them before they reach boxCox, or throw an error? Throwing an error is safer, so this is set to 0 by default, which results in applyEncoding trying to perform boxCox on 0, which will fail. If not 0, this is the number of standard deviations above 0 at which values are (min) capped. It should be lower than minNormalize; otherwise the results will no longer be in the same order, since negative values would become greater than the minimum sample this encoding was created on.
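The capping behaviour can be sketched as follows. This is an illustration under stated assumptions, not the package's code: capLowOutliers is a hypothetical helper, the inputs are assumed to be already shifted onto the training scale, and the error for a cap of 0 stands in for the failure that applyEncoding would hit.

```r
# Hypothetical helper, not part of helperFuncs.
# y: new values already shifted onto the training scale (assumption).
# trainSd: standard deviation of the training sample.
capLowOutliers <- function(y, trainSd, capNegPredOutliers = 0) {
  if (capNegPredOutliers == 0) {
    # Default: no capping, so non-positive values are an error.
    if (any(y <= 0)) stop("non-positive value reached Box-Cox")
    return(y)
  }
  # Floor every value at capNegPredOutliers standard deviations above 0.
  pmax(y, capNegPredOutliers * trainSd)
}

trainSd <- 2
minNormalize <- 0.05              # training minimum sits at 0.05 * 2 = 0.1
newShifted <- c(-0.4, 0.1, 3)     # first value fell below the training range

capped <- capLowOutliers(newShifted, trainSd, capNegPredOutliers = 0.01)
# The outlier is capped at 0.01 * 2 = 0.02, still below the training minimum
# of 0.1, so the ordering relative to the training data is preserved.
```

Choosing a cap above minNormalize (for example 0.1 here) would push the outlier to 0.2, above the training minimum, which is the ordering problem described above.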

Value

BoxCox Encoded Object


AnotherSamWilson/helperFuncs documentation built on Oct. 1, 2019, 8:51 p.m.