prune_fence: Identify the Rows Outside the Fence.

View source: R/prune_02-B_fence.R

prune_fence R Documentation

Identify the Rows Outside the Fence.

Description

Identify the rows outside the fence.

Usage

prune_fence(data, cols, is_offset = TRUE, info = FALSE)

Arguments

data

Dataframe.

cols

Names of the columns to apply the function to. Must be a character vector with a minimum length of one.

is_offset

If TRUE (default), the offset is offset = min(min(x), min(y)); otherwise no offset is applied, that is, offset = 0.

info

If FALSE (default), a logical vector is returned. If TRUE, a list containing the logical vector as well as the slopes is returned. This is usually used to help plot the fences.
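The offset described for is_offset can be illustrated with a minimal base R sketch. The data frame and its column names x and y are hypothetical, and subtracting the offset from both columns is an assumption about how the shift is applied, not a statement of the package's internals.

```r
# Hypothetical data; the column names x and y are illustrative only.
df <- data.frame(x = c(105, 110, 120, 150),
                 y = c(101, 118, 125, 160))

# Offset as documented for is_offset = TRUE.
offset <- min(min(df$x), min(df$y))

# Assumption: the offset is subtracted from both columns so that
# the smallest shifted value sits at zero.
x_shifted <- df$x - offset
y_shifted <- df$y - offset
```

With is_offset = FALSE the documented offset is 0, so under the same assumption the values would be left unchanged.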

Details

Compute the fence to eliminate values that are clearly out-of-bound. Normally all values should be non-negative. If they are not, an offset is used. Also, sometimes the data is nowhere near zero, and in such cases the fence is not useful; the offset solves that problem as well. The algorithm will generate an error when -Inf or Inf values are in the input. NA values are treated as being out-of-bound.
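The handling of special values described above can be sketched in base R. The helper check_fence_input is hypothetical and is not part of the package; it only mirrors the documented behaviour (an error on infinite values, NA flagged as out-of-bound) under the assumption that two numeric columns are compared.

```r
# Hypothetical helper, not part of the package: mirrors the documented
# treatment of -Inf, Inf and NA for two numeric vectors.
check_fence_input <- function(x, y) {
  # Infinite values are rejected with an error.
  if (any(is.infinite(x)) || any(is.infinite(y))) {
    stop("-Inf or Inf values are not allowed in the input")
  }
  # Rows with an NA in either vector are flagged as out-of-bound.
  is.na(x) | is.na(y)
}

na_flags <- check_fence_input(c(1, NA, 3), c(2, 2, NA))
```

In this sketch na_flags marks the second and third rows, since each contains an NA in one of the two vectors.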

Value

If info = FALSE (default), a logical vector where TRUE indicates values outside the fence. If info = TRUE, a list with the logical vector called is_outside, the list of slopes called slopes, the list of offsets called offsets, and the data frame of fence data called fences.

info

If the argument info is set to TRUE, a list with the following elements is returned.

is_outside

Logical vector: TRUE when the row is outside the fence limits, FALSE otherwise.

slopes

The slopes for the small and big fences.

offsets

The offset used to scale the x and y values.

fences

Dataframe with x = original x values; y = original y values; small = y value of the small fence on the scaled coordinates; big = y value of the big fence on the scaled coordinates; small_inv = y value of the small fence on the original scale (useful for plotting); big_inv = y value of the big fence on the original scale (useful for plotting).
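A short sketch of how the is_outside element might be used to drop flagged rows. The data frame and the flags below are made up for illustration; res is a stand-in for the list returned with info = TRUE and was not computed by prune_fence.

```r
# Hypothetical data frame; the column names are illustrative only.
df <- data.frame(x = c(10, 12, 11, 500),
                 y = c(20, 25, 22, 3))

# Stand-in for the documented return value with info = TRUE.
# These flags are invented for the example, not produced by prune_fence().
res <- list(is_outside = c(FALSE, FALSE, FALSE, TRUE))

# Keep only the rows that are inside the fence.
df_clean <- df[!res$is_outside, , drop = FALSE]

# Count the rows that were flagged as outside.
n_outside <- sum(res$is_outside)
```

With info = FALSE the function is documented to return the logical vector directly, so the same indexing would apply to that vector without the `$is_outside` step.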

Source

Statistical Data Cleaning, Mark van der Loo and Edwin de Jonge, 2018. Section 7.5.2, p. 176-179.


FrankLef/eflMuncher documentation built on July 9, 2022, 11:43 a.m.