eda_untie: Resolve ties in a numeric vector

eda_untie R Documentation

Resolve ties in a numeric vector

Description

Adjusts tied values in a numeric vector by adding or subtracting a small fraction of the range.

Usage

eda_untie(dat, x = NULL, fac = NULL, f = 0.01, rand = TRUE, ...)

Arguments

dat

A data frame or a numeric vector.

x

Numeric column in dat whose ties are to be resolved. Ignored if dat is a numeric vector.

fac

Column of categorical values. Ignored if dat is a numeric vector.

f

A numeric value specifying the fraction of the range of x to use for perturbing tied values. Must be between 0 and 1.

rand

A logical value. If FALSE, all adjustments are of fixed size based on f. If TRUE, the adjustments are randomized within the range specified by f.

...

Not used.

Details

The function identifies tied values in the input vector x and perturbs them slightly to break the ties. The size of the perturbation is set by f * diff(range(x)), that is, the fraction f of the range of x.

If rand = TRUE, the adjustment for each tied value is drawn from a uniform distribution with lower and upper bounds [0, f * diff(range(x))].

If rand = FALSE, the adjustment is deterministic and equal to +/- f * diff(range(x)). Alternating signs (-1 and 1) are used to distribute the adjustments symmetrically. The deterministic approach may not eliminate all ties in a single pass. For example, if four values are tied, the output will split them into two pairs of tied values. Repeating the process on the output as many times as needed will eliminate the remaining ties.
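The scheme described above can be sketched in base R. This is only an illustration under stated assumptions, not the tukeyedar source: untie_sketch is a hypothetical name, the random case is assumed to draw one uniform value per tied element, and the deterministic case is assumed to alternate signs in order of appearance within each group of tied values.

```r
# Hypothetical helper (not the package implementation): perturb tied values
# in a numeric vector by at most f * diff(range(x)).
untie_sketch <- function(x, f = 0.01, rand = TRUE) {
  stopifnot(is.numeric(x), f >= 0, f <= 1)
  step <- f * diff(range(x))  # size of the adjustment in data units
  # Flag every element whose value occurs more than once
  tied <- duplicated(x) | duplicated(x, fromLast = TRUE)
  if (!any(tied)) return(x)
  if (rand) {
    # Assumption: one uniform draw from [0, step] per tied element
    adj <- runif(sum(tied), min = 0, max = step)
  } else {
    # Assumption: signs alternate -1, +1, -1, ... within each tied group
    pos <- ave(seq_along(x), x, FUN = seq_along)[tied]
    adj <- ifelse(pos %% 2 == 1, -1, 1) * step
  }
  x[tied] <- x[tied] + adj
  x
}

# Repeat the deterministic adjustment until no ties remain
x <- c(1, 2, 2, 2, 3, 4, 4, 5)
pass <- 0
while (anyDuplicated(x) > 0 && pass < 10) {  # cap on passes as a safeguard
  x <- untie_sketch(x, rand = FALSE)
  pass <- pass + 1
}
pass  # number of deterministic passes that were needed
```

Under these assumptions the first pass leaves two of the three original 2s tied, so a second pass is required, which matches the behavior described for rand = FALSE. The cap on the number of passes is a precaution, since a fixed-size shift could in principle land on a value already present in the data.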

Value

Returns the input numeric data with ties resolved. If dat is a vector, a modified vector is returned. If dat is a data frame, a modified vector corresponding to the column specified by x is returned.

Examples

set.seed(42)
x <- c(1, 2, 2, 2, 3, 4, 4, 5)
# Randomized adjustments
x1 <- eda_untie(x, f = 0.01, rand = TRUE)
x1

# Deterministic adjustments. Given that there are three elements sharing the
# same value (a value of 2 in this example), the data will need to be
# processed twice.
x2 <- eda_untie(x, f = 0.01, rand = FALSE)
x2
x3 <- eda_untie(x2, f = 0.01, rand = FALSE)
x3

# Random adjustments. Add up to +/- 0.5 inches to singer height values
set.seed(17)
singer <- lattice::singer
# Fraction of the height range that corresponds to 0.5 inches
# (named frac to avoid masking base::factor)
frac <- 0.5 / diff(range(singer$height))
eda_jitter(singer, height, voice.part)
singer$notie <- eda_untie(singer, height, voice.part, f = frac)
eda_jitter(singer, notie, voice.part)
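
The conversion used in the singer example applies to any variable: divide the desired maximum shift, in data units, by the range of the values. A minimal base R sketch follows; frac_for_shift is a hypothetical helper name and the heights are made-up values, not the lattice data.

```r
# Hypothetical helper: fraction of the range equal to an absolute shift
frac_for_shift <- function(x, shift) {
  rng <- diff(range(x, na.rm = TRUE))
  # Requiring 0 <= shift <= rng keeps the result within [0, 1]
  stopifnot(rng > 0, shift >= 0, shift <= rng)
  shift / rng
}

heights <- c(60, 62, 62, 65, 70)  # made-up values, in inches
f_half <- frac_for_shift(heights, 0.5)
f_half  # 0.5 / 10 = 0.05
```

The result can then be supplied as the f argument so that adjustments do not exceed the chosen amount.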

mgimond/tukeyedar documentation built on Feb. 1, 2025, 4:02 a.m.