internalSubAsRest: Substitute values in a dataframe proportionally to all other...

View source: R/nbc4va_data.R

Description

Substitute a target value proportionally to the distribution of the rest of the values in a column. The conditions that govern the substitution are listed in the pseudocode under Details.

Usage

internalSubAsRest(dataset, x, cols = 1:ncol(dataset), ignore = c(NA, NaN),
  removal = FALSE)

Arguments

dataset

A dataframe containing one or more values equal to x.

x

The target value in dataset to replace, column by column, with values drawn from the rest of that column.

cols

A numeric vector of column indices to consider for substitution.

ignore

A vector of values to exclude from the rest of the values when substituting.

removal

Set to TRUE to remove column(s) that consist only of x values.

Details

Pseudocode of algorithm:

  SET dataset = table of values with columns and rows
  SET x = target value for substitution

  IF x in dataset:
    FOR EACH column y in a dataset:
      SET xv = all x values in y
      SET rest = all values not equal to x in y
      IF xv == values in y:
        REMOVE y in dataset
      IF number of unique values of rest == 1:
        MODIFY xv = rest
      IF number of xv values < number of unique values of rest:
        SET xn = number of xv values
        MODIFY xv = random sample of rest with size xn
      ELSE:
        SET xn = number of xv values
        SET p = proportions of rest
        SET xnp = xn * p
        IF xnp has decimals:
          MODIFY xnp = round xnp such that sum(xnp) == xn via largest remainder method
        MODIFY xv = rest values with distribution of xnp
  RETURN dataset
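
The per-column step of the pseudocode can be sketched in R as follows. This is a minimal illustration, not the package's implementation: the function names round_largest_remainder and sub_as_rest_column are hypothetical, the removal of all-x columns is not covered, and details such as tie-breaking or how the package shuffles the substituted values are assumptions.

```r
# Hypothetical helper (not part of nbc4va): round non-negative values so that
# they sum to `total`, using the largest remainder method.
round_largest_remainder <- function(v, total) {
  base <- floor(v)
  leftover <- total - sum(base)
  if (leftover > 0) {
    # Give one extra unit to the entries with the largest fractional parts
    idx <- order(v - base, decreasing = TRUE)[seq_len(leftover)]
    base[idx] <- base[idx] + 1
  }
  base
}

# Hypothetical sketch of the substitution for a single column y.
sub_as_rest_column <- function(y, x, ignore = c(NA, NaN)) {
  is_x <- !is.na(y) & y == x                # positions holding the target value
  rest <- y[!is_x & !(y %in% ignore)]       # remaining values, minus ignored ones
  xn <- sum(is_x)
  if (xn == 0 || length(rest) == 0) {
    return(y)                               # nothing to replace, or nothing to draw from
  }
  vals <- sort(unique(rest))
  if (xn < length(vals)) {
    # Fewer targets than distinct rest values: draw a random sample of rest
    fill <- sample(rest, xn)
  } else {
    # Otherwise allocate xn replacements in proportion to the rest values
    p <- tabulate(match(rest, vals), nbins = length(vals)) / length(rest)
    xnp <- round_largest_remainder(xn * p, xn)
    fill <- rep(vals, times = xnp)
    fill <- fill[sample.int(length(fill))]  # shuffle so placement is random
  }
  y[is_x] <- fill
  y
}

# Usage on a toy dataframe (made-up data), substituting 99 in every column
set.seed(1)
toy <- data.frame(a = c(1, 1, 1, 0, 99, 99, 99, 99),
                  b = c(0, 0, 99, 1, 1, 0, 0, 0))
toy[] <- lapply(toy, sub_as_rest_column, x = 99)
```

In column a the rest values are three 1s and one 0, so the four 99s are replaced by three 1s and one 0, and the original 3:1 proportion is preserved. In column b there is a single 99 and two distinct rest values, so the replacement is a single random draw from the rest.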

Value

out A dataframe or a list, depending on the value of removal.

See Also

Other data functions: internalRoundFixedSum

Examples

library(nbc4va)
data(nbc4vaDataRaw)
unclean <- nbc4vaDataRaw
clean <- nbc4va:::internalSubAsRest(unclean, 99)

nbc4va documentation built on May 2, 2019, 1:42 p.m.