split_per_group: Split one variable into multiple vectors by group variable
In eliobartos/misc: Miscellaneous functions


View source: R/split_per_group.R

Takes a variable and a grouping variable and, for each group, returns a vector of the values belonging to that group. This is useful because some statistical test functions take two or more vectors as input and do not accept a formula. Along the way, the function can also drop a percentage or a fixed number of observations from each group (after grouping), or cap the largest values via `set_top_pct` (useful for skewed distributions).
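To illustrate the core idea, here is a minimal sketch in base R. It uses `split()` rather than `split_per_group()` itself, so it is not the package's implementation, and the toy data frame and the choice of `t.test()` are assumptions made only for the example.

```r
# Toy data (assumed for illustration): one numeric variable and one grouping variable
df <- data.frame(
  x     = c(1, 3.2, 2.4, 3.1, 5, 6, 4.7),
  group = c(1, 1, 1, 1, 2, 2, 2)
)

# Base R split(): a named list with one vector of x per level of group
groups <- split(df$x, df$group)

# The vectors can then be passed directly to a test that takes vectors as input
res <- t.test(groups[["1"]], groups[["2"]])
```

The list names are the group values converted to character, so each vector is looked up with `groups[["1"]]` and `groups[["2"]]`.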

split_per_group(
  data,
  variable,
  split_variable = "ab_test_group",
  drop_pct = 0,
  drop_n = 0,
  set_top_pct = 0
)

`data`	Data frame containing variable of interest and grouping variable.
`variable`	(character) Variable we wish to split into multiple vectors.
`split_variable`	(character) Variable to split by, grouping variable.
`drop_pct`	Percentage of observations to drop from the top (largest values) of each vector (after grouping).
`drop_n`	Number of observations to drop from the top (largest values) of each vector (after grouping).
`set_top_pct`	Sets the top `set_top_pct` share of values to the (1 - `set_top_pct`) quantile of `variable` (before grouping). Used to reduce large positive outliers in skewed distributions.
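The capping and dropping steps described by these arguments could be sketched in base R roughly as follows. This is an approximation under stated assumptions, not the package's code: the helper `drop_top()` is hypothetical, the default `quantile()` type is assumed, and whether the real function preserves the original order of values is not known from this page.

```r
x     <- c(1, 3.2, 2.4, 3.1, 5, 6, 4.7)
group <- c(1, 1, 1, 1, 2, 2, 2)

# Cap before grouping: values above the (1 - set_top_pct) quantile are set to it
set_top_pct <- 0.15
cap <- quantile(x, probs = 1 - set_top_pct, names = FALSE)
x_capped <- pmin(x, cap)

# Drop after grouping: remove the n largest values from a vector
# (hypothetical helper, not part of the package)
drop_top <- function(v, n) {
  if (n <= 0) return(v)  # guard: v[-integer(0)] would return an empty vector
  v[-order(v, decreasing = TRUE)[seq_len(min(n, length(v)))]]
}

drop_n <- 1
trimmed <- lapply(split(x, group), drop_top, n = drop_n)
```

With this data the cap evaluates to 5.1, so only the value 6 is changed by the capping step.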

Named list of vectors belonging to each group.

Elio Bartoš

library(tibble)

df <- tribble(
  ~x, ~group,
   1, 1,
 3.2, 1,
 2.4, 1,
 3.1, 1,
   5, 2,
   6, 2,
 4.7, 2
)

split_per_group(df, "x", "group")
split_per_group(df, "x", "group", drop_n = 1) # Drops the max value from each vector
split_per_group(df, "x", "group", set_top_pct = 0.15) # Value of 6 is reduced to 5.1