split_per_group: Split one variable into multiple vectors by group variable


View source: R/split_per_group.R

Description

Takes a variable and a grouping variable and, for each group, returns a vector of the values belonging to that group. This is useful because some statistical test functions take two or more vectors as input and do not accept a formula. Along the way, the function can also drop a percentage or a fixed number of the top observations from each group (after grouping), or cap the top values at an upper quantile of the variable (useful for skewed distributions).

Usage

split_per_group(
  data,
  variable,
  split_variable = "ab_test_group",
  drop_pct = 0,
  drop_n = 0,
  set_top_pct = 0
)

Arguments

data

Data frame containing the variable of interest and the grouping variable.

variable

(character) Variable we wish to split into multiple vectors.

split_variable

(character) Grouping variable to split by.

drop_pct

Percentage of users to drop from the top of each vector, i.e. the largest values (after grouping).

drop_n

Number of users to drop from the top of each vector, i.e. the largest values (after grouping).

set_top_pct

Sets the top set_top_pct share of values to the (1 - set_top_pct) quantile of the variable (before grouping). Used to reduce large positive outliers in skewed distributions.
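The capping and dropping behaviour described above can be illustrated with a minimal base R sketch. The helpers cap_top() and drop_top_n() are hypothetical names introduced here for illustration; they are not part of the package, and the actual implementation may differ (for example in the quantile type used or in how ties are handled).

```r
# Hypothetical helper (not part of the package): cap every value above the
# (1 - top_pct) quantile at that quantile. Uses R's default quantile type.
cap_top <- function(x, top_pct) {
  if (top_pct <= 0) return(x)
  cutoff <- stats::quantile(x, probs = 1 - top_pct, names = FALSE)
  pmin(x, cutoff)
}

# Hypothetical helper (not part of the package): remove the n largest
# values from a vector while keeping the remaining values in their order.
drop_top_n <- function(x, n) {
  k <- min(n, length(x))
  if (k <= 0) return(x)
  x[-order(x, decreasing = TRUE)[seq_len(k)]]
}

x <- c(1, 3.2, 2.4, 3.1, 5, 6, 4.7)

capped <- cap_top(x, 0.15)            # the value 6 is pulled down to about 5.1
trimmed <- drop_top_n(c(5, 6, 4.7), 1) # the maximum, 6, is removed
```

Under these assumptions, capping keeps the number of observations unchanged, whereas dropping shortens each vector.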

Value

Named list of vectors belonging to each group.
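As a rough picture of the return shape, base R's split() produces a comparable named list when no values are dropped or capped. This is only a sketch: that the list names are taken from the values of the grouping variable is an assumption, not something confirmed by the package source.

```r
# Stand-in data with the same layout as the example data frame.
df <- data.frame(
  x     = c(1, 3.2, 2.4, 3.1, 5, 6, 4.7),
  group = c(1, 1, 1, 1, 2, 2, 2)
)

# Base R equivalent of a plain split: one numeric vector per group,
# named after the group values (converted to character).
groups <- split(df[["x"]], df[["group"]])

names(groups)
groups[["2"]]
```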

Author(s)

Elio Bartoš

Examples

library(tibble)

df <- tribble(
  ~x, ~group,
   1, 1,
 3.2, 1,
 2.4, 1,
 3.1, 1,
   5, 2,
   6, 2,
 4.7, 2
)

split_per_group(df, "x", "group")
split_per_group(df, "x", "group", drop_n = 1) # Drops max value from each vector
split_per_group(df, "x", "group", set_top_pct = 0.15) # Value of 6 is reduced to 5.1
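The per-group vectors are meant to be passed to test functions that take vectors rather than a formula. The sketch below shows that pattern with stats::t.test(); to stay runnable without the package installed, it builds the list with base R's split() as a stand-in for split_per_group(), which is an assumption about the returned shape.

```r
# Stand-in data matching the example above.
df <- data.frame(
  x     = c(1, 3.2, 2.4, 3.1, 5, 6, 4.7),
  group = c(1, 1, 1, 1, 2, 2, 2)
)

# Stand-in for split_per_group(df, "x", "group"): a list of one vector per group.
groups <- split(df[["x"]], df[["group"]])

# Pass the two vectors directly to a test that accepts vector input.
result <- stats::t.test(groups[[1]], groups[[2]])
result$p.value
```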

eliobartos/misc documentation built on Oct. 8, 2021, 1:10 a.m.