View source: R/split_per_group.R
Selects a variable and a grouping variable and, for each group, returns a vector of the values belonging to that group. It is used to split the selected variable by group, because some statistical test functions take two or more vectors as input and do not accept a formula. In the process, the function can also drop a percentage or a fixed number of observations from each group (after grouping), or set the top x values to the maximum of all other values (useful for skewed distributions).
split_per_group(
  data,
  variable,
  split_variable = "ab_test_group",
  drop_pct = 0,
  drop_n = 0,
  set_top_pct = 0
)
data | Data frame containing the variable of interest and the grouping variable. |
variable | (character) Variable to split into multiple vectors. |
split_variable | (character) Grouping variable to split by. |
drop_pct | Percentage of users to drop from the top of each vector (after grouping). |
drop_n | Number of users to drop from the top of each vector (after grouping). |
set_top_pct | Sets the top set_top_pct values to the (1 - set_top_pct) quantile of the variable (before grouping). Used to reduce large positive outliers in skewed distributions. |
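The order of operations described above (cap before grouping, drop after grouping) can be sketched in base R. This is a hypothetical re-implementation for illustration only, not the package source: the name split_per_group_sketch is made up, drop_pct is left out, and it assumes that "top" means the largest values and that the remaining values keep their original order.

```r
# Hypothetical sketch of the documented behaviour; NOT the package source.
split_per_group_sketch <- function(data, variable, split_variable,
                                   drop_n = 0, set_top_pct = 0) {
  values <- data[[variable]]

  if (set_top_pct > 0) {
    # Cap values at the (1 - set_top_pct) quantile, computed before grouping.
    cap <- quantile(values, 1 - set_top_pct, names = FALSE)
    values <- pmin(values, cap)
  }

  # One vector per group, named by the group value.
  groups <- split(values, data[[split_variable]])

  if (drop_n > 0) {
    # Assumption: drop the drop_n largest observations within each group.
    groups <- lapply(groups, function(v) {
      v[-head(order(v, decreasing = TRUE), drop_n)]
    })
  }
  groups
}

df <- data.frame(
  x     = c(1, 3.2, 2.4, 3.1, 5, 6, 4.7),
  group = c(1, 1, 1, 1, 2, 2, 2)
)

capped  <- split_per_group_sketch(df, "x", "group", set_top_pct = 0.15)
trimmed <- split_per_group_sketch(df, "x", "group", drop_n = 1)
```

With this data the 0.85 quantile of x is about 5.1, so the value 6 in group 2 is capped to roughly 5.1, and drop_n = 1 removes the largest value from each group.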
Named list of vectors belonging to each group.
Elio Bartoš
library(tibble)
df <- tribble(
  ~x, ~group,
  1, 1,
  3.2, 1,
  2.4, 1,
  3.1, 1,
  5, 2,
  6, 2,
  4.7, 2
)
split_per_group(df, "x", "group")
split_per_group(df, "x", "group", drop_n = 1) # Drops max value from each vector
split_per_group(df, "x", "group", set_top_pct = 0.15) # Value of 6 is reduced to 5.1
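A named list of per-group vectors is convenient for test functions that are called with separate vectors. The sketch below is a usage illustration under assumptions: base R split() stands in for split_per_group() so the snippet runs without the package, and t.test() was chosen only to show the two-vector calling convention.

```r
df <- data.frame(
  x     = c(1, 3.2, 2.4, 3.1, 5, 6, 4.7),
  group = c(1, 1, 1, 1, 2, 2, 2)
)

# Stand-in for split_per_group(df, "x", "group"): a named list with one
# numeric vector per group.
groups <- split(df$x, df$group)

# Pass the two group vectors directly instead of a formula.
result <- t.test(groups[["1"]], groups[["2"]])
result$p.value
```

With more than two groups, the same list can be indexed by name to run pairwise comparisons.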