View source: R/multilevel_break.R
The function iteratively searches for the groups (see argument 'grouping_var') that must, at a minimum, be excluded from the data for the statistic of interest to reach a conservative 'goal value'. It relies on a genetic algorithm, which efficiently explores the (usually vast) space of possible subsets. The result can uncover impactful subsamples and inform discussions of robustness. Required arguments are the data frame ('data'), a function that computes the statistic of interest ('statistic_computation'; see the examples), the name of the column holding the grouping variable ('grouping_var'), and the goal value ('goal_value').
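To make the idea of "excluding groups and recomputing the statistic" concrete, here is a minimal sketch that drops each group in turn and recomputes a p-value. It is a brute-force, one-group-at-a-time illustration and not the package's genetic algorithm; the toy data and the names `toy`, `st` and `drop_one` are assumptions made for this example only.

```r
# Toy data (hypothetical): three groups, one of which drives the
# difference between v1 and v2.
set.seed(1)
toy <- data.frame(
  groups = rep(c("a", "b", "c"), each = 20),
  v1 = c(rnorm(20), rnorm(20), rnorm(20, mean = 3)),
  v2 = rnorm(60),
  stringsAsFactors = FALSE
)

# Statistic of interest: p-value of a two-sample t-test on v1 and v2.
st <- function(data) t.test(data$v1, data$v2)$p.value

# Recompute the statistic after dropping each single group in turn.
# This only covers subsets of size one; it is not a genetic algorithm.
drop_one <- sapply(unique(toy$groups), function(g) {
  st(toy[toy$groups != g, ])
})
drop_one
```

With many groups, and with combinations of several groups allowed, the number of candidate subsets grows too quickly to enumerate this way, which is the situation the genetic algorithm is meant to handle.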
data | A data.frame containing the observations as rows.
goal_value | The conservative value (e.g., a small effect size) that the search targets.
statistic_computation | A function that takes 'data' as input and returns the statistic of interest.
max_exclusions | Maximum number of groups to be excluded.
pop | Number of 'individuals' in each generation of the genetic algorithm.
max_generations | Maximum number of generations that the algorithm generates.
exclusion_cost | Used to calibrate the fitness function.
prop_included_cases | Initial proportion of included groups (e.g. .90).
chance_of_mutation | Chance that a gene mutates (e.g. .02); higher values are slower but more accurate.
stop_search | Number of generations without change after which the 'converged' result is returned.
random_seed | Seed for replicability.
Vector of zeros and ones with length equal to number of observations in data. Ones indicate exclusion.
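As a minimal sketch of how such a vector could be applied, the snippet below subsets a data frame with a hand-written exclusion vector. It assumes, as described above, one entry per observation with 1 meaning excluded; the vector `excluded` and the data frame `toy` are hypothetical stand-ins, not output from `multilevel_break`.

```r
# Toy data (hypothetical) with a grouping column.
toy <- data.frame(
  groups = c(0, 0, 1, 1, 2, 2),
  v1 = c(1.2, 0.8, 3.1, 2.9, 1.0, 1.1)
)

# Hand-written exclusion vector, one entry per observation (1 = excluded).
# In practice this would be the value returned by multilevel_break().
excluded <- c(0, 0, 1, 1, 0, 0)

# Observations that remain after exclusion.
kept <- toy[excluded == 0, ]

# Groups that were excluded.
excluded_groups <- unique(toy$groups[excluded == 1])
```

Here `kept` holds the four observations from groups 0 and 2, and `excluded_groups` identifies group 1 as the excluded one.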
set.seed(42)
groups = c(0,0,0,0,0,1,1,1,1,1,1,2,2,2,2,2,3,3,3,3,3,4,4,4,4,4,0,0,0,0,0,0,0,0,1,1,1,1,1,1,2,2,2,2,2,2,3,3,3,3,3,3,3,4,4,4,4,4,4,4,5,5,5,5,5,5,5,5,6,6,6,6,6,6,6,6,6,6,6,6,6,7,7,7,7,7,7,7,7,7,7,7,8,8,8,8,8,8,8,8,8,8,8,8,8)
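# v1 is shifted upward by 0.7, so it differs from v2 on average
v1 = rnorm(length(groups)) + 0.7
v2 = rnorm(length(groups))
df = as.data.frame(cbind(groups, v1, v2))
# Statistic of interest: p-value of a t-test comparing v1 and v2
st = function(data){
  t.test(data$v1, data$v2)$p.value
}
# Search for groups to exclude so that the p-value reaches the goal value of 0.05
multilevel_break(df, statistic_computation = st, goal_value = 0.05, grouping_var = 'groups', max_exclusions = 8)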