create_groups: Create sets having minimal differences

View source: R/createGroups.r

create_groups R Documentation

Create sets having minimal differences

Description

Assigns a set of items to N groups. Differences between groups are minimized with regard to specified criteria (e.g., minimizing differences in mean test scores between school classes).

Usage

create_groups(dat, criteria_scale = NULL, criteria_nominal = NULL, sets_n,
  repetitions = 100, exact = FALSE, tolerance_nominal = rep(Inf, 3),
  equalize = list(mean), write_file = FALSE, talk = TRUE)

Arguments

dat

A data.frame containing the set that is to be regrouped. All assignment criteria must be columns of this data.frame.

criteria_scale

A character vector naming all continuous, numerical columns in 'dat' that are to be considered as criteria in set assignment. Can be left out if only nominal variables are to be equalized between groups.

criteria_nominal

A character vector naming all nominal columns in 'dat' that are to be considered as criteria in set assignment. Can be left out if only continuous variables are to be equalized between groups. At most two nominal criteria are supported.

sets_n

The number of equal groups to create.

repetitions

The number of random assignments to test. Only use if 'exact' is 'FALSE'.

exact

Should _all_ possible assignments be tested? This yields the "optimal" solution given the assignment criteria and functions. Defaults to 'FALSE', in which case a random subset of all possible assignments is tested.
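The random-subset strategy described for 'exact = FALSE' can be sketched in base R. This is an illustrative sketch only, not the package's implementation: the helper name 'random_search' and the choice of the range of group means as the quantity to minimize are assumptions made for the example.

```r
# Illustrative sketch only: NOT the minDiff implementation.
# Repeatedly draw a random assignment to equal-sized groups and keep the
# one with the smallest spread in group means.
random_search <- function(scores, sets_n, repetitions = 100) {
  stopifnot(length(scores) %% sets_n == 0)
  base_labels <- rep(seq_len(sets_n), length.out = length(scores))
  best_labels <- NULL
  best_spread <- Inf
  for (i in seq_len(repetitions)) {
    labels <- sample(base_labels)                  # one random assignment
    group_means <- tapply(scores, labels, mean)    # criterion per group
    spread <- max(group_means) - min(group_means)  # difference to minimize
    if (spread < best_spread) {
      best_spread <- spread
      best_labels <- labels
    }
  }
  list(labels = best_labels, spread = best_spread)
}

set.seed(1)
scores <- rnorm(20, mean = 100, sd = 15)
result <- random_search(scores, sets_n = 2, repetitions = 200)
table(result$labels)  # group sizes
result$spread         # remaining difference in group means
```

More repetitions can only keep or improve the best assignment found so far, which is why a larger 'repetitions' value trades run time for fit.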

tolerance_nominal

Use only if argument 'criteria_nominal' is also passed. Indicates the tolerated frequency deviations for nominal variables (and their combinations) between newly created sets. Must be a one-value vector if one nominal variable is passed, and a three-value vector if two nominal variables are passed: the first value is the tolerance for the first variable, the second value is the tolerance for the second variable, and the third value is the tolerance for the combinations of both variables. It is possible that no assignment satisfies the tolerance requirements; if unsure how to use this parameter, start with large tolerance values and inspect the resulting group assignments.
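To make the idea of a tolerated frequency deviation concrete, the following base R sketch checks one nominal variable against one tolerance value. How the package itself measures the deviation is not stated here, so the definition used below (largest minus smallest count of each category across groups) and the helper name 'within_tolerance' are assumptions.

```r
# Illustrative sketch only: the deviation measure is an assumption,
# not taken from the minDiff source.
within_tolerance <- function(nominal, labels, tolerance) {
  counts <- table(nominal, labels)  # rows: categories, columns: groups
  deviation <- apply(counts, 1, function(row) max(row) - min(row))
  all(deviation <= tolerance)
}

sex <- c("f", "f", "f", "m", "m", "m", "f", "m")

unbalanced <- c(1, 1, 1, 2, 2, 2, 1, 2)  # all "f" end up in group 1
balanced   <- c(1, 2, 1, 2, 1, 2, 2, 1)  # two "f" and two "m" per group

within_tolerance(sex, unbalanced, tolerance = 1)
within_tolerance(sex, unbalanced, tolerance = Inf)
within_tolerance(sex, balanced, tolerance = 0)
```

An infinite tolerance accepts every assignment, which matches the advice to start with large values and tighten them gradually.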

equalize

A list of functions that determine what is equalized between sets: differences in the functions' return values are minimized. The default is 'mean', in which case the mean values of the criteria specified via 'criteria_scale' are matched between sets. Any function that returns a single value can be used.
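A small base R sketch may help show what passing several functions means in practice. The helper 'discrepancy' is hypothetical, and summing the per-function ranges is an assumption about how such values could be combined, not a description of the package internals.

```r
# Illustrative sketch only: the aggregation rule is an assumption.
# For each function, compute its value per group and take the range
# across groups; the ranges are then summed.
discrepancy <- function(x, labels, equalize = list(mean)) {
  per_function <- vapply(equalize, function(f) {
    values <- tapply(x, labels, f)  # one value per group
    max(values) - min(values)
  }, numeric(1))
  sum(per_function)
}

x <- c(1, 2, 3, 4, 5, 6, 7, 8)
labels <- c(1, 2, 2, 1, 1, 2, 2, 1)

discrepancy(x, labels, list(mean))      # both group means are 4.5
discrepancy(x, labels, list(mean, sd))  # standard deviations still differ
```

Here the groups match perfectly on the mean but not on the standard deviation, which is the situation where adding 'sd' to the list changes which assignment counts as best.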

write_file

Boolean. Should newly found, better-fitting sets be written to a file automatically? This is helpful if your simulation runs unexpectedly long and you need to kill it; in this case the best match found so far is not lost. Defaults to 'FALSE'.

talk

Boolean. If 'TRUE', the function will print its progress.

Value

A data.frame containing all columns of argument 'dat' plus an additional column 'newSet'. This column contains the assignment of items to groups that produced the best fit among the tested assignments.
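A call along the following lines shows how the returned 'newSet' column might be inspected. The data and the column names 'id' and 'score' are made up for illustration, and the call is guarded so that the snippet does nothing if the minDiff package is not installed.

```r
# Hypothetical example data; 'id' and 'score' are illustrative names.
set.seed(123)
dat <- data.frame(
  id = 1:20,
  score = rnorm(20, mean = 100, sd = 15)
)

# Only run the call if the package is available.
if (requireNamespace("minDiff", quietly = TRUE)) {
  grouped <- minDiff::create_groups(
    dat,
    criteria_scale = "score",
    sets_n = 2,
    repetitions = 50,
    talk = FALSE
  )
  print(table(grouped$newSet))                        # group sizes
  print(tapply(grouped$score, grouped$newSet, mean))  # group means
}
```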

Author(s)

Martin Papenberg martin.papenberg@hhu.de


m-Py/minDiff documentation built on July 4, 2022, 3:58 p.m.