View source: R/balancedataset.R
balancedataset | R Documentation |
balance a data set according to some grouping factor(s)
balancedataset(xdata, whattobalance, n = NULL)
xdata |
a |
whattobalance |
a character vector with column names. The corresponding columns typically are either factor or character. |
n |
integer, the number of cases to select for each factor level (or combination of factor levels) |
the function requires either one or two factors to be balanced over
if n is larger than the largest possible number, there will be a warning to that effect and n will be reset to the largest possible number, i.e. the function behaves as if n = NULL (the default)
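The "largest possible number" is not defined further on this page. A minimal base-R sketch of one plausible reading, assuming it equals the size of the smallest group (or smallest combination of groups), is given here. It does not call balancedataset(); the helper cap_n() and all object names are hypothetical and only illustrate the capping rule described above.

```r
# Hypothetical illustration in base R; this is not the package's code.
set.seed(123)
xdata <- data.frame(ID = sample(letters[1:4], 30, replace = TRUE),
                    context = sample(LETTERS[21:22], 30, replace = TRUE))

# Cases per level (one factor) and per combination of levels (two factors)
counts_one <- table(xdata$context)
counts_two <- table(xdata$ID, xdata$context)

# Assumption: n cannot exceed the size of the smallest group
max_n_one <- min(counts_one)
max_n_two <- min(counts_two)

# Hypothetical helper mirroring the documented rule:
# NULL means "use the maximum"; a too-large n triggers a warning and is reset
cap_n <- function(n, max_n) {
  if (is.null(n)) return(max_n)
  if (n > max_n) {
    warning("n is larger than the largest possible number; using ", max_n)
    return(max_n)
  }
  n
}

n_default <- cap_n(NULL, max_n_one)
n_capped <- suppressWarnings(cap_n(1000, max_n_one))
```

Under this assumption, n_default and n_capped are identical, which matches the statement that an overly large n behaves like n = NULL.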
a list with 5 items
$seldata
the subset of xdata with the selected rows
$unseldata
the subset of xdata with the rows that were not selected
$sel
the row indices of the selected rows
$unsel
the row indices of the rows not selected
$factors
the balance factor(s) (= whattobalance)
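To make the shape of the returned list concrete, here is a rough base-R sketch that builds a list with the same five items. It is not the package's implementation: balance_sketch() is a hypothetical stand-in, and it assumes that rows are drawn at random within each group and that combinations of levels that never occur in the data are ignored.

```r
# Hypothetical base-R sketch; NOT the implementation of balancedataset().
balance_sketch <- function(xdata, whattobalance, n = NULL) {
  # One group per level (or combination of levels) that occurs in the data
  groups <- interaction(xdata[, whattobalance, drop = FALSE], drop = TRUE)
  rows_by_group <- split(seq_len(nrow(xdata)), groups)

  # Assumption: the largest possible n is the size of the smallest group
  max_n <- min(lengths(rows_by_group))
  if (is.null(n)) {
    n <- max_n
  } else if (n > max_n) {
    warning("n is larger than the smallest group; using n = ", max_n)
    n <- max_n
  }

  # Draw n row indices per group; indexing by position avoids the special
  # behaviour of sample() when it is given a single number
  sel <- sort(unlist(lapply(rows_by_group, function(idx) {
    idx[sample.int(length(idx), n)]
  }), use.names = FALSE))
  unsel <- setdiff(seq_len(nrow(xdata)), sel)

  list(seldata = xdata[sel, , drop = FALSE],
       unseldata = xdata[unsel, , drop = FALSE],
       sel = sel,
       unsel = unsel,
       factors = whattobalance)
}

# Usage: 5 rows of "a" and 3 rows of "b", so each ID keeps three rows
set.seed(1)
toy <- data.frame(ID = rep(c("a", "b"), times = c(5, 3)), var1 = rnorm(8))
res <- balance_sketch(toy, "ID")
table(res$seldata$ID)
```

In this sketch, $sel and $unsel never overlap and together cover every row of the input, so $seldata and $unseldata split the data into two disjoint parts.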
Christof Neumann
set.seed(123)
xdata <- data.frame(ID = sample(letters[1:4], 30, replace = TRUE),
                    context = sample(LETTERS[21:22], 30, replace = TRUE),
                    var1 = rnorm(30), var2 = rnorm(30))
table(xdata$ID, xdata$context)
balancedataset(xdata = xdata, whattobalance = c("context"), n = 2)$seldata
balancedataset(xdata = xdata, whattobalance = c("context"), n = 3)$seldata
balancedataset(xdata = xdata, whattobalance = c("context"))$seldata
# with two factors
balancedataset(xdata = xdata, whattobalance = c("context", "ID"), n = 1)$seldata
# one combination occurs only once (d/V): row 27 has to be in each data set
table(xdata$ID, xdata$context)
x <- sapply(1:50, function(X){
  row.names(balancedataset(xdata = xdata, whattobalance = c("context", "ID"))$seldata)
})
table(x)