Split | R Documentation |
Generates a list of length(tau) non-overlapping sets of observation IDs.
Split(data, family = NULL, tau = c(0.5, 0.25, 0.25))
data | vector or matrix of data. In regression, this should be the outcome data. |
family | type of regression model. This argument is defined as in |
tau | vector of the proportion of observations in each of the sets. |
With categorical outcomes (i.e. when the family argument is set to "binomial", "multinomial" or "cox"), the split is done such that the proportion of observations from each of the categories in each of the sets is representative of that of the full sample.
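The stratified behaviour described above can be sketched in base R. This is a minimal illustration, not the package's implementation: the helper name stratified_split is hypothetical, and the rounding of set sizes within each category is an assumption.

```r
# Hypothetical helper (not part of the package): split observation IDs
# so that each set keeps roughly the category proportions of y.
stratified_split <- function(y, tau = c(0.5, 0.25, 0.25)) {
  stopifnot(abs(sum(tau) - 1) < 1e-8)
  sets <- rep(list(integer(0)), length(tau))
  for (level in unique(y)) {
    ids <- which(y == level)
    # Shuffle the IDs within this category
    ids <- ids[sample.int(length(ids))]
    # Cumulative cut points give each set a contiguous chunk (assumed rounding)
    cuts <- round(cumsum(tau) * length(ids))
    starts <- c(0, cuts[-length(cuts)]) + 1
    for (k in seq_along(tau)) {
      if (cuts[k] >= starts[k]) {
        sets[[k]] <- c(sets[[k]], ids[starts[k]:cuts[k]])
      }
    }
  }
  lapply(sets, sort)
}

set.seed(1)
y <- rep(c(0, 1), times = c(80, 20))
ids <- stratified_split(y)
# Category counts within each set
lapply(ids, function(x) table(y[x]))
```

Splitting within each category before pooling is what keeps the category proportions close to those of the full sample. A single shuffle of all IDs would only match them on average.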
A list of length length(tau) with sets of non-overlapping observation IDs.
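A quick way to check that a list of this shape is non-overlapping and follows the requested proportions is sketched here in base R. The list ids is a hypothetical stand-in for the returned value, not actual output of Split().

```r
# Hypothetical list of ID sets with the same shape as the returned value
ids <- list(1:50, 51:75, 76:100)
tau <- c(0.5, 0.25, 0.25)

all_ids <- unlist(ids)

# TRUE when no ID appears in more than one set
no_overlap <- anyDuplicated(all_ids) == 0

# Observed proportion of observations in each set
observed <- sapply(ids, length) / length(all_ids)

no_overlap
all.equal(observed, tau)
```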
# Splitting into 3 sets
simul <- SimulateRegression()
ids <- Split(data = simul$ydata)
lapply(ids, length)
# Balanced splits with respect to a binary variable
simul <- SimulateRegression(family = "binomial")
ids <- Split(data = simul$ydata, family = "binomial")
lapply(ids, FUN = function(x) {
  table(simul$ydata[x, ])
})