| Split | R Documentation |
Generates a list of length(tau) non-overlapping sets of observation
IDs.
Split(data, family = NULL, tau = c(0.5, 0.25, 0.25))
data |
vector or matrix of data. In regression, this should be the outcome data. |
family |
type of regression model. This argument is defined as in
|
tau |
vector of the proportion of observations in each of the sets. |
With categorical outcomes (i.e., when the family argument is set to
"binomial", "multinomial" or "cox"), the split is stratified
so that the proportion of observations from each category in
each set is representative of that in the full sample.
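The stratified behaviour described above can be illustrated with a minimal base R sketch. This is not the package's implementation: the helper name stratified_split, the rounding rule used to size each set, and the assumption that tau sums to one are all hypothetical choices made for illustration.

```r
# Hypothetical helper (not part of the package): stratified split of
# observation IDs. Assumes tau sums to one.
stratified_split <- function(y, tau = c(0.5, 0.25, 0.25)) {
  stopifnot(abs(sum(tau) - 1) < 1e-8)
  sets <- replicate(length(tau), integer(0), simplify = FALSE)
  for (level in unique(y)) {
    # Shuffle the IDs of the observations in this category
    ids <- which(y == level)
    ids <- ids[sample.int(length(ids))]
    # Cut the shuffled IDs at the cumulative proportions (assumed rounding rule)
    breaks <- round(cumsum(tau) * length(ids))
    starts <- c(0, breaks[-length(breaks)]) + 1
    for (k in seq_along(tau)) {
      if (breaks[k] >= starts[k]) {
        sets[[k]] <- c(sets[[k]], ids[starts[k]:breaks[k]])
      }
    }
  }
  lapply(sets, sort)
}

# Simulated binary outcome, used for illustration only
set.seed(1)
y <- rbinom(100, size = 1, prob = 0.3)
ids <- stratified_split(y)

# Category counts within each set
lapply(ids, function(x) table(y[x]))
```

Because the IDs are split within each category before being pooled, each set receives roughly the same share of every category, up to rounding.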
A list of length(tau) non-overlapping sets of observation IDs.
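A list of this shape can be sanity-checked with base R. The sketch below uses a hand-made list of IDs (hypothetical values, not output from the function) to confirm that no ID appears in more than one set and to recover the proportion of IDs in each set.

```r
# Hypothetical list in the shape described above (not package output)
ids <- list(c(1, 4, 5, 8, 10), c(2, 7, 9), c(3, 6))

# No observation ID should appear in more than one set
no_overlap <- !any(duplicated(unlist(ids)))

# Proportion of IDs falling in each set, to compare against tau
set_sizes <- lengths(ids)
proportions <- set_sizes / sum(set_sizes)
```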
# Splitting into 3 sets
simul <- SimulateRegression()
ids <- Split(data = simul$ydata)
lapply(ids, length)
# Balanced splits with respect to a binary variable
simul <- SimulateRegression(family = "binomial")
ids <- Split(data = simul$ydata, family = "binomial")
lapply(ids, FUN = function(x) {
  table(simul$ydata[x, ])
})