View source: R/data_partition.R
data_partition | R Documentation |
Constructs an object of class data_partition.
data_partition(
train,
test,
y,
x = NULL,
offset = NULL,
weights = NULL,
na_action = na.omit
)
train |
A |
test |
A |
y |
|
x |
(Optional) |
offset |
(Optional) |
weights |
(Optional) |
na_action |
|
A data_partition object is a list containing exactly two data frames (train and test). This object will normally be constructed by passing a single data frame to partition. Use this constructor function when you want to manually link two independent data sets: one to be used for model training and the other for model testing.
data_partition objects can be passed as the data argument to the beset modeling functions (beset_lm, beset_glm, and beset_elnet), in which case these functions will train and cross-validate models using the train data and append additional evaluation metrics using the test data. Note that in earlier development versions, these functions provided an optional test_data argument for this purpose. That argument has been removed, and you are now required to construct a data_partition object beforehand, because the data_partition constructor performs a number of important checks to ensure that your test data are compatible with your train data: 1) all predictor and response variables are present in both data sets, 2) the levels of all factor variables are the same for both data sets, 3) if an offset variable is used for model training, an offset variable is provided for predicting the test data, and 4) unless na_action is set to na.pass, both data frames contain complete cases with no missing data. The data_partition constructor will alert you to potential issues, attempt to resolve them, and return an error if it can't.
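To see what checks 1, 2, and 4 look for, the following base R sketch reports the same kinds of problems for a pair of data frames. It is an illustration only, not the package's implementation: check_partition_compat is a hypothetical helper, and treating every non-response column as a predictor when x is NULL is an assumption made here for simplicity. The offset check (3) is left out.

```r
# Hypothetical helper (not part of beset): lists compatibility problems
# between a training and a test data frame for response `y` and predictors `x`.
check_partition_compat <- function(train, test, y, x = NULL) {
  # Assumption: with x = NULL, every column except the response is a predictor
  if (is.null(x)) x <- setdiff(names(train), y)
  vars <- c(y, x)
  problems <- character(0)

  # Check 1: every response and predictor variable exists in both data frames
  for (nm in c("train", "test")) {
    d <- if (nm == "train") train else test
    absent <- setdiff(vars, names(d))
    if (length(absent) > 0) {
      problems <- c(problems,
                    sprintf("%s is missing: %s", nm, paste(absent, collapse = ", ")))
    }
  }

  # Check 2: factor variables carry identical levels in both data frames
  shared <- intersect(vars, intersect(names(train), names(test)))
  for (v in shared) {
    if (is.factor(train[[v]]) || is.factor(test[[v]])) {
      if (!identical(levels(train[[v]]), levels(test[[v]]))) {
        problems <- c(problems, sprintf("factor levels differ for '%s'", v))
      }
    }
  }

  # Check 4: no missing values among the modeled variables
  for (nm in c("train", "test")) {
    d <- if (nm == "train") train else test
    cols <- intersect(vars, names(d))
    if (length(cols) > 0 && !all(complete.cases(d[cols]))) {
      problems <- c(problems, sprintf("%s contains incomplete cases", nm))
    }
  }

  problems
}

# Usage: an empty character vector means none of these problems was found
train <- mtcars[1:16, ]
test <- mtcars[17:32, ]
check_partition_compat(train, test, y = "mpg")
```

Unlike the constructor described above, this sketch only reports problems; it does not attempt to resolve them.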
A data_partition object containing a train data frame and a test data frame.
train <- mtcars[1:16,]
test <- mtcars[17:32,]
factor_names <- c("cyl", "vs", "am", "gear", "carb")
train[factor_names] <- purrr::map_dfc(train[factor_names], factor)
test[factor_names] <- purrr::map_dfc(test[factor_names], factor)
data <- data_partition(train, test, "mpg")
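Because the example converts the columns to factors separately in each data frame, the resulting level sets can differ between train and test. If you would rather align the levels yourself before calling the constructor, one option is to build each factor from the union of values seen in both data frames. This is a base R sketch of that idea, not something the package requires:

```r
train <- mtcars[1:16, ]
test <- mtcars[17:32, ]
factor_names <- c("cyl", "vs", "am", "gear", "carb")

for (v in factor_names) {
  # Use the sorted union of observed values as the common level set
  lv <- sort(unique(c(train[[v]], test[[v]])))
  train[[v]] <- factor(train[[v]], levels = lv)
  test[[v]] <- factor(test[[v]], levels = lv)
}

# Confirm that every factor now has identical levels in both data frames
levels_match <- vapply(
  factor_names,
  function(v) identical(levels(train[[v]]), levels(test[[v]])),
  logical(1)
)
all(levels_match)
```

Sharing a level set only makes the two data frames structurally consistent. A level that occurs only in test still has no training observations from which a model could estimate its effect.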