View source: R/data_partition.R
| data_partition | R Documentation | 
Constructs an object of class data_partition.
data_partition(
  train,
  test,
  y,
  x = NULL,
  offset = NULL,
  weights = NULL,
  na_action = na.omit
)
| train | A  |
| test | A  |
| y |  |
| x | (Optional)  |
| offset | (Optional)  |
| weights | (Optional)  |
| na_action |  |
A data_partition object is a list containing exactly two data frames
(train and test). This object will normally be constructed by
passing a single data frame to partition. Use this constructor
function when you want to manually link two independent data sets:
one to be used for model training and the other to be used for model
testing.
data_partition objects can be passed as the data argument to
the beset modeling functions (beset_lm,
beset_glm, and beset_elnet), in which case these
functions will train and cross-validate models using the train data
and append additional evaluation metrics using the test data. Note
that in earlier development versions, these functions provided an
optional test_data argument for this purpose. That argument has been
removed, and you are now required to construct a data_partition object
beforehand, because the data_partition constructor performs a number
of important checks to ensure that your test data are compatible with
your train data:
1) all predictor and response variables are present in both data sets;
2) the levels of all factor variables are the same in both data sets;
3) if an offset variable is used for model training, an offset variable
is provided for predicting the test data; and
4) unless na_action is set to na.pass, both data frames contain
complete cases with no missing data.
The data_partition constructor will alert you to potential issues,
attempt to resolve them, and return an error if it cannot.
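To make the four checks concrete, here is a minimal base-R sketch of what such a compatibility test could look like. It is not the beset implementation: check_partition_compat is a hypothetical helper, it assumes y, x, and offset are given as column names, and it only reports problems rather than resolving them or applying na_action.

```r
# Hypothetical helper (not part of beset): returns a character vector
# describing every incompatibility found between train and test.
check_partition_compat <- function(train, test, y, x = NULL, offset = NULL) {
  # Assumption: when x is NULL, use every other column of train as a predictor
  if (is.null(x)) x <- setdiff(names(train), c(y, offset))
  model_vars <- c(y, x)
  problems <- character(0)

  # 1) Response and predictors must be present in both data sets
  for (v in setdiff(model_vars, names(train))) {
    problems <- c(problems, sprintf("'%s' is missing from train", v))
  }
  for (v in setdiff(model_vars, names(test))) {
    problems <- c(problems, sprintf("'%s' is missing from test", v))
  }

  # 2) Factor variables must have identical levels in both data sets
  for (v in intersect(model_vars, intersect(names(train), names(test)))) {
    if (is.factor(train[[v]]) &&
        !identical(levels(train[[v]]), levels(test[[v]]))) {
      problems <- c(problems, sprintf("levels of '%s' differ", v))
    }
  }

  # 3) An offset used for training must also be available for testing
  if (!is.null(offset) && !offset %in% names(test)) {
    problems <- c(problems, sprintf("offset '%s' is missing from test", offset))
  }

  # 4) Neither data set may contain missing values in the model variables
  has_na <- function(df) {
    cols <- intersect(c(model_vars, offset), names(df))
    any(vapply(df[cols], anyNA, logical(1)))
  }
  if (has_na(train)) problems <- c(problems, "train has missing values")
  if (has_na(test)) problems <- c(problems, "test has missing values")

  problems
}

train <- mtcars[1:16, ]
test <- mtcars[17:32, ]
# character(0): same columns, no factors, no missing values
check_partition_compat(train, test, y = "mpg")
```

Returning the problems as a vector, instead of stopping at the first one, lets a caller see every incompatibility in a single pass.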
A data_partition object containing a train data frame
and a test data frame.
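# Split mtcars into two independent halves
train <- mtcars[1:16, ]
test <- mtcars[17:32, ]
# Convert the categorical columns to factors in both data sets
factor_names <- c("cyl", "vs", "am", "gear", "carb")
train[factor_names] <- purrr::map_dfc(train[factor_names], factor)
test[factor_names] <- purrr::map_dfc(test[factor_names], factor)
# Link the two data frames, with mpg as the response variable
data <- data_partition(train, test, "mpg")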
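When factors are created separately for each subset, as above, their level sets can differ, because a value observed in one subset may be absent from the other. One way to avoid relying on the constructor to reconcile this is to convert the columns before splitting. The following base-R variant is a sketch of that idea; the intermediate name cars is illustrative only.

```r
# Convert to factors on the full data first, so that both subsets
# inherit the same level set for every factor
factor_names <- c("cyl", "vs", "am", "gear", "carb")
cars <- mtcars
cars[factor_names] <- lapply(cars[factor_names], factor)

# Row subsetting keeps the full set of levels on each factor column
train <- cars[1:16, ]
test <- cars[17:32, ]

# Confirm that every factor has identical levels in train and test
same_levels <- vapply(
  factor_names,
  function(v) identical(levels(train[[v]]), levels(test[[v]])),
  logical(1)
)
all(same_levels)

# The two data frames can then be linked as in the example above:
# data <- data_partition(train, test, "mpg")
```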