Description Usage Arguments Details Value Author(s) See Also Examples
this function splits the Data into Train and Test Sets, it returns a list containing two data frames for the train and test sets.
1 |
data |
object of class "data.frame" with target variable and predictor variables. |
seed |
integer. Values for the random number generator. Default: NULL. |
y |
character. Target variable. |
p |
numeric. Proportion of data to be used for training. |
... |
additional arguments to be passed to |
splitData
relies on the createDataPartition
function from the caret
package to perform the data split.
If y
is a factor, the sampling of observations for each set is done within the levels of y
such that the class distributions are more or less balanced for each set.
If y
is numeric, the data is split into groups based on percentiles and the sampling done within these subgroups. See createDataPartition
for more details on additional arguments that can be passed.
A list with two data frames: the first as train set, and the second as test set.
Zakaria Kehel, Bancy Ngatia
1 2 3 4 5 6 7 8 9 10 | if(interactive()){
# Split the data into 70/30 train and test sets for factor y
data(septoriaDurumWC)
split.data <- splitData(septoriaDurumWC, seed = 1234,
y = 'ST_S', p = 0.7)
# Get training and test sets from list object returned
trainset <- split.data$trainset
testset <- split.data$testset
}
|
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.