msplit: Split a dataframe for training and testing sets

msplit R Documentation

Split a dataframe for training and testing sets

Description

This function automatically splits a dataframe into train and test datasets. You can set a seed to get the same results every time; if you do not, the default value is used. You can also prevent it from printing the summary of the split counts.

Usage

msplit(df, size = 0.7, seed = 0, print = TRUE)

Arguments

df

Dataframe

size

Numeric. Split rate value, between 0 and 1. If set to 1, the train and test sets will be the same.

seed

Integer. Seed for the random split.

print

Boolean. Print summary results?

Value

List with both datasets, summary, and split rate.

See Also

Other Machine Learning: ROC(), conf_mat(), export_results(), gain_lift(), h2o_automl(), h2o_predict_MOJO(), h2o_selectmodel(), impute(), iter_seeds(), lasso_vars(), model_metrics(), model_preprocess()

Other Tools: autoline(), bind_files(), bring_api(), chr2num(), db_download(), db_upload(), export_plot(), export_results(), files_functions(), font_exists(), formatColoured(), formatHTML(), get_credentials(), glued(), grepm(), h2o_selectmodel(), haveInternet(), image_metadata(), importxlsx(), ip_data(), json2vector(), list_cats(), listfiles(), mail_send(), markdown2df(), move_files(), myip(), quiet(), read.file(), statusbar(), tic(), try_require(), updateLares(), warnifnot(), what_size()

Examples

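data(dft) # Titanic dataset
# Split with a rate of 0.7; a fixed seed makes the result reproducible
splits <- msplit(dft, size = 0.7, seed = 123)
# Inspect the names of the elements in the returned list
names(splits)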
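
The behaviour described above (a seeded random split that returns both datasets, a summary, and the split rate) can be approximated in base R. The sketch below is only an illustration under stated assumptions, not the lares implementation: the helper name `split_sketch`, the element names of the returned list, and the choice to treat `size` as the share of rows sent to the training set are all hypothetical.

```r
# Hypothetical helper, NOT part of lares: a base R sketch of a seeded split.
# Assumption: `size` is the share of rows assigned to the training set.
split_sketch <- function(df, size = 0.7, seed = 0) {
  stopifnot(is.data.frame(df), size > 0, size <= 1)
  set.seed(seed) # the same seed selects the same rows on every call
  if (size == 1) {
    # Mirrors the documented edge case: train and test are the same data
    train <- test <- df
  } else {
    rows <- seq_len(nrow(df))
    idx <- sample(rows, size = floor(size * nrow(df)))
    train <- df[idx, , drop = FALSE]
    test <- df[setdiff(rows, idx), , drop = FALSE]
  }
  # Element names below are illustrative assumptions
  list(
    train = train,
    test = test,
    summary = c(train = nrow(train), test = nrow(test)),
    split_size = size
  )
}

demo <- data.frame(id = 1:10, y = rep(c(0, 1), 5))
a <- split_sketch(demo, size = 0.7, seed = 123)
b <- split_sketch(demo, size = 0.7, seed = 123)
identical(a$train, b$train) # same seed, same split
a$summary
```

Calling the helper twice with the same seed yields identical training sets, which is the reproducibility property the `seed` argument is meant to provide.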

lares documentation built on Sept. 13, 2024, 1:08 a.m.