data_part | R Documentation |
The function data_part() creates an index that randomly partitions the observations of a data.frame into subsets.
It randomly creates a factor of length n, called partition, which subdivides the data i) into two sets, training (train) and test (test), ii) into three sets, training (train), validation (val) and test (test), or iii) into the k sets of a k-fold cross-validation.
The function data_rm1val() removes from a data.frame any variables that have only one value.
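As a rough illustration of the idea, the random partition factor described above can be sketched in base R. This is not the package's implementation: the helper name make_partition, its arguments, and the default probabilities (0.6/0.4 and 0.6/0.2/0.2) are assumptions made for the example only.

```r
# Hypothetical helper, NOT the package's code: draw a random partition factor.
# The label names and default probabilities below are assumptions.
make_partition <- function(n, labels = c("train", "test"),
                           probs = c(0.6, 0.4), seed = 123) {
  stopifnot(length(labels) == length(probs))
  set.seed(seed)  # fixed seed so the same split can be reproduced
  # Each observation is independently given a label with the stated probabilities
  factor(sample(labels, size = n, replace = TRUE, prob = probs),
         levels = labels)
}

# i) two sets
part2 <- make_partition(nrow(mtcars))
# ii) three sets
part3 <- make_partition(nrow(mtcars), labels = c("train", "val", "test"),
                        probs = c(0.6, 0.2, 0.2))
# iii) k folds with equal probability (k = 5 here)
part5 <- make_partition(nrow(mtcars), labels = as.character(1:5),
                        probs = rep(1 / 5, 5))
table(part3)
```

Because each observation is assigned independently, the subset sizes only approximate the requested proportions, and a small data set can leave a subset nearly empty.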
data_part(data, partition = 2L, probs, setseed = 123, ...)
data_rm1val(data)
data | a |
partition | 2, 3, or a number of folds less than 20 |
probs | probabilities for the random selection |
setseed | the seed for the random number generator, set so that the process can be repeated |
... | extra arguments |
Both functions produce a data frame. The function data_part() adds a factor partition, while data_rm1val() removes variables with only one value.
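The removal of single-valued variables can be sketched in base R as follows. The helper drop_constant_cols is hypothetical and is not the package's code; in particular, treating NA as a distinct value is an assumption of this sketch.

```r
# Hypothetical helper, NOT the package's code: drop single-valued columns.
drop_constant_cols <- function(df) {
  # TRUE for columns holding more than one distinct value
  # (assumption: NA counts as a value of its own)
  keep <- vapply(df, function(x) length(unique(x)) > 1, logical(1))
  df[, keep, drop = FALSE]  # drop = FALSE keeps the result a data frame
}

toy <- data.frame(x = 1:4,
                  const = rep("a", 4),
                  y = c(2.5, 3.1, 2.5, 4.0))
names(drop_constant_cols(toy))
```

In this toy data frame only const is constant, so the columns x and y remain.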
Mikis Stasinopoulos, Bob Rigby and Fernanda De Bastiani
Rigby, R. A., Stasinopoulos, D. M., Heller, G. Z., and De Bastiani, F. (2019) Distributions for modeling location, scale, and shape: Using GAMLSS in R, Chapman and Hall/CRC. An older version can be found in https://www.gamlss.com/.
Stasinopoulos, D. M. and Rigby, R. A. (2007) Generalized additive models for location scale and shape (GAMLSS) in R. Journal of Statistical Software, Vol. 23, Issue 7, Dec 2007, https://www.jstatsoft.org/v23/i07/.
Stasinopoulos, D. M., Rigby, R. A., Heller, G. Z., Voudouris, V., and De Bastiani, F. (2017) Flexible Regression and Smoothing: Using GAMLSS in R, Chapman and Hall/CRC.
(see also https://www.gamlss.com/).
data_str
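# rent is assumed to be available, e.g. from the gamlss.data package
da <- data_part(rent)        # default: split into train and test
head(da)
mosaicplot(table(da$partition))
# extract the two subsets using the new partition factor
da.train <- subset(da, partition == "train")
da.test <- subset(da, partition == "test")
dim(da.train)
dim(da.test)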
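For the k-fold case, a partition factor is typically used by holding out one fold at a time. The sketch below builds its own balanced fold factor by hand, because the level names that data_part() uses for folds are not stated here; the data set (mtcars), the model (lm) and the object names are assumptions chosen only to keep the example self-contained.

```r
# Sketch of k-fold cross-validation driven by a fold factor.
# Assumption: `fold` stands in for a k-level partition factor.
set.seed(123)
k <- 5
# Balanced folds: repeat 1..k to the data length, then shuffle
fold <- factor(sample(rep(seq_len(k), length.out = nrow(mtcars))))

cv_mse <- vapply(levels(fold), function(f) {
  train <- mtcars[fold != f, ]   # fit on all folds except f
  test  <- mtcars[fold == f, ]   # evaluate on the held-out fold f
  fit <- lm(mpg ~ wt, data = train)
  mean((test$mpg - predict(fit, newdata = test))^2)
}, numeric(1))

mean(cv_mse)  # average held-out mean squared error over the k folds
```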