data_part: A function to partition a data frame


data_part    R Documentation

A function to partition a data frame

Description

The function data_part() creates an index that randomly partitions the observations of a data.frame into subsets. It randomly generates a factor of length n, called partition, which subdivides the data into i) two sets, training (train) and test (test); ii) three sets, training (train), validation (val) and test (test); or iii) k sets for k-fold cross-validation.
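
The idea can be illustrated in base R. What follows is a minimal sketch of how such a partition factor might be built with sample(); it is not the package's source code. The helper name make_partition, its default probabilities and its seed handling are assumptions made only for illustration.

```r
# Hypothetical helper (not part of gamlss.ggplots): draw a random label
# for each of the n rows and return the labels as a factor.
make_partition <- function(n, labels = c("train", "test"),
                           probs = c(0.6, 0.4), seed = 123) {
  stopifnot(length(labels) == length(probs))
  set.seed(seed)  # fix the seed so the same partition is drawn again
  factor(sample(labels, size = n, replace = TRUE, prob = probs),
         levels = labels)
}

# Two-way split of a small built-in data frame
cars2 <- mtcars
cars2$partition <- make_partition(nrow(cars2))
table(cars2$partition)

# Three-way split: training, validation and test
p3 <- make_partition(nrow(mtcars),
                     labels = c("train", "val", "test"),
                     probs = c(0.6, 0.2, 0.2))
levels(p3)
```

Because every row is labelled independently, the subset sizes in this sketch only approximate the requested proportions.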

The function data_rm1val() removes from a data.frame any variables that take only one value.
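
As a rough base R illustration of this kind of clean-up (a sketch, not the code used by data_rm1val()), the hypothetical helper drop_single_valued below keeps only the columns with more than one distinct value. How missing values are treated here is an assumption: unique() counts NA as a value of its own.

```r
# Hypothetical helper (not the package's code): keep only the columns
# that take more than one distinct value.
drop_single_valued <- function(data) {
  keep <- vapply(data, function(x) length(unique(x)) > 1L, logical(1))
  data[, keep, drop = FALSE]
}

df <- data.frame(x = 1:4,
                 const = rep("a", 4),      # constant column, will be dropped
                 y = c(2.5, 2.5, 3, 1))
names(drop_single_valued(df))  # "x" "y"
```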

Usage

data_part(data, partition = 2L, probs, setseed = 123, ...)

data_rm1val(data)

Arguments

data

a data.frame

partition

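the number of subsets: 2 (training and test), 3 (training, validation and test), or a larger number, less than 20, for k-fold cross-validation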

probs

probabilities for the random selection

setseed

the seed for the random number generator, set so that the process can be repeated

...

extra arguments
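
To show how a k-fold partition of this kind is typically used, here is a minimal base R sketch. It does not call data_part(): the fold factor is built by hand, and the data set (mtcars), the model (lm) and the number of folds (k = 4) are assumptions chosen only to keep the example self-contained.

```r
k <- 4
set.seed(123)  # fixed seed so the folds can be reproduced

# Balanced fold labels 1..k in random order, stored as a factor
folds <- factor(sample(rep(seq_len(k), length.out = nrow(mtcars))))

# For each fold: fit on the remaining folds, predict the held-out fold
rmse <- sapply(levels(folds), function(f) {
  train <- mtcars[folds != f, ]
  test  <- mtcars[folds == f, ]
  fit   <- lm(mpg ~ wt, data = train)
  sqrt(mean((test$mpg - predict(fit, newdata = test))^2))
})

rmse        # one out-of-sample error per fold
mean(rmse)  # cross-validated estimate
```

Repeating the fold labels before shuffling guarantees that no fold is empty, which independent random labelling does not.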

Value

Both functions return a data frame. The function data_part() adds the factor partition, while data_rm1val() removes any variables with only one value.

Author(s)

Mikis Stasinopoulos, Bob Rigby and Fernanda De Bastiani

References

Rigby, R. A., Stasinopoulos, D. M., Heller, G. Z., and De Bastiani, F. (2019) Distributions for modeling location, scale, and shape: Using GAMLSS in R, Chapman and Hall/CRC. An older version can be found at https://www.gamlss.com/.

Stasinopoulos, D. M. and Rigby, R. A. (2007) Generalized additive models for location scale and shape (GAMLSS) in R. Journal of Statistical Software, Vol. 23, Issue 7, Dec 2007, https://www.jstatsoft.org/v23/i07/.

Stasinopoulos, D. M., Rigby, R. A., Heller, G., Voudouris, V., and De Bastiani, F. (2017) Flexible Regression and Smoothing: Using GAMLSS in R, Chapman and Hall/CRC.

(see also https://www.gamlss.com/).

See Also

data_str

Examples

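library(gamlss.ggplots)
data(rent, package = "gamlss.data")  # the rent data set used below
da <- data_part(rent)  # adds the factor 'partition' (default: train/test)
head(da)
mosaicplot(table(da$partition))  # relative sizes of the subsets
da.train <- subset(da, partition == "train")
da.test <- subset(da, partition == "test")
dim(da.train)
dim(da.test)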

gamlss.ggplots documentation built on Sept. 3, 2023, 5:08 p.m.