dataSplit | R Documentation |
Split data into a train and a test set by a given fraction, or into several folds. Splitting is typically used to obtain a training set and a validation set.
dataSplit(x, y, f = 3/4, type = "random")
x |
The input grid object. |
y |
The observations object. |
f |
Either a fraction in the interval (0,1) giving the share of the data assigned to the train set, an integer giving the number of folds, or a list of folds in which each element contains the years of that fold. |
type |
A string. Indicates whether the splitting is random (type = "random"), chronological (type = "chronological") or specified by the user (type = NULL). Default is "random". |
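To make the difference between the two splitting types concrete for a fractional f, the following base-R sketch splits a vector of dates both ways. It is only an illustration and not the source of dataSplit: the `dates` vector is a hypothetical stand-in for the time axis shared by x and y, and rounding the train size with `floor` and placing the train block first in time are assumptions.

```r
# Illustrative base-R sketch, NOT the dataSplit() implementation.
# `dates` is a hypothetical stand-in for the time axis of x and y.
dates <- seq(as.Date("1983-01-01"), as.Date("1983-01-20"), by = "day")
n <- length(dates)
f <- 3/4

# Number of time steps assigned to train (rounding rule is an assumption).
n.train <- floor(f * n)

# Chronological: the first n.train time steps form the train set,
# the remaining ones form the test set.
train.chrono <- seq_len(n.train)
test.chrono <- setdiff(seq_len(n), train.chrono)

# Random: n.train time steps are drawn without replacement for train.
set.seed(1)
train.random <- sort(sample(n, n.train))
test.random <- setdiff(seq_len(n), train.random)

# Inspect the time span covered by each chronological subset.
range(dates[train.chrono])
range(dates[test.chrono])
```

In both cases every time step ends up in exactly one of the two subsets; only the way the train indices are chosen differs.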
A list of folds, each containing the split x and y.
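The layout of the returned list can be pictured with a small base-R mock-up. This is a sketch under assumptions, not real output of dataSplit: plain index vectors stand in for the x and y subsets, and the `train`/`test` element names are taken from how the examples on this page access the result.

```r
# Hypothetical mock-up of the fold layout, NOT real dataSplit() output.
n <- 12   # number of time steps (made-up value)
k <- 3    # number of folds

# Assign each time step to one of k contiguous folds.
fold.id <- rep(seq_len(k), each = n / k)

# For fold i, the test set is fold i and the train set is everything else.
folds <- lapply(seq_len(k), function(i) {
  list(train = which(fold.id != i),
       test  = which(fold.id == i))
})

# Access pattern analogous to data.splitted[[1]]$train and $test.
folds[[1]]$train
folds[[1]]$test
```

With k folds, each time step appears in the test set of exactly one fold and in the train set of the other k - 1 folds.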
J. Bano-Medina
require(climate4R.datasets)
data("NCEP_Iberia_hus850", "NCEP_Iberia_psl", "NCEP_Iberia_ta850", "VALUE_Iberia_pr")
x <- makeMultiGrid(NCEP_Iberia_hus850, NCEP_Iberia_psl, NCEP_Iberia_ta850)
y <- VALUE_Iberia_pr
### Split the data into train and test (f < 1) ###
data.splitted <- dataSplit(x, y, f = 3/4, type = "chronological")
str(data.splitted[[1]]$train$y$Dates) # 3/4 of the data for train
str(data.splitted[[1]]$test$y$Dates) # remaining 1/4 for test
### Split the data into 3 chronological folds ###
data.splitted <- dataSplit(x, y, f = 3, type = "chronological")
str(data.splitted[[1]]$train$y$Dates) # 2 folds out of 3 for train
str(data.splitted[[1]]$test$y$Dates) # 1 fold out of 3 for test
str(data.splitted[[2]]$train$y$Dates) # 2 folds out of 3 for train
str(data.splitted[[2]]$test$y$Dates) # 1 fold out of 3 for test
str(data.splitted[[3]]$train$y$Dates) # 2 folds out of 3 for train
str(data.splitted[[3]]$test$y$Dates) # 1 fold out of 3 for test
### Split the data into 3 random folds ###
data.splitted <- dataSplit(x, y, f = 3, type = "random")
str(data.splitted[[1]]$train$y$Dates) # 2 folds out of 3 for train
str(data.splitted[[1]]$test$y$Dates) # 1 fold out of 3 for test
str(data.splitted[[2]]$train$y$Dates) # 2 folds out of 3 for train
str(data.splitted[[2]]$test$y$Dates) # 1 fold out of 3 for test
str(data.splitted[[3]]$train$y$Dates) # 2 folds out of 3 for train
str(data.splitted[[3]]$test$y$Dates) # 1 fold out of 3 for test
### Split the data into user-defined folds (years of each fold) ###
data.splitted <- dataSplit(x, y, type = "chronological",
                           f = list(c("1983","1984","1985","1986","1987",
                                      "1988","1989","1990","1991"),
                                    c("1992","1993","1994","1995","1996",
                                      "1997","1998","1999"),
                                    c("2000","2001","2002")))
str(data.splitted[[1]]$train$y$Dates) # 2 folds out of 3 for train
str(data.splitted[[1]]$test$y$Dates) # 1 fold out of 3 for test
str(data.splitted[[2]]$train$y$Dates) # 2 folds out of 3 for train
str(data.splitted[[2]]$test$y$Dates) # 1 fold out of 3 for test
str(data.splitted[[3]]$train$y$Dates) # 2 folds out of 3 for train
str(data.splitted[[3]]$test$y$Dates) # 1 fold out of 3 for test
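When folds are given as lists of years, it can help to check beforehand that the years do not overlap and to see how many time steps each fold would contain. The base-R sketch below does this for the year lists used above. It is an illustration only: the daily `dates` vector is a hypothetical stand-in for the dates of y, and nothing here calls dataSplit.

```r
# Illustrative pre-check for user-defined folds; does not call dataSplit().
years.by.fold <- list(as.character(1983:1991),
                      as.character(1992:1999),
                      as.character(2000:2002))

# Hypothetical daily time axis standing in for the dates of y.
dates <- seq(as.Date("1983-01-01"), as.Date("2002-12-31"), by = "day")
yr <- format(dates, "%Y")

# No year should be listed in more than one fold.
stopifnot(anyDuplicated(unlist(years.by.fold)) == 0)

# Indices of the time steps that fall in each fold.
fold.idx <- lapply(years.by.fold, function(y) which(yr %in% y))

# Number of time steps per fold (folds of unequal length are allowed).
lengths(fold.idx)
```

Because the three folds span 9, 8 and 3 years, they contain different numbers of time steps, unlike the folds obtained with an integer f.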