View source: R/subset_dataset.R
After creating a flagged dataset and the model variables, use this function to create training datasets.
subset_dataset(data, zeros_to_one = 3, seed = 42069)
data | output of create_model_variables
zeros_to_one | number of 0's to keep for each 1; the default is 3 (three 0's for every 1)
seed | number used to set the random seed so the randomization is reproducible
Because non-outliers (zeros) far outnumber outliers (ones), it is necessary to under-sample the zeros when training models.
The default is zeros_to_one=3, meaning there are 3 '0' observations for every '1' observation. Every '1' observation is kept, and the '0's are randomly sampled. The results are also shuffled.
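The under-sampling described above can be sketched roughly as follows. This is a minimal illustration, not the package's actual source: the function name `undersample_sketch` and the column name `flag` (assumed to hold the 0/1 outlier indicator) are hypothetical, and the real `subset_dataset` may differ in its details.

```r
# Hypothetical sketch of under-sampling; not the package's actual source.
# Assumes the 0/1 outlier indicator lives in a column named `flag`.
undersample_sketch <- function(data, zeros_to_one = 3, seed = 42069) {
  set.seed(seed)  # fix the RNG so the same rows are drawn on every run

  ones  <- which(data$flag == 1)
  zeros <- which(data$flag == 0)

  # Keep every 1; draw zeros_to_one zeros per 1, capped at the zeros available.
  n_zeros <- min(length(zeros), zeros_to_one * length(ones))
  kept_zeros <- zeros[sample.int(length(zeros), n_zeros)]

  # Shuffle the combined row indices so 0's and 1's are interleaved.
  keep <- c(ones, kept_zeros)
  shuffled <- keep[sample.int(length(keep))]
  data[shuffled, , drop = FALSE]
}

# Toy data: 10 ones followed by 990 zeros.
toy <- data.frame(flag = rep(c(1, 0), times = c(10, 990)),
                  value = seq_len(1000))
balanced <- undersample_sketch(toy)
table(balanced$flag)
```

On this toy input the sketch keeps all 10 ones and 30 of the zeros, and calling it again with the same seed returns the same rows in the same order.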
A data frame containing every '1' observation and the randomly sampled '0' observations, in shuffled order.
# UT <- get_weather_data("UT", "D:/Data/ghcnd_all/")
# data <- create_flagged_dataset(UT)
# data_1 <- create_model_variables(data)
# subset <- subset_dataset(data_1)
# trainSize <- .5
# split <- floor(trainSize * nrow(subset))  # whole-number row index for the split
# train <- subset[1:split, ]
# test <- subset[(split + 1):nrow(subset), ]  # start after the split so no row is in both sets