View source: R/data_converters.R
train_test_filesystem | R Documentation
Organise files into a train-test filesystem
train_test_filesystem(
  path_to_files,
  file_ext,
  split = 0.8,
  train_folder = "train",
  test_folder = "test",
  shuffle = TRUE,
  overwrite = FALSE
)
path_to_files | directory containing the files to split
file_ext | file extension used to filter files (e.g. "fna")
split | proportion of files assigned to the training set (default 0.8)
train_folder | name of the training folder (subdirectory); created if it does not exist
test_folder | name of the testing folder (subdirectory); created if it does not exist
shuffle | randomise files when splitting (if FALSE, files are sorted by filename before splitting)
overwrite | force overwrite of files that already exist
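As a rough illustration of how split and shuffle interact, the partition step can be sketched in base R. This is a minimal sketch, not the package's implementation: sketch_split is a hypothetical helper, and rounding the training count with round() is an assumption (the package may round differently). It only partitions a vector of file names and does not move any files.

```r
# Hypothetical helper, not part of the package: illustrates the documented
# meaning of `split` and `shuffle` on a character vector of file names.
sketch_split <- function(files, split = 0.8, shuffle = TRUE) {
  # shuffle = FALSE: sort by filename before splitting, as documented
  files <- if (shuffle) sample(files) else sort(files)
  # Assumption: the training count is rounded to the nearest integer
  n_train <- round(length(files) * split)
  list(
    train = head(files, n_train),
    test  = tail(files, length(files) - n_train)
  )
}

files <- paste0(1:10, ".fna")
parts <- sketch_split(files, split = 0.8, shuffle = FALSE)
length(parts$train)
length(parts$test)
```

With shuffle = FALSE the result is deterministic; with shuffle = TRUE, calling set.seed() beforehand makes the partition reproducible.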
named vector of train and test directories
set.seed(123)
# create 10 random DNA files
tmp_dir <- tempdir()
# remove any existing .fna files (pattern is a regular expression, not a glob)
file.remove(
  list.files(tmp_dir, pattern = "\\.fna$", full.names = TRUE)
)
for (i in 1:10) {
  # random 100-base sequence
  dna <- paste0(sample(c("A", "T", "C", "G"), 100, replace = TRUE), collapse = "")
  # FASTA layout: header line, then sequence line
  writeLines(c(paste0(">", i), dna), file.path(tmp_dir, paste0(i, ".fna")))
}
# split files into train and test directories
paths <- train_test_filesystem(
  tmp_dir,
  file_ext = "fna",
  split = 0.8,
  shuffle = TRUE,
  overwrite = TRUE
)
# list the files placed in each directory
list.files(paths[["train"]])
list.files(paths[["test"]])