parallel.csv: Parallel CSV Converter
In Laurae2/LauraeDS: Laurae's Data Science Package


Converts separate CSV files to fst format, parallelizing the writing of the files (reading is still sequential). Also overwrites fst::threads_fst. Requires the data.table and fst packages.
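As a rough illustration of the per-file work described above, the following is a minimal sketch and not the package's actual implementation. The helper name `csv_to_fst` and the output naming rule (replacing the `.csv` extension with `.fst`) are assumptions made for this example; it uses `fst::write_fst`, the current name of the writer in fst.

```r
library(data.table)
library(fst)

# Hypothetical helper: converts one CSV file to fst.
# The output path rule (".csv" -> ".fst") is an assumption for illustration.
csv_to_fst <- function(csv_path, compress = 35, max_threads = 1) {
  # Overwrite the number of threads fst may use in this process
  fst::threads_fst(max_threads)
  # Read the CSV into a data.table
  dt <- data.table::fread(csv_path)
  # Write it back out in fst format with the requested compression
  fst_path <- sub("\\.csv$", ".fst", csv_path)
  fst::write_fst(dt, fst_path, compress = compress)
  fst_path
}
```

Applying such a helper over a vector of file names, for instance with `lapply` or `parallel::parLapply`, would give one fst file per CSV file.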

parallel.csv(file, compress = 35, progress_bar = TRUE, clean_mem = FALSE,
  cl = NULL, max_threads = max(ifelse(is.null(cl), parallel::detectCores(),
  ifelse(!is.list(cl), round(parallel::detectCores()/cl),
  round(parallel::detectCores()/length(cl)))), 1), wkdir = NULL, ...)

`file`	Type: character vector. Paths of all the files to read.
`compress`	Type: numeric. Compression rate to use. Defaults to `35`.
`progress_bar`	Type: logical. Whether to print a progress bar. Defaults to `TRUE`.
`clean_mem`	Type: logical. Whether to force garbage collection at the end of each file read in order to reclaim RAM. Defaults to `FALSE`.
`cl`	Type: cluster or integer. A parallel cluster for parallelized calls. Used only when `progress_bar = TRUE`. Writes most of the variables (`compress`, `max_threads`, `clean_mem`, `wkdir`) to the cluster and removes them at the end. When it is a number, creates and destroys a cluster with the specified number of nodes. Defaults to `NULL`.
`max_threads`	Type: numeric. The maximum number of threads allowed, used to set `fst::threads_fst`. Make sure that the number of `cl` cores multiplied by `max_threads` is not bigger than the number of threads in your computer. Defaults to `max(ifelse(is.null(cl), parallel::detectCores(), ifelse(!is.list(cl), round(parallel::detectCores() / cl), round(parallel::detectCores() / length(cl)))), 1)`, which means at least 1 thread, and automatically adjusts the number of threads depending on the number of cores per cluster. Note that it takes the rounded value, which might over- or under-allocate threads.
`wkdir`	Type: character. The working directory, when using a cluster. Defaults to `NULL`.
`...`	Other arguments to pass to `fst::write.fst`.
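The default for `max_threads` can be reproduced outside the function to see how the rounding behaves. The sketch below restates the documented formula in a hypothetical helper, `default_max_threads`, with the core count passed explicitly (the `cores` argument is an assumption added for this example so that the results do not depend on the machine).

```r
# Hypothetical helper restating the documented default for `max_threads`.
# `cores` stands in for parallel::detectCores() so the example is reproducible.
default_max_threads <- function(cl = NULL, cores = parallel::detectCores()) {
  # Number of workers: 1 without a cluster, `cl` itself when it is a number,
  # otherwise the length of the cluster object
  workers <- if (is.null(cl)) 1 else if (!is.list(cl)) cl else length(cl)
  # Threads per worker, rounded, and never below 1
  max(round(cores / workers), 1)
}

default_max_threads(NULL, cores = 8) # 8 threads, no cluster
default_max_threads(3, cores = 8)    # 3 threads x 3 workers = 9 (over-allocates)
default_max_threads(6, cores = 8)    # 1 thread  x 6 workers = 6 (under-allocates)
default_max_threads(16, cores = 8)   # 1 thread, the enforced minimum
```

When the rounded value over-allocates, as with 3 workers on 8 cores, passing `max_threads` explicitly (for example `max_threads = 2`) keeps the total at or below the number of available threads.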

The element or the list of fst file names.

## Not run: 
# Cannot pass CRAN checks. Disabled.
# Do it on your own files!
library(fst) # devtools::install_github("fstPackage/fst@e060e62")
library(data.table)
library(parallel)

parallel.csv(c("file_1.csv", "file_2.csv"), max_threads = 1, progress_bar = TRUE)
parallel.csv(paste0("file_", 1:100, ".csv"), max_threads = 1, progress_bar = TRUE, cl = 8)

## End(Not run)