read_sparse_csv: Read sparse (numeric) CSVs
In Laurae2/Laurae: Advanced High Performance Data Science Toolbox for R


This function loads large sparse numeric CSVs. Loading in chunks keeps memory usage from exploding, because data imported into R is typically held as a dense matrix. Verbosity is automatic and cannot be turned off. If you need this function without verbosity, recompile the package after removing the verbose messages.
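The chunked loading described above can be sketched as follows. This is a minimal illustration and not the package's implementation: the helper name `read_csv_in_column_chunks` and its `chunk_size` argument are hypothetical, and it assumes base R's `read.csv` plus the `Matrix` package, reading a few columns at a time so that only one dense chunk is in memory at once.

```r
library(Matrix)

# Hypothetical helper (not part of Laurae): read a numeric CSV a few columns
# at a time and bind the sparse pieces together.
read_csv_in_column_chunks <- function(path, chunk_size) {
  # Read the header plus one row just to learn the column count
  header <- names(read.csv(path, nrows = 1, check.names = FALSE))
  n_cols <- length(header)
  starts <- seq(1, n_cols, by = chunk_size)

  pieces <- lapply(starts, function(start) {
    keep <- start:min(start + chunk_size - 1, n_cols)
    classes <- rep("NULL", n_cols)  # "NULL" tells read.csv to skip a column
    classes[keep] <- "numeric"
    # Only this chunk is ever held as a dense matrix
    dense <- as.matrix(read.csv(path, colClasses = classes, check.names = FALSE))
    Matrix(dense, sparse = TRUE)
  })

  # Column-bind the sparse chunks into one sparse matrix
  do.call(cbind, pieces)
}

# Usage on a tiny temporary file
tmp <- tempfile(fileext = ".csv")
write.csv(data.frame(a = c(0, 1, 0), b = c(0, 0, 2), c = c(3, 0, 0)),
          tmp, row.names = FALSE)
sparse <- read_csv_in_column_chunks(tmp, chunk_size = 2)
```

A smaller chunk lowers peak memory but rereads the file more often, which matches the trade-off noted for `iterfeature`.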

read_sparse_csv(input, iterfeature, nfeatures = NA, colClasses = NA, RDS = NA, compress_RDS = TRUE, NA_sparse = FALSE)

`input`	The input file name.
`iterfeature`	The number of variables loaded per iteration. The smaller the value, the longer it takes to load the whole dataset in its entirety.
`nfeatures`	The IDs of features to load. Defaults to `NA` which means loading all columns.
`colClasses`	The classes of the columns. Defaults to `NA` which means autoselection as numeric. Do not modify (keep default).
`RDS`	Whether to store the result in an RDS file. Defaults to `NA` which means no RDS file. Otherwise, the value of `RDS` is used as the file name.
`compress_RDS`	Whether to compress the RDS file. Defaults to `TRUE`.
`NA_sparse`	Whether sparsity is defined as NA. Defaults to `FALSE`.
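One plausible reading of `NA_sparse` is that missing cells, rather than zeros, are the entries left out of the sparse matrix. The sketch below illustrates that reading only; the helper `as_sparse` and its `na_is_sparse` argument are hypothetical and are not taken from the package source.

```r
library(Matrix)

# Hypothetical helper: convert a dense numeric matrix to a sparse one,
# optionally treating NA cells as the "empty" (implicit zero) entries.
as_sparse <- function(dense, na_is_sparse = FALSE) {
  if (na_is_sparse) {
    dense[is.na(dense)] <- 0  # assumption: NA marks an absent value
  }
  Matrix(dense, sparse = TRUE)
}

dense <- matrix(c(NA, 1.5, NA, NA, NA, 2), nrow = 2)

kept    <- as_sparse(dense, na_is_sparse = FALSE)  # NA cells stay as stored entries
dropped <- as_sparse(dense, na_is_sparse = TRUE)   # only 1.5 and 2 are stored
```

When most cells are NA, leaving them as stored entries defeats the purpose of a sparse representation, which is presumably why the option exists.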

The sparse matrix.

#read_sparse_csv("train_numeric.csv", iterfeature = 100, nfeatures = c(1:500, 601:1000), colClasses = NA,
#RDS = TRUE, compress_RDS = FALSE, NA_sparse = FALSE)
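For the `RDS` and `compress_RDS` options, the sketch below shows how a sparse matrix can be written to and read back from an RDS file with base R's `saveRDS` and `readRDS`. It assumes the function stores the matrix in this way, which the documentation does not confirm; the file paths are temporary placeholders.

```r
library(Matrix)

# A small sparse matrix standing in for the function's return value
sparse <- sparseMatrix(i = c(1, 3), j = c(2, 1), x = c(1.5, 2), dims = c(3, 2))

path_compressed <- tempfile(fileext = ".rds")
path_raw        <- tempfile(fileext = ".rds")

saveRDS(sparse, path_compressed, compress = TRUE)   # smaller file, slower to write
saveRDS(sparse, path_raw, compress = FALSE)         # larger file, faster to write

# Reloading restores the sparse matrix without re-parsing the CSV
restored <- readRDS(path_compressed)
```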

Laurae2/Laurae documentation built on May 8, 2019, 7:59 p.m.