knitr::opts_chunk$set(echo = TRUE)

In this vignette, you will learn how to read a Filebacked Big Matrix (FBM) from a text file. The package {bigreadr} is required.

Data

library(bigstatsr)
library(bigreadr)

## LONG CSV
df <- datasets::mtcars
csv <- fwrite2(df[rep(seq_len(nrow(df)), 500000), ], 
               tempfile(fileext = ".csv"), 
               row.names = TRUE)
format(file.size(csv), big.mark = ",")

Check file content

nlines(csv)
(first_rows <- fread2(csv, nrows = 5))
sapply(first_rows, typeof)
ncol(first_rows)

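What you can see with these first lines: the file has 12 columns, the first of which holds the row names (character); the other 11 columns are numeric, which is why only columns 2 to 12 are selected when reading the data into an FBM.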
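
The column selection can also be derived from the inspected rows rather than typed by hand. The following is only a sketch under stated assumptions: `csv_small` is a hypothetical small stand-in for the long CSV above, and `num_cols` is an illustrative name that does not appear in the vignette.

```r
library(bigreadr)

# Hypothetical small stand-in for the long CSV (same layout, 32 rows only)
csv_small <- fwrite2(datasets::mtcars, tempfile(fileext = ".csv"),
                     row.names = TRUE)

# Read a few rows and keep the positions of the numeric columns
first_rows <- fread2(csv_small, nrows = 5)
(num_cols <- which(sapply(first_rows, is.numeric)))
```

Because the types are guessed from a handful of rows, it is probably worth checking the result against what you know of the file before passing it as `select`.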

Read those data
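
A minimal sketch of this step, assuming `big_read()` is called as in the filtered example further down but without a `filter`. Here `csv_small` is a hypothetical small stand-in for the long CSV and `fbm` is an illustrative name; neither appears in the vignette.

```r
library(bigstatsr)
library(bigreadr)

# Hypothetical small stand-in for the long CSV (same layout, 32 rows only)
csv_small <- fwrite2(datasets::mtcars, tempfile(fileext = ".csv"),
                     row.names = TRUE)

# Skip column 1 (row names, character) and read the 11 numeric columns
fbm <- big_read(csv_small, select = 2:12, backingfile = tempfile())

dim(fbm)         # number of rows and columns stored in the FBM
fbm[1:5, 1:3]    # access a small block as a regular R matrix
```

Using `tempfile()` as the backing file keeps the sketch from writing next to the CSV; for data you want to keep, a permanent path is likely the better choice.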

Read filtered data

## Get the filter data
filter <- fread2(csv, select = "cyl")[[1]] == 4
## Read only rows corresponding to 'filter'
(test2 <- big_read(csv, select = 2:12, filter = filter,
                   backingfile = tempfile()))
test2$is_saved
(rds <- test2$rds)
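
As a quick sanity check on the filtering, the sketch below repeats the same steps on a small file and compares the result with the filter. It assumes a hypothetical stand-in file `csv_small`; the names `keep` and `fbm4` are illustrative and not part of the vignette.

```r
library(bigstatsr)
library(bigreadr)

# Hypothetical small stand-in for the long CSV (same layout, 32 rows only)
csv_small <- fwrite2(datasets::mtcars, tempfile(fileext = ".csv"),
                     row.names = TRUE)

# Logical filter: one value per data row of the file
keep <- fread2(csv_small, select = "cyl")[[1]] == 4

fbm4 <- big_read(csv_small, select = 2:12, filter = keep,
                 backingfile = tempfile())

# The FBM should have one row per TRUE in the filter ...
nrow(fbm4) == sum(keep)
# ... and its 2nd column (file column 3, 'cyl') should contain only 4s
all(fbm4[, 2] == 4)
```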

You need to read from the text file only once. To get the FBM object back in another R session, call big_attach() on the saved .rds file:

(test3 <- big_attach(rds))
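
The save-and-attach round trip can be tried in isolation on a small matrix. This is only a sketch: it builds a toy FBM directly with `FBM()` instead of `big_read()`, and the names `X` and `X2` are illustrative.

```r
library(bigstatsr)

# Toy 5 x 3 FBM; $save() writes the .rds file that big_attach() needs
X <- FBM(5, 3, init = 1:15, backingfile = tempfile())$save()
X$rds    # path to the .rds file

# Re-create the object from the .rds file, as you would in a new session
X2 <- big_attach(X$rds)

# Both objects point to the same backing file, so the values agree
identical(X[], X2[])
```

Note that the .rds file stores a reference to the backing file rather than the data itself, so both files have to be kept together.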


privefl/bigstatsr documentation built on March 29, 2024, 3:31 a.m.