Memory Map Text File

Description

Reads a file column by column and creates a memory-mapped object.

Usage

mmap.csv(file, 
         header = TRUE, 
         sep = ",", 
         quote = "\"", 
         dec = ".", 
         fill = TRUE, 
         comment.char = "", 
         row.names,
         ...)

Arguments

file

the name of the file containing the comma-separated values to be mapped.

header

does the file contain a header line?

sep

field separator character

quote

the set of quoting characters

dec

the character used for decimal points in the file

fill

unimplemented

comment.char

unimplemented

row.names

row names for the mapped data, specified as for read.csv

...

additional arguments

Details

mmap.csv is meant to be the analogue of read.csv in R, with the primary difference being that the data are read, column by column, into memory-mapped structures on disk. The intention is to allow comma-separated files to be mapped easily without having to load the entire object into memory at once.
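
The column-by-column idea can be illustrated with base R alone. The sketch below is not taken from the mmap sources; it assumes a simple comma-separated file with a header line, and the helper name read_one_column is hypothetical. It uses the colClasses argument of read.csv to skip every column except the one requested, so only a single column is held in memory per call.

```r
# Sketch: read a csv file one column at a time using base R only.
tmp <- tempfile(fileext = ".csv")
write.csv(cars, file = tmp, row.names = FALSE)

# Count the columns from the header line without loading the data.
ncols <- length(strsplit(readLines(tmp, n = 1), ",", fixed = TRUE)[[1]])

# Hypothetical helper: return column `col` of the file as a vector.
read_one_column <- function(path, col, ncols) {
  # "NULL" tells read.csv to skip a column; NA lets it infer the type.
  classes <- rep("NULL", ncols)
  classes[col] <- NA
  read.csv(path, colClasses = classes)[[1]]
}

# Each call parses the file but keeps only one column in memory.
speed <- read_one_column(tmp, 1, ncols)
dist  <- read_one_column(tmp, 2, ncols)

unlink(tmp)
```

Counting fields by splitting the header on commas is only safe when no quoted field contains the separator, which is an assumption of this sketch.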

Value

An mmap object containing the data from the file. Each column's type is set to the mmap equivalent of the type that a call to read.csv would produce in R.

Warning

At present, memory-mapping a csv file requires as much memory as loading a single column of the file into R with the traditional read.table function. This may not be efficient enough for extremely large data.
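
To get a rough sense of what "a single column" means in practice, the sketch below compares the in-memory size of each column of the cars data set with the size of the whole data frame. It uses only base R, and object.size gives an estimate of the stored object only, so any temporary memory used while parsing the file is not counted.

```r
# Rough in-memory footprint of each column versus the whole table.
per_column  <- sapply(cars, function(x) as.numeric(object.size(x)))
whole_table <- as.numeric(object.size(cars))

# The largest single column approximates the peak working size
# described in the warning, ignoring parsing overhead.
peak_column <- max(per_column)

print(per_column)
print(c(peak_column = peak_column, whole_table = whole_table))
```

For a table with many columns of similar size, the largest column is a small fraction of the total, which is where reading by column pays off.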

Note

This is currently a very simple implementation to facilitate exploration of the mmap package. While the interface will remain consistent with read.csv from utils, further additions to handle the various out-of-core types available in mmap, as well as performance optimizations, will be added.

Author(s)

Jeffrey A. Ryan

See Also

mmap, read.csv

Examples

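library(mmap)

# write a small data set to a temporary csv file
data(cars)
tmp <- tempfile()
write.csv(cars, file=tmp, row.names=FALSE)

# map the file column by column
m <- mmap.csv(tmp)

colnames(m) <- colnames(cars)

m[]  # extract all of the data

extractFUN(m) <- as.data.frame  # coerce list to data frame upon subset

m[1:3,]  # first three rows

munmap(m)  # release the mapping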