ff: memory-efficient storage of large data on disk and fast access functions
Version 2.2-13

The ff package provides data structures that are stored on disk but behave (almost) as if they were in RAM by transparently mapping only a section (pagesize) into main memory; that section is the effective virtual memory consumption per ff object. ff supports R's standard atomic data types 'double', 'logical', 'raw' and 'integer' and the non-standard atomic types boolean (1 bit), quad (2 bit unsigned), nibble (4 bit unsigned), byte (1 byte signed with NAs), ubyte (1 byte unsigned), short (2 byte signed with NAs), ushort (2 byte unsigned) and single (4 byte float with NAs). For example, 'quad' allows efficient storage of genomic data as an 'A','T','G','C' factor. The unsigned types support 'circular' arithmetic. There is also support for the close-to-atomic types 'factor', 'ordered', 'POSIXct' and 'Date', and for custom close-to-atomic types.

ff has native C support not only for vectors but also for matrices and arrays with flexible dimorder (column-major order, row-major order and generalizations for arrays). There is also an ffdf class not unlike data.frames, along with import/export filters for csv files. ff objects store raw data in binary flat files in native encoding and complement this with metadata stored in R as physical and virtual attributes. ff objects have well-defined hybrid copying semantics, which gives rise to certain performance improvements through virtualization.

ff objects can be stored and reopened across R sessions. ff files can be shared by multiple ff R objects (using different data en/de-coding schemes) in the same process or from multiple R processes to exploit parallelism. A wide choice of finalizer options makes it possible to work with 'permanent' files as well as to create and remove 'temporary' ff files completely transparently to the user. On certain OS/filesystem combinations, creating the ff files works without notable delay thanks to sparse file allocation.
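The data structures described above can be illustrated with a minimal sketch. It assumes that ff is installed and that the session can write to the default temporary directory; all object names and values are illustrative and do not come from the package documentation.

```r
library(ff)

# Double vector of one million elements, backed by a temporary flat file
x <- ff(vmode = "double", length = 1e6)
x[1:3] <- c(0.5, 1.5, 2.5)   # writes go to the file through the mapped section
first <- x[1:3]              # reads return an ordinary R vector

# Compact 2-bit storage: 'quad' holds unsigned integers 0..3
q <- ff(c(0L, 3L, 1L, 2L), vmode = "quad")
codes <- q[]

# 3 x 4 integer matrix whose physical layout on disk is row-major
m <- ff(1:12, dim = c(3, 4), dimorder = c(2, 1))
m_dim <- dim(m)

# An ffdf built from an ordinary data.frame (numeric columns only here)
d <- as.ffdf(data.frame(id = 1:5, value = c(2.5, 3.5, 1, 0, 4)))
d_rows <- nrow(d)

# Delete the backing files, then drop the R objects
delete(x); delete(q); delete(m); delete(d)
rm(x, q, m, d)
```

Choosing the smallest vmode that can represent the data (here 'quad' for four codes) is what keeps the on-disk footprint small.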
Several access optimization techniques such as Hybrid Index Preprocessing and Virtualization are implemented to achieve good performance even with large datasets, for example a virtual matrix transpose that does not touch a single byte on disk. Further, to reduce disk I/O, 'logicals' and non-standard data types are stored natively and compactly in binary flat files; for example, logicals take up exactly 2 bits to represent TRUE, FALSE and NA. Beyond basic access functions, the ff package also provides compatibility functions that facilitate writing code for both ff and ram objects, as well as support for batch processing on ff objects (e.g. as.ram, as.ff, ffapply). ff interfaces closely with functionality from package 'bit': chunked looping, fast bit operations and coercions between different objects that can store subscript information ('bit', 'bitwhich', ff 'boolean', ri range index, hi hybrid index). This makes it possible to work interactively with selections of large datasets and to modify selection criteria quickly. Further high-performance enhancements can be made available upon request.
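As a hedged sketch of the access techniques mentioned above, the following shows a virtual transpose, chunked looping and coercion between ff and ram objects. It assumes that ff is installed and that loading it also makes chunk() from package 'bit' available; the object names and sizes are illustrative only.

```r
library(ff)

# Virtual transpose: vt() returns a transposed view of the same file
m  <- ff(1:12, dim = c(3, 4))
tm <- vt(m)
tm_dim <- dim(tm)

# Chunked looping: chunk() yields range indices, so only one chunk
# of the vector is held in RAM at a time
x <- ff(vmode = "double", length = 1e6)
total <- 0
for (i in chunk(x)) {
  x[i] <- 1                     # fill this chunk on disk
  total <- total + sum(x[i])    # read it back and accumulate
}

# Coercion between disk-backed and in-memory representations
r <- as.ram(m)   # ordinary R matrix held in RAM
f <- as.ff(r)    # new file-backed ff object with the same content
f_vals <- f[]
```

Because m and tm share one backing file, the file should be deleted only once both objects are no longer needed.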

Author: Daniel Adler <dadler@uni-goettingen.de>, Christian Gläser <christian_glaeser@gmx.de>, Oleg Nenadic <onenadi@uni-goettingen.de>, Jens Oehlschlägel <Jens.Oehlschlaegel@truecluster.com>, Walter Zucchini <wzucchi@uni-goettingen.de>
Date of publication: 2014-04-09 09:54:20
Maintainer: Jens Oehlschlägel <Jens.Oehlschlaegel@truecluster.com>
License: GPL-2 | file LICENSE
Version: 2.2-13
URL: http://ff.r-forge.r-project.org/
Package repository: CRAN
Installation: Install the latest version of this package by entering the following in R:
install.packages("ff")

Man pages

add: Incrementing an ff or ram object
array2vector: Array: make vector from array
arrayIndex2vectorIndex: Array: make vector positions from array index
as.ff: Coercing ram to ff and ff to ram objects
as.ff.bit: Conversion between bit and ff boolean
as.ffdf: Coercing to ffdf and data.frame
as.hi: Hybrid Index, coercion to
as.integer.hi: Hybrid Index, coercing from
as.vmode: Coercing to virtual mode
bigsample: Sampling from large pools
CFUN: Collapsing functions for batch processing
chunk.bit: Chunk bit vectors
chunk.ffdf: Chunk ff_vector and ffdf
clone: Cloning ff and ram objects
clone.ffdf: Cloning ffdf objects
close.ff: Closing ff files
delete: Deleting the file behind an ff object
dim.ff: Getting and setting dim and dimorder
dimnames.ff_array: Getting and setting dimnames
dimnames.ffdf: Getting and setting dimnames of ffdf
dimorderCompatible: Test for dimorder compatibility
dummy.dimnames: Array: make dimnames
Extract.ff: Reading and writing vectors and arrays (high-level)
Extract.ffdf: Reading and writing data.frames (ffdf)
ff: ff classes for representing (large) atomic data
ffapply: Apply for ff objects
ffconform: Get most conforming argument
ffdf: ff class for data.frames
ffdfindexget: Reading and writing ffdf data.frame using ff subscripts
ffdfsort: Sorting: convenience wrappers for data.frames
ffdrop: Delete an ffarchive
ffindexget: Reading and writing ff vectors using ff subscripts
ffindexorder: Sorting: chunked ordering of integer subscript positions
ffinfo: Inspect content of ff saves
ffload: Reload ffSaved Datasets
fforder: Sorting: order from ff vectors
ffreturn: Return suitable ff object
ffsave: Save R and ff objects
ffsort: Sorting of ff vectors
ffsuitable: Test ff object for suitability
ffxtensions: Test for availability of ff extensions
filename: Get or set filename
file.resize: Change size of or move an existing file
finalize: Call finalizer
finalizer: Get and set finalizer (name)
fixdiag: Test for fixed diagonal
Forbidden_ffdf: Forbidden ffdf functions
geterror.ff: Get error and error string
getpagesize: Get page size information
getset.ff: Reading and writing vectors of values (low-level)
hi: Hybrid index class
hiparse: Hybrid Index, parsing
Internal_ffdf: Internal ffdf functions
is.ff: Test for class ff
is.ffdf: Test for class ffdf
is.open: Test if object is opened
is.readonly: Get readonly status
is.sorted: Getting and setting 'is.sorted' physical attribute
length.ff: Getting and setting length
length.ffdf: Getting length of a ffdf dataframe
length.hi: Hybrid Index, querying
levels.ff: Getting and setting factor levels
LimWarn: ff Limitations and Warnings
matcomb: Array: make matrix indices from row and columns positions
matprint: Print beginning and end of big matrix
maxffmode: Lossless vmode coercability
maxlength: Get physical length of an ff or ram object
mismatch: Test for recycle mismatch
na.count: Getting and setting 'na.count' physical attribute
names.ff: Getting and setting names
nrowAssign: Assigning the number of rows or columns
open.ff: Opening an ff file
pagesize: Pagesize of ff object
physical.ff: Getting and setting physical and virtual attributes of ff...
physical.ffdf: Getting physical and virtual attributes of ffdf objects
print.ff: Print and str methods
ram2ffcode: Factor codings
ramattribs: Get ramclass and ramattribs
ramorder.default: Sorting: order R vector in-RAM and in-place
ramsort.default: Sorting: Sort R vector in-RAM and in-place
read.table.ffdf: Importing csv files into ff data.frames
readwrite.ff: Reading and writing vectors (low-level)
regtest.fforder: Sorting: regression tests
repnam: Replicate with names
sortLevels: Factor level manipulation
splitPathFile: Analyze pathfile-strings
swap: Reading and writing in one operation (high-level)
symmetric: Test for symmetric structure
symmIndex2vectorIndex: Array: make vector positions from symmetric array index
unclass_-: Unclassed assignment
undim: Undim
unsort: Hybrid Index, internal utilities
update.ff: Update ff content from another object
vecprint: Print beginning and end of big vector
vector2array: Array: make array from vector
vectorIndex2arrayIndex: Array: make array index from vector positions
vector.vmode: Create vector of virtual mode
vmode: Virtual storage mode
vmode.ffdf: Virtual storage mode of ffdf
vt: Virtual transpose
vw: Getting and setting virtual windows
write.table.ffdf: Exporting csv files from ff data.frames

Files

inst
inst/README_devel.txt
inst/ANNOUNCEMENT-2.1.2.txt
inst/ANNOUNCEMENT-2.2.txt
inst/ANNOUNCEMENT-2.0.txt
inst/ANNOUNCEMENT-2.1.txt
configure.ac
exec
exec/prebuild.sh
exec/make_rd.pl
src
src/Error.hpp
src/r_file_resize.h
src/r_ff_methodswitch.h
src/FileMapping.hpp
src/r_ff_addgetset.h
src/Win32FileMapping.cpp
src/utk_file_resize_ftruncate.hpp
src/utk_file_allocate_fseek.hpp
src/r_ff.c
src/MMapFileMapping.cpp
src/utk_config.hpp
src/ac_config.h
src/ac_config.h.in
src/config.h
src/types.hpp
src/utk_platform_macros.hpp
src/Array.hpp
src/FSInfo.hpp
src/Error.cpp
src/MMapFileMapping.hpp
src/utk_file_resize_win32.hpp
src/ff.cpp
src/r_ff.h
src/utk_file_resize.hpp
src/utk_file_resize.cpp
src/utk_file_allocate_fseek.cpp
src/ff.h
src/FSInfo_win32.cpp
src/Win32FileMapping.hpp
src/FSInfo_statfs.cpp
src/r_ff_makevmodes.h
src/r_ff_methoddeclaration.h
src/ordermerge.c
src/r_file_resize.cpp
NAMESPACE
NEWS
R
R/fileutil.R
R/ffcsv.R
R/ffsave.R
R/ffapply.R
R/ffreturn.R
R/generics.R
R/getpagesize.R
R/ffbit.R
R/as.ff.R
R/ffdf.R
R/ordermerge.R
R/bigsample.R
R/fileresize.R
R/CFUN.R
R/array.R
R/vt.R
R/fffactor.R
R/util.R
R/vmode.R
R/hi.R
R/zzz.R
R/ff.R
MD5
DESCRIPTION
configure
man
man/vector.vmode.rd
man/is.open.rd
man/close.ff.rd
man/ff.rd
man/ffindexget.rd
man/hiparse.rd
man/splitPathFile.rd
man/hi.rd
man/array2vector.rd
man/fixdiag.rd
man/symmetric.rd
man/as.ff.rd
man/sortLevels.rd
man/Extract.ff.rd
man/ffxtensions.rd
man/as.integer.hi.rd
man/as.vmode.rd
man/physical.ff.rd
man/Internal_ffdf.rd
man/ffdf.rd
man/vmode.rd
man/finalizer.rd
man/geterror.ff.rd
man/ffsave.rd
man/ffconform.rd
man/getpagesize.rd
man/ffinfo.rd
man/regtest.fforder.rd
man/ffsort.rd
man/dimorderCompatible.rd
man/ramattribs.rd
man/length.hi.rd
man/ffdfsort.rd
man/as.ff.bit.rd
man/print.ff.rd
man/add.rd
man/readwrite.ff.rd
man/matprint.rd
man/is.readonly.rd
man/write.table.ffdf.rd
man/ramorder.default.rd
man/nrowAssign.rd
man/dim.ff.rd
man/length.ff.rd
man/matcomb.rd
man/CFUN.rd
man/undim.rd
man/vw.rd
man/filename.rd
man/names.ff.rd
man/Extract.ffdf.rd
man/chunk.bit.rd
man/update.ff.rd
man/is.ff.rd
man/Forbidden_ffdf.rd
man/vmode.ffdf.rd
man/length.ffdf.rd
man/vectorIndex2arrayIndex.rd
man/physical.ffdf.rd
man/chunk.ffdf.rd
man/levels.ff.rd
man/bigsample.rd
man/swap.rd
man/dummy.dimnames.rd
man/ffdfindexget.rd
man/is.sorted.rd
man/maxlength.rd
man/file.resize.rd
man/vecprint.rd
man/symmIndex2vectorIndex.rd
man/na.count.rd
man/ramsort.default.rd
man/dimnames.ffdf.rd
man/as.hi.rd
man/pagesize.rd
man/ffload.rd
man/ffapply.rd
man/finalize.rd
man/clone.rd
man/mismatch.rd
man/read.table.ffdf.rd
man/delete.rd
man/LimWarn.rd
man/unsort.rd
man/vt.rd
man/arrayIndex2vectorIndex.rd
man/ffdrop.rd
man/getset.ff.rd
man/open.ff.rd
man/ffindexorder.rd
man/is.ffdf.rd
man/as.ffdf.rd
man/ffsuitable.rd
man/ram2ffcode.rd
man/clone.ffdf.rd
man/repnam.rd
man/vector2array.rd
man/maxffmode.rd
man/dimnames.ff_array.rd
man/unclass_-.rd
man/ffreturn.rd
man/fforder.rd
LICENSE
ff documentation built on May 19, 2017, 4:08 p.m.
