README.md

UFO R Vectors

This package contains a collection of example implementations of lazily generated, larger-than-memory vectors for R using the R API of the UFO framework.

UFO R vectors are implemented with the Userfault Object (UFO) framework. These vectors are indistinguishable from plain old R vectors and can be used with existing R code. Moreover, unlike other larger-than-memory frameworks for R, which use S3/S4 objects to mimic vectors, UFO R vectors are also internally indistinguishable from ordinary R objects, making UFOs usable from within existing C and C++ code.

UFO R vectors are generated lazily: when an element of the vector is accessed, the UFO framework generates data for the vector using a populate function. This function can read the data from an existing source, like a CSV file, a binary file, or network storage, or generate the data on the fly, like a sequence.
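The idea of a populate function that generates a sequence on the fly can be sketched in a few lines of shell. This is a conceptual illustration only: populate_sequence and its four parameters are hypothetical and do not reflect the actual UFO API or its signatures. The point is that only the requested slice of the sequence is ever computed.

```shell
#!/bin/sh
# Conceptual sketch, not the UFO API: produce `count` elements of the
# arithmetic sequence from, from+by, from+2*by, ... starting at the
# zero-based position `first`. Nothing outside that slice is computed.
populate_sequence() {
    from=$1
    by=$2
    first=$3
    count=$4
    i=0
    while [ "$i" -lt "$count" ]; do
        echo $((from + (first + i) * by))
        i=$((i + 1))
    done
}

# Elements 3..6 of the sequence 1, 3, 5, 7, ...
populate_sequence 1 2 3 4
```

The call above prints 7, 9, 11 and 13, one per line, without generating the elements before position 3.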

In addition, UFO vectors can be larger than memory. The vectors are internally split into chunks. When an element of the vector is accessed, only the chunk around that position is loaded into memory. When the loaded chunks of all UFO vectors take up too much room in process memory, the UFO framework unloads the oldest chunks to reclaim memory. This is invisible to the programmer using the vectors, who can therefore operate on them as if they fit into memory.
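Finding the chunk that holds a given element comes down to a floor division, which can be sketched as follows. The function name chunk_for_index and the chunk size of 1000 elements are assumptions made for this example; they are not values or names used by the UFO framework.

```shell
#!/bin/sh
# Illustrative only: map a zero-based element index to the chunk holding it.
# chunk_for_index and the chunk size passed to it are hypothetical.
chunk_for_index() {
    index=$1
    chunk_size=$2
    chunk=$((index / chunk_size))       # floor division for non-negative values
    first=$((chunk * chunk_size))       # first element index in that chunk
    last=$((first + chunk_size - 1))    # last element index in that chunk
    echo "chunk=$chunk elements=$first..$last"
}

chunk_for_index 2500 1000
```

With these assumed values, accessing element 2500 would load only elements 2000 through 2999. The sketch ignores the final chunk of a vector, which may be shorter than the others.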

This package provides vector implementations for the following vectors:

Additionally, the package provides operators and helper functions for UFO vectors:

Warning: UFOs are under active development. Some bugs are to be expected, and some features are not yet fully implemented.

System requirements

Prerequisites

Check if your operating system restricts who can call userfaultfd:

cat /proc/sys/vm/unprivileged_userfaultfd

A value of 0 means only privileged users can call userfaultfd, so UFOs will only work for privileged users. To allow unprivileged users to call userfaultfd, run the following as root:

sysctl -w vm.unprivileged_userfaultfd=1
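The check above can be wrapped in a small helper that also copes with a missing knob file (kernels that predate the setting do not expose it). This is a minimal sketch: describe_userfaultfd is a hypothetical name, and its optional path argument exists only so that the function can be pointed at a different file.

```shell
#!/bin/sh
# Reports whether unprivileged userfaultfd calls are allowed.
# describe_userfaultfd is a hypothetical helper, not part of this package.
describe_userfaultfd() {
    knob=${1:-/proc/sys/vm/unprivileged_userfaultfd}
    if [ ! -r "$knob" ]; then
        # No such file: the kernel does not expose this setting.
        echo "knob not found: $knob"
        return 0
    fi
    value=$(cat "$knob")
    if [ "$value" = "1" ]; then
        echo "unprivileged userfaultfd: allowed"
    else
        echo "unprivileged userfaultfd: restricted (value $value)"
    fi
}

describe_userfaultfd
```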

PostgreSQL support:

libpq-dev

SQLite support:

libsqlite3-dev

Building

Before building, retrieve the submodule code:

git submodule update --init --recursive

To update the submodule, pull it:

cd submodules/ufo_r && git pull origin main && cd ../..

Install the package with R. This compiles and installs the package:

R CMD INSTALL --preclean .

You can also build the project with debug symbols by setting (exporting) the UFO_DEBUG environment variable to 1:

UFO_DEBUG=1 R CMD INSTALL --preclean .
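The build steps above can be chained in one script. The following is a sketch under stated assumptions: run, build_ufos and the DRY_RUN switch are hypothetical helpers introduced here for illustration, and only the git and R commands come from this README. With DRY_RUN=1 the commands are printed instead of executed.

```shell
#!/bin/sh
# Hypothetical wrapper around the build steps from this README.
# With DRY_RUN=1, commands are printed rather than executed.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

build_ufos() {
    # Fetch the submodule code; stop if that fails.
    run git submodule update --init --recursive || return 1
    # Export UFO_DEBUG so the processes started by R inherit it.
    if [ "${UFO_DEBUG:-0}" = "1" ]; then
        export UFO_DEBUG
    fi
    run R CMD INSTALL --preclean .
}

DRY_RUN=1 build_ufos
```

Run from the package root without DRY_RUN to perform the actual build.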

Testing

Check the package and display all errors:

_R_CHECK_TESTS_NLINES_=0 R CMD check .

Troubleshooting

Operation not permitted

syscall/userfaultfd: Operation not permitted
error initializing User-Fault file descriptor: Invalid argument
Error: package or namespace load failed for ‘ufos’:
 .onLoad failed in loadNamespace() for 'ufos', details:
  call: .initialize()
  error: Error initializing the UFO framework (-1)

The user has insufficient privileges to execute a userfaultfd system call.

A likely culprit is the global sysctl knob vm.unprivileged_userfaultfd, which controls whether unprivileged users are allowed to call userfaultfd. If /proc/sys/vm/unprivileged_userfaultfd is 0, run as root:

sysctl -w vm.unprivileged_userfaultfd=1
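A value set with sysctl -w is lost on reboot. On distributions that read drop-in files from /etc/sysctl.d, the setting can be made persistent along the following lines. This is a sketch, not a tested recipe for every system: write_userfaultfd_conf is a hypothetical helper and the file name 99-userfaultfd.conf is an arbitrary choice.

```shell
#!/bin/sh
# Writes a sysctl drop-in that enables unprivileged userfaultfd at boot.
# The helper name and the default file name are assumptions for this sketch.
write_userfaultfd_conf() {
    conf=${1:-/etc/sysctl.d/99-userfaultfd.conf}
    printf 'vm.unprivileged_userfaultfd = 1\n' > "$conf"
}

# As root:
#   write_userfaultfd_conf
#   sysctl --system    # reload all sysctl configuration files
```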

Div_floor

error[E0599]: no method named `div_floor` found for type `usize` in the current scope
   --> /home/kondziu/.cargo/git/checkouts/ufo-core-6fe53746510c8ee1/853284b/src/ufo_objects.rs:193:47
    |
193 |         let chunk_number = offset_from_header.div_floor(bytes_loaded_at_once);
    |                                               ^^^^^^^^^ method not found in `usize`
    |
   ::: /home/kondziu/.cargo/registry/src/github.com-1ecc6299db9ec823/num-integer-0.1.44/src/lib.rs:54:8
    |
54  |     fn div_floor(&self, other: &Self) -> Self;
    |        --------- the method is available for `usize` here
    |
    = help: items from traits can only be used if the trait is in scope
help: the following trait is implemented but not in scope; perhaps add a `use` for it:
    |
1   | use num::Integer;
    |

You are using an older version of the Rust compiler. Consider upgrading:

rustup upgrade


ufo-org/ufo-r-vectors documentation built on Oct. 2, 2022, 11:09 p.m.