Demo of the bit package

knitr::opts_chunk$set(collapse = TRUE, comment = "#>")
require(bit)
.ff.version <- try(packageVersion("ff"), silent = TRUE)
.ff.is.available <- !inherits(.ff.version, "try-error") && .ff.version >= "4.0.0" && require(ff)
#tools::buildVignette("vignettes/bit-demo.Rmd")
#devtools::build_vignettes()

bit type

Create a huge boolean vector (no NAs allowed)

n <- 1e8
b1 <- bit(n)
b1

It costs only one bit per element (object.size reports bytes, so expect roughly 0.125 bytes per element)

object.size(b1)/n
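To put that figure in context, the following sketch compares a bit vector with a base R logical vector of the same length. The names n_small, l_small and b_small are hypothetical and the smaller length is chosen only to keep the example quick; exact sizes may vary slightly with R and package versions.

```r
library(bit)

n_small <- 1e6                 # hypothetical size, smaller than the demo's n

l_small <- logical(n_small)    # base R logical: 4 bytes per element
b_small <- bit(n_small)        # bit vector: elements packed into integer words

# object.size() reports bytes; dividing by the length gives bytes per element
bytes_logical <- as.numeric(object.size(l_small)) / n_small
bytes_bit     <- as.numeric(object.size(b_small)) / n_small

c(logical = bytes_logical, bit = bytes_bit)
```

The logical vector should come out at about 4 bytes per element and the bit vector at about one eighth of a byte, plus a small fixed overhead for attributes.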

A couple of standard methods work

b1[10:30] <- TRUE
summary(b1)

Create another boolean vector with TRUE in some different positions

b2 <- bit(n)
b2[20:40] <- TRUE
b2

fast boolean operations

b1 & b2

and summarizing the result is fast as well

summary(b1 & b2)

bitwhich type

Since we have a very skewed distribution we may coerce to an even sparser representation

w1 <- as.bitwhich(b1) 
w2 <- as.bitwhich(b2)
object.size(w1)/n

and everything

w1 & w2

works as expected

summary(w1 & w2)

even mixing

summary(b1 & w2)
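As a rough illustration of why the bitwhich representation pays off for skewed data, the sketch below compares the total size of a sparse bit vector with its bitwhich counterpart. The names n_small, b_sparse and w_sparse are hypothetical, and the reduced length is an assumption made to keep the example fast.

```r
library(bit)

n_small <- 1e6                      # hypothetical size, smaller than the demo's n

b_sparse <- bit(n_small)
b_sparse[10:30] <- TRUE             # only 21 of 1e6 elements are TRUE

w_sparse <- as.bitwhich(b_sparse)   # keeps only the positions of the rare value

# Total bytes used by each representation
size_bit      <- as.numeric(object.size(b_sparse))
size_bitwhich <- as.numeric(object.size(w_sparse))

c(bit = size_bit, bitwhich = size_bitwhich)

# Both representations agree on the number of TRUE values
c(bit = sum(b_sparse), bitwhich = sum(w_sparse))
```

The size of the bit vector grows with the vector length, while the bitwhich size grows with the number of rare values, so the gap widens as the data become more skewed.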

processing chunks

Many bit functions support a range restriction,

summary(b1, range=c(1,1000))

which is useful

as.which(b1, range=c(1, 1000))

for filtered chunked looping

lapply(chunk(from=1, to=n, length.out=10), function(i)as.which(b1, range=i))

over large ff vectors

options(ffbatchbytes=1024^3)
x <- ff(vmode="single", length=n)
x[1:1000] <- runif(1000)
lapply(chunk(x, length.out = 10), function(i)sum(x[as.hi(b1, range=i)]))
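The chunked filtering pattern can also be tried without ff. The sketch below is a minimal version on a small in-memory bit vector; the names n_small, b_small, ranges and hits, the chosen positions, and the number of chunks are all hypothetical. It only checks that the per-chunk results together account for every TRUE element.

```r
library(bit)

n_small <- 1000L
b_small <- bit(n_small)
b_small[c(5, 250, 990)] <- TRUE     # three selected elements

# Split 1..n_small into range indices, one per chunk
ranges <- chunk(from = 1, to = n_small, length.out = 4)

# For each chunk, collect the selected positions that fall inside its range
hits <- lapply(ranges, function(i) as.which(b_small, range = i))

# The chunks partition the full range, so the per-chunk counts add up
total_hits <- sum(lengths(hits))
total_hits
```

Because each chunk touches only its own range, peak memory use depends on the chunk size rather than on the full vector length, which is the reason for chunking large ff vectors.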

and wrap-up

delete(x)
rm(x, b1, b2, w1, w2, n)

For more information, see the usage vignette.




bit documentation built on Nov. 16, 2022, 1:12 a.m.