chunk.ffdf: Chunk ff_vector and ffdf

View source: R/ffdf.R

chunk.ffdf: R Documentation

Description

Row-wise chunking method for ff_vector and ffdf objects. It automatically accounts for RAM requirements by using a record size calculated as sum(.rambytes[vmode]).

Usage

## S3 method for class 'ff_vector'
chunk(x, RECORDBYTES = .rambytes[vmode(x)],
  BATCHBYTES = getOption("ffbatchbytes"), ...)
## S3 method for class 'ffdf'
chunk(x, RECORDBYTES = sum(.rambytes[vmode(x)]),
  BATCHBYTES = getOption("ffbatchbytes"), ...)

Arguments

x

an ff_vector or ffdf object

RECORDBYTES

optional integer scalar giving the number of bytes needed to process a single element of the ff_vector or a single row of the ffdf

BATCHBYTES

integer scalar limiting the number of bytes to be processed in one chunk; defaults to getOption("ffbatchbytes"). See also .rambytes

...

further arguments passed to chunk
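
The relationship between RECORDBYTES and BATCHBYTES can be sketched in base R without ff. This is only an illustration under stated assumptions: sketch_chunks is a hypothetical helper, not part of the package, it uses plain sequential splitting (the package may balance chunk sizes differently), and the record size of 16 bytes assumes 8 bytes for a double column and 4 bytes for each of two integer-coded columns.

```r
# Hypothetical helper, not part of ff: split rows 1..n into from/to ranges
# so that each range needs at most batch_bytes at record_bytes per row.
sketch_chunks <- function(n, record_bytes, batch_bytes) {
  rows_per_chunk <- batch_bytes %/% record_bytes  # whole rows that fit in one batch
  stopifnot(rows_per_chunk >= 1)
  from <- seq(1, n, by = rows_per_chunk)
  to <- pmin(from + rows_per_chunk - 1, n)        # last range is cut off at n
  Map(function(lo, hi) c(from = lo, to = hi), from, to)
}

# Assumed record size: one double (8 bytes) plus two integer-coded columns (4 bytes each)
record_bytes <- 8 + 4 + 4
chunks <- sketch_chunks(26, record_bytes, 300)

# The number of ranges agrees with the formula used in the Examples section
length(chunks) == ceiling(26 / (300 %/% record_bytes))
```

A smaller BATCHBYTES gives fewer rows per chunk and therefore more chunks. A wider record has the same effect.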

Value

A list of ri indexes, each representing one chunk
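
The returned list is typically consumed in a loop, processing one range at a time and combining the partial results. The following base-R sketch shows that pattern. It makes two assumptions: plain from/to pairs stand in for the ri objects, and an ordinary data frame (iris) stands in for an ffdf.

```r
# Stand-in for the list returned by chunk(): plain from/to pairs.
# This is an assumption for illustration; real ri objects come from ff/bit.
ranges <- list(c(1, 50), c(51, 100), c(101, 150))

total <- 0
n <- 0
for (r in ranges) {
  rows <- iris[r[1]:r[2], , drop = FALSE]  # only this slice is handled at once
  total <- total + sum(rows$Sepal.Length)  # accumulate the partial sum
  n <- n + nrow(rows)                      # accumulate the row count
}
chunk_mean <- total / n

# The chunk-wise result agrees with the mean computed in one pass
all.equal(chunk_mean, mean(iris$Sepal.Length))
```

Only statistics that can be accumulated across chunks (sums, counts, updatable model fits) fit this pattern directly.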

Author(s)

Jens Oehlschlägel

See Also

chunk, ffdf

Examples

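  library(ff)

  # 26-row data frame: one double column and two factor columns
  x <- data.frame(x=as.double(1:26), y=factor(letters), z=ordered(LETTERS), stringsAsFactors = TRUE)
  a <- as.ffdf(x)
  # expected number of chunks: rows divided by the rows that fit into 300 bytes
  ceiling(26 / (300 %/% sum(.rambytes[vmode(a)])))
  chunk(a, BATCHBYTES=300)
  # same calculation for rows 1 to 13 with a 100-byte limit
  ceiling(13 / (100 %/% sum(.rambytes[vmode(a)])))
  chunk(a, from=1, to = 13, BATCHBYTES=100)
  rm(a); gc()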

  message("dummy example for linear regression with biglm on ffdf")
  library(biglm)

  message("NOTE that . in formula requires calculating terms manually
    because . as a data-dependent term is not allowed in biglm")
  form <- Sepal.Length ~ Sepal.Width + Petal.Length + Petal.Width + Species

  lmfit <- lm(form, data=iris)

  firis <- as.ffdf(iris)
  # fit biglm on the first 50-row chunk, then update the fit with each later chunk
  for (i in chunk(firis, by=50)){
    if (i[1]==1){
      message("first chunk is: ", i[[1]],":",i[[2]])
      biglmfit <- biglm(form, data=firis[i,,drop=FALSE])
    }else{
      message("next chunk is: ", i[[1]],":",i[[2]])
      biglmfit <- update(biglmfit, firis[i,,drop=FALSE])
    }
  }

  summary(lmfit)
  summary(biglmfit)
  stopifnot(all.equal(coef(lmfit), coef(biglmfit)))

ff documentation built on Sept. 30, 2024, 9:38 a.m.
