Read a file chunk by chunk
make.readchunk(input, FUN = identity, chunksize = 5000L)
input: a character string of length 1. See Details.
FUN: any function to apply to each chunk.
chunksize: number of lines in each chunk.
It creates a function that reads successive chunks of the data referenced by input using the fread function. The accepted forms of input are described in the help page of fread. The data referenced by input should not have a header.
This function is inspired by the bigglm example.
A function with a logical argument, reset. If this argument is TRUE, it indicates that the data should be reread from the beginning by subsequent calls. When it has read all the data, it automatically resets the file. This function returns the value of FUN applied to the chunk. By default, the chunk is returned as a tbl_df object.
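The closure behaviour described above can be sketched in base R. This is only an illustration under stated assumptions, not the package's implementation: the name make_chunk_reader is hypothetical, it reads with readLines and read.csv instead of fread, it returns a plain data.frame rather than a tbl_df, and it assumes that a call with reset = TRUE only rewinds the file and that NULL is returned once the data is exhausted (the convention the loop in the Examples relies on).

```r
## Hypothetical helper, not part of the package: a base-R sketch of a
## chunk reader implemented as a closure over a file connection.
make_chunk_reader <- function(path, FUN = identity, chunksize = 5000L) {
  con <- NULL  # connection kept alive between calls; NULL means "not open"

  function(reset = FALSE) {
    if (reset) {
      ## Assumption: a reset call only rewinds; later calls start from the top.
      if (!is.null(con)) {
        close(con)
        con <<- NULL
      }
      return(invisible(NULL))
    }

    if (is.null(con)) con <<- file(path, open = "r")

    lines <- readLines(con, n = chunksize)
    if (length(lines) == 0L) {
      ## No data left: reset automatically and signal the end with NULL.
      close(con)
      con <<- NULL
      return(NULL)
    }

    ## Parse the raw lines as headerless CSV and hand the chunk to FUN.
    chunk <- read.csv(text = lines, header = FALSE)
    FUN(chunk)
  }
}

## Small usage example on a temporary headerless file with 7 rows.
tmp <- tempfile(fileext = ".csv")
write.table(data.frame(x = 1:7, y = 8:14), file = tmp, sep = ",",
            row.names = FALSE, col.names = FALSE)
reader <- make_chunk_reader(tmp, FUN = NROW, chunksize = 3L)
total <- 0L
while (!is.null(n <- reader())) total <- total + n
total
```

Keeping the connection inside the closure is what lets each call continue where the previous one stopped; closing it at end of data is what makes a second pass start from the first line again.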
## Not run:
library(hflights)
nrow(hflights) # Number of rows

## We create a file with no header
input <- "hflights.csv"
write.table(hflights, file = input, sep = ",",
            row.names = FALSE, col.names = FALSE)

## Get the number of rows of each chunk and add them up
readchunk <- make.readchunk(input, FUN = function(x) { NROW(x) })
a <- NULL
while (!is.null(b <- readchunk())) {
  if (is.null(a)) {
    a <- b
  } else {
    a <- a + b
  }
}
all.equal(a, nrow(hflights))

## The file is reset automatically, so a second pass gives the same total
a <- NULL
while (!is.null(b <- readchunk())) {
  if (is.null(a)) {
    a <- b
  } else {
    a <- a + b
  }
}
all.equal(a, nrow(hflights))
## End(Not run)