disksort | R Documentation |
This function is designed to handle files larger than memory. At most nrows rows are held in memory at once. It does not run in parallel.
For this to work efficiently, the data between consecutive breaks must fit into memory.
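The memory bound described above can be illustrated with a small base R sketch that reads a file in chunks of at most nrows lines through an open connection. This is not the package's implementation; `for_each_chunk` is a hypothetical helper, and it assumes one record per line (no newlines embedded in quoted fields).

```r
# Hypothetical helper, not part of the package: apply `fun` to successive
# chunks of at most `nrows` rows, so only one chunk is in memory at a time.
for_each_chunk <- function(infile, fun, nrows = 1000L, ...) {
  con <- file(infile, open = "r")
  on.exit(close(con))
  repeat {
    # The open connection remembers its position between calls.
    lines <- readLines(con, n = nrows)
    if (length(lines) == 0L) break
    # Extra arguments in `...` are passed on to read.table.
    fun(read.table(text = lines, ...))
  }
  invisible(NULL)
}

# Usage: write 25 rows to a temporary file and record each chunk's size.
tmp <- tempfile()
write.table(data.frame(x = 25:1, y = letters[1:25]), tmp,
            row.names = FALSE, col.names = FALSE)
sizes <- integer()
for_each_chunk(tmp, function(chunk) sizes <<- c(sizes, nrow(chunk)), nrows = 10L)
sizes
```

No chunk passed to `fun` has more than `nrows` rows, which is the property the description relies on.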
disksort(infile, outfile = NULL, sortcolumn = 1L, breaks = NULL, nrows = 1000L, nbins = 10L, read.table.args = NULL, write.table.args = NULL, cleanup = TRUE)
streambin(infile, firstchunk, sortcolumn = 1L, breaks = NULL, nrows = 1000L, read.table.args = NULL)
infile | unsorted file-like object to read from. See |
outfile | where to write the sorted file. See |
sortcolumn | which column of the data frame to sort on |
breaks | vector giving points to split data for binning |
nrows | number of rows in the data.frame held in memory |
nbins | number of bins for bin sort. Ignored if |
read.table.args | named list of extra arguments to read.table |
write.table.args | named list of extra arguments to write.table. Defaults to using read.table.args to preserve the original formatting. |
cleanup | remove intermediate files? |
firstchunk | first rows from |
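To make the relationship between breaks, nbins, and a first chunk of rows concrete, the sketch below picks break points from the quantiles of a sample of the sort column. Whether the package derives breaks this way is an assumption, not something this page states; `breaks_from_sample` is a hypothetical helper.

```r
# Hypothetical helper, not part of the package: choose up to nbins - 1
# interior break points so that bins hold roughly equal numbers of rows.
breaks_from_sample <- function(x, nbins = 10L) {
  probs <- seq(0, 1, length.out = nbins + 1L)
  probs <- probs[-c(1L, length(probs))]  # drop 0 and 1, keep interior points
  # unique() guards against repeated breaks when the sample has many ties.
  unique(quantile(x, probs = probs, names = FALSE))
}

# Usage: a stand-in for the first rows of the input file.
firstchunk <- data.frame(key = c(5, 1, 9, 3, 7, 2, 8, 4, 6, 10))
breaks <- breaks_from_sample(firstchunk$key, nbins = 5L)

# findInterval() maps each key to a bin index from 0 to length(breaks).
table(findInterval(firstchunk$key, breaks))
```

Quantile-based breaks only balance the bins well if the first rows are representative of the whole file; a file that is already partly ordered would need breaks supplied explicitly.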
streambin: Stream File Into Bins
Read a data frame, split it into bins, and write to those bins on disk.
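The binning step can be sketched in base R as follows. This is a simplified illustration rather than the package's code: `stream_bins_sketch` is a hypothetical function that takes chunks already in memory instead of reading them from infile, and the bin file names are invented for the example.

```r
# Hypothetical sketch, not part of the package: append each chunk's rows
# to one file per bin, with bins defined by sorted `breaks`.
stream_bins_sketch <- function(chunks, breaks, sortcolumn = 1L, dir = tempdir()) {
  nb <- length(breaks) + 1L
  binfiles <- file.path(dir, sprintf("bin_%03d.txt", seq_len(nb)))
  unlink(binfiles)  # start from empty bins
  for (chunk in chunks) {
    # Bin index in 1..nb for every row of this chunk.
    bin <- findInterval(chunk[[sortcolumn]], breaks) + 1L
    pieces <- split(chunk, factor(bin, levels = seq_len(nb)))
    for (b in which(vapply(pieces, nrow, integer(1)) > 0L)) {
      write.table(pieces[[b]], binfiles[b], append = TRUE,
                  row.names = FALSE, col.names = FALSE)
    }
  }
  binfiles
}

# Usage: two small chunks stand in for pieces read from a large file.
d <- data.frame(key = c(42, 7, 19, 88, 3, 61),
                val = c("a", "b", "c", "d", "e", "f"))
chunks <- split(d, rep(1:2, each = 3))
bindir <- tempfile("bins")
dir.create(bindir)
binfiles <- stream_bins_sketch(chunks, breaks = c(20, 60), dir = bindir)

# Every key in an earlier bin is smaller than every key in a later bin,
# so sorting each bin on its own and concatenating gives a full sort.
sorted <- do.call(rbind, lapply(binfiles[file.exists(binfiles)], function(f) {
  b <- read.table(f)
  b[order(b[[1L]]), , drop = FALSE]
}))
sorted[[1L]]
```

Only one bin is read back at a time in the final step, which is why the data between consecutive breaks needs to fit into memory.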