The function will read approximately p*nlines lines of a flat text
file. So if p=.1, then we will get roughly (probably not exactly)
10% of the file's lines. The sampled lines are read into R via
readLines().
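To see why the result is "roughly" rather than exactly p*nlines, here is a small sketch. It assumes that each line is kept independently with probability p, which is how the Details describe the scan but is not confirmed against the package source. Under that assumption the number of sampled lines follows a binomial distribution. The values of nlines and p are made up for illustration.

```r
# Assumption: each line is retained independently with probability p,
# so the sampled line count is Binomial(nlines, p).
nlines <- 1000   # hypothetical number of lines in the input file
p <- 0.1         # proportion to retain

expected_count <- nlines * p                    # mean of the binomial
sd_count <- sqrt(nlines * p * (1 - p))          # its standard deviation

# Simulate the line count from five independent sampling runs.
set.seed(42)
counts <- rbinom(5, size = nlines, prob = p)

print(expected_count)
print(sd_count)
print(counts)
```

The simulated counts scatter around the expected value, which is the same kind of run-to-run variation you should expect in the length of the returned vector.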
sample_lines(file, n = -1L, p = 0.1, nskip = 0, nmax = 0, verbose = FALSE, ...)
file | Location of the file (as a string) to be subsampled.
n | As in readLines().
p | Proportion to retain; should be a numeric value between 0 and 1.
nskip | Number of lines to skip.
nmax | Max number of lines to read. If nmax==0, then there is no read cap.
verbose | Logical; indicates whether the line count of the input file and the number of lines sampled should be printed.
... | Additional arguments passed to readLines().
This function scans over the text of the input file and, at each step, randomly
chooses whether or not to include the current line in a downsampled file.
Each selected line is placed in a temporary file before being read into R
via readLines(). Additional arguments to this function (those other
than file, p, and verbose) are passed to readLines(),
so if their behavior is unclear, you should examine the readLines()
help file.
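The scan described above can be sketched in plain R. This is a minimal illustration, not the package's implementation: sample_lines_sketch(), its path argument, and chunk_size are hypothetical names, and the sketch collects the kept lines in memory instead of writing them to a temporary file first.

```r
# Hypothetical helper, not part of filesampler: keep each line of a text
# file independently with probability p.
sample_lines_sketch <- function(path, p = 0.1, chunk_size = 10000L) {
  stopifnot(is.numeric(p), length(p) == 1L, p >= 0, p <= 1)

  con <- file(path, open = "r")
  on.exit(close(con))

  kept <- character(0)
  repeat {
    # Read the file in chunks so the whole input is never held at once.
    chunk <- readLines(con, n = chunk_size, warn = FALSE)
    if (length(chunk) == 0L) break

    # One uniform draw per line; a line is kept when its draw falls below p.
    kept <- c(kept, chunk[runif(length(chunk)) < p])
  }
  kept
}

# Usage: sample about half of a small temporary file.
tmp <- tempfile()
writeLines(as.character(1:20), tmp)
print(sample_lines_sketch(tmp, p = 0.5))
```

Because every line gets its own draw, the kept lines stay in file order and their number varies from run to run.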
If verbose=TRUE, then something like:
Read 12207 lines (0.001%) of 12174948 line file.
will be printed to the terminal. The header, if there is one, is counted both among the lines read and in the file's total line count.
A character vector, as with readLines().
library(filesampler)
file = system.file("rawdata/small.csv", package="filesampler")
sample_lines(file, p=.05)