The function will read approximately p*nlines lines of a flat text file. So if p=.1, then we will get roughly (probably not exactly) 10% of the lines, read in as with readLines().
sample_lines(file, n = -1L, p = 0.1, nskip = 0, nmax = 0, verbose = FALSE, ...)
file | Location of the file (as a string) to be subsampled.
n | As in readLines().
p | Proportion to retain; should be a numeric value between 0 and 1.
nskip | Number of lines to skip.
nmax | Max number of lines to read. If nmax==0, then there is no read cap.
verbose | Logical; indicates whether or not line counts of the input file and the number of lines sampled should be printed.
... | Additional arguments passed to readLines().
This function scans over the text of the input file and, at each step, randomly chooses whether or not to include the current line in a downsampled file. Each selected line is placed in a temporary file before being read into R via readLines(). Additional arguments to this function (those other than file, p, and verbose) are passed to readLines(), so if their behavior is unclear, you should examine the readLines() help file.
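The sampling scheme described above can be sketched in plain R. This is only an illustration of the idea (an independent keep-or-drop decision per line, a temporary file, then readLines()), not the package's actual implementation. The name sample_lines_sketch and its chunk argument are hypothetical and do not exist in filesampler.

```r
# Hypothetical helper, NOT part of filesampler: keeps each line of `path`
# independently with probability `p`, stages the kept lines in a temporary
# file, and reads that file back with readLines().
sample_lines_sketch <- function(path, p = 0.1, chunk = 10000L) {
  stopifnot(is.numeric(p), length(p) == 1L, p >= 0, p <= 1)

  con <- file(path, open = "r")
  on.exit(close(con), add = TRUE)

  tmp <- tempfile()
  on.exit(unlink(tmp), add = TRUE)
  out <- file(tmp, open = "w")

  repeat {
    # Read the input in chunks so the whole file never sits in memory.
    lines <- readLines(con, n = chunk, warn = FALSE)
    if (length(lines) == 0L) break
    # One uniform draw per line; a line is kept when its draw falls below p.
    keep <- runif(length(lines)) < p
    writeLines(lines[keep], out)
  }
  close(out)

  readLines(tmp)
}

# Usage: sample about 10% of a 1000-line file. The count varies from run to run.
path <- tempfile()
writeLines(as.character(1:1000), path)
length(sample_lines_sketch(path, p = 0.1))
```

Because every line is kept or dropped independently, the number of lines returned follows a binomial distribution with mean p*nlines, which is why the result is roughly, but probably not exactly, the requested proportion.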
If verbose=TRUE, then something like:
Read 12207 lines (0.001%) of 12174948 line file.
will be printed to the terminal. This counts the header (if there is one) as one of the lines read and as one of the lines possible.
A character vector, as with readLines().
library(filesampler)
file = system.file("rawdata/small.csv", package="filesampler")
sample_lines(file, p=.05)