Reads a text file in table format and creates a distributed data frame from it, with cases corresponding to lines and variables to fields in the file.
## S3 method for class 'table'
drRead(file, header = FALSE, sep = "", quote = "\"'", dec = ".",
  skip = 0, fill = !blank.lines.skip, blank.lines.skip = TRUE, comment.char = "#",
  allowEscapes = FALSE, encoding = "unknown", autoColClasses = TRUE,
  rowsPerBlock = 50000, postTransFn = identity, output = NULL, overwrite = FALSE,
  params = NULL, packages = NULL, control = NULL, ...)

## S3 method for class 'csv'
drRead(file, header = TRUE, sep = ",",
  quote = "\"", dec = ".", fill = TRUE, comment.char = "", ...)

## S3 method for class 'csv2'
drRead(file, header = TRUE, sep = ";",
  quote = "\"", dec = ",", fill = TRUE, comment.char = "", ...)

## S3 method for class 'delim'
drRead(file, header = TRUE, sep = "\t",
  quote = "\"", dec = ".", fill = TRUE, comment.char = "", ...)

## S3 method for class 'delim2'
drRead(file, header = TRUE, sep = "\t",
  quote = "\"", dec = ",", fill = TRUE, comment.char = "", ...)
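As the signatures above show, the csv, csv2, delim and delim2 variants differ in their defaults for sep and dec (comma or semicolon or tab as separator, period or comma as decimal mark). These defaults match those of base R's read.csv, read.csv2, read.delim and read.delim2. The sketch below does not call drRead; it applies the csv2-style defaults with base R's read.table to a made-up sample string, only to illustrate what sep = ";" and dec = "," do.

```r
# Hypothetical sample in "csv2" style: semicolon-separated, comma as decimal mark
txt <- "id;value\n1;3,5\n2;4,25\n"

# Same defaults as the 'csv2' signature, applied with base read.table
x <- read.table(text = txt, header = TRUE, sep = ";", quote = "\"",
                dec = ",", fill = TRUE, comment.char = "")

# 'value' is parsed as numeric because dec = "," marks the decimal point
str(x)
```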
file |
input text file - can be either a character string pointing to a file on local disk, or an |
header |
this and the other parameters below are passed to |
sep |
see |
quote |
see |
dec |
see |
skip |
see |
fill |
see |
blank.lines.skip |
see |
comment.char |
see |
allowEscapes |
see |
encoding |
see |
autoColClasses |
should column classes be determined automatically by reading in a sample? This can sometimes be problematic because of the strange ways R handles quotes in |
rowsPerBlock |
how many rows of the input file should make up a block (key-value pair) of output? |
postTransFn |
a function to be applied after a block is read in, to provide any additional processing before the block is stored |
output |
a "kvConnection" object indicating where the output data should reside. Must be a |
overwrite |
logical; should existing output location be overwritten? (also can specify |
params |
a named list of objects external to the input data that are needed in |
packages |
a vector of R package names that contain functions used in |
control |
parameters specifying how the backend should handle things (most likely parameters to |
... |
see |
an object of class "ddf"
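A minimal usage sketch follows. It assumes that this page documents the datadr package, that the csv method is called as drRead.csv, and that localDiskConn() is available to create the output connection; none of these names are stated on this page, so treat them as assumptions. The directory name "irisText" and the block size of 10 are arbitrary choices for illustration.

```r
# Write a small CSV file to a temporary location
csvFile <- file.path(tempdir(), "iris.csv")
write.csv(iris, file = csvFile, row.names = FALSE, quote = FALSE)

# Only run the read if the (assumed) datadr package is installed
if (requireNamespace("datadr", quietly = TRUE)) {
  library(datadr)
  # Local-disk connection for the output; the directory name is arbitrary
  irisConn <- localDiskConn(file.path(tempdir(), "irisText"), autoYes = TRUE)
  # Read the file in blocks of 10 rows; each block becomes one key-value pair
  irisDdf <- drRead.csv(csvFile, output = irisConn, rowsPerBlock = 10)
  print(irisDdf)
}
```

A small rowsPerBlock is used here only so that a tiny file produces several blocks; for real data the default of 50000 is a more sensible starting point.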
For local disk, the file is actually read in sequentially instead of in parallel. This is because of possible performance issues when trying to read from the same disk in parallel.
Note that if skip is positive and/or header is TRUE, the skipped lines and the header line are read first, since they occur only once in the data. Each block is then checked for these lines, and they are removed if they appear.
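The header handling described above can be sketched in base R as follows. This is an illustration of the idea only, not the package's actual implementation: the helper read_block, the temporary file and the block size of 3 are all hypothetical.

```r
# Illustrative helper (not part of the package): parse one block of raw lines
read_block <- function(lines, header_line, sep = ",") {
  # Drop the header line if this block happens to contain it
  lines <- lines[lines != header_line]
  col_names <- strsplit(header_line, sep, fixed = TRUE)[[1]]
  read.table(text = lines, sep = sep, col.names = col_names,
             header = FALSE, stringsAsFactors = FALSE)
}

# Small example file: one header line followed by 7 data rows
path <- tempfile(fileext = ".csv")
write.csv(head(mtcars[, 1:3], 7), path, row.names = FALSE, quote = FALSE)

all_lines <- readLines(path)
header_line <- all_lines[1]   # read once, up front
rows_per_block <- 3

# Split the raw lines (header included) into consecutive blocks
block_id <- ceiling(seq_along(all_lines) / rows_per_block)
blocks <- lapply(split(all_lines, block_id), read_block,
                 header_line = header_line)

# Number of data rows that ended up in each block
sapply(blocks, nrow)
```

Because the header is matched by exact string comparison, a data row identical to the header line would also be dropped; that is a limitation of this sketch.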
Also note that if you supply "Factor" column classes, they will be converted to character.
Ryan Hafen