lineByLine | R Documentation |
Modifies a data file line by line: reads each line, converts it, and writes it to the modified file. This is especially useful for large datasets, where reading the entire file at once may be time consuming and require a large amount of memory.
lineByLine(infile, outfile, linefunc = identity, choose.lines = NULL,
choose.columns = NULL, col.sep = " ", ask = TRUE,
blank.lines.skip = TRUE, verbose = TRUE, ...)
infile |
A character string giving the name and path of the file to be modified. |
outfile |
A character string giving the name of the modified file. The name of the file is relative to the current working directory, unless the file name contains a definite path. |
linefunc |
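A function used to convert each line. The default, identity, returns the line unmodified. A user-defined function may be supplied instead, subject to the requirements described below. |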
choose.lines |
A numeric vector of lines to be selected or dropped from infile. |
choose.columns |
A numeric vector of columns to be selected (positive values) or skipped (negative values) from infile. |
col.sep |
Specifies the separator that splits the columns in infile. Default is " " (a single space). |
ask |
Logical. Default is "TRUE". If set to "FALSE", an already existing outfile will be overwritten without asking. |
blank.lines.skip |
Logical. If "TRUE" (default), blank lines in infile are skipped. |
verbose |
Logical. Default is "TRUE", which means that the line number is displayed for each iteration, in addition to output from linefunc. |
... |
Further arguments to be passed to linefunc. |
When reading large data files, functions such as read.table can use a large amount of memory and be extremely time consuming. Instead of reading the entire file at once, lineByLine reads one line at a time, modifies the line using linefunc, and then writes the line to outfile.
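The read-convert-write idea can be illustrated with a few lines of base R. This is only a rough sketch of the general technique, not the code used by lineByLine: the names process_lines, convert and sep are hypothetical, and features such as choose.lines, choose.columns, ask and verbose are left out.

```r
## Sketch only: a base-R read-convert-write loop, not Haplin's implementation.
## process_lines, convert and sep are hypothetical names used for illustration.
process_lines <- function(infile, outfile, convert = identity, sep = " ") {
  con_in <- file(infile, open = "r")
  on.exit(close(con_in), add = TRUE)
  con_out <- file(outfile, open = "w")
  on.exit(close(con_out), add = TRUE)

  n_read <- 0
  repeat {
    ## Read a single line; only this line is held in memory
    line <- readLines(con_in, n = 1)
    if (length(line) == 0) break  # end of file reached
    n_read <- n_read + 1
    ## Split the line into columns, convert, and write it out again
    fields <- strsplit(line, split = sep, fixed = TRUE)[[1]]
    writeLines(paste(convert(fields), collapse = sep), con = con_out)
  }
  ## Return the number of lines read, invisibly
  invisible(n_read)
}

## Usage on a small temporary file
infile <- tempfile(fileext = ".txt")
outfile <- tempfile(fileext = ".txt")
writeLines(c("a b c", "d e f"), infile)
n <- process_lines(infile, outfile, convert = toupper)
result <- readLines(outfile)
```

Because only one line is held in memory at a time, memory use does not grow with the size of the file, which is the motivation given above.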
The user may specify their own line-converting function. This function must take the argument x, a character vector representing a single line of the file, split at spaces. Additional arguments may be included. If verbose equals "TRUE", any output the function prints is displayed alongside the line number. The function must return the modified vector.
The framework of the line-modifying function may look something like this:
lineModify <- function(x){
  .xnew <- x
  ## Define any modifications, for instance recoding missing values
  ## in a dataset from NA to 0:
  .xnew[is.na(.xnew)] <- 0
  ## Just to monitor progress, display, for instance, 10 first elements,
  ## without newline:
  cat(paste(.xnew[1:min(10, length(.xnew))], collapse = " "))
  ## Return converted vector
  return(.xnew)
}
See Haplin:::lineConvert for an additional example of a line-modifying function.
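As a further illustration, a line function may take arguments beyond x. The sketch below is hypothetical: recodeMissing, missing.code and replacement are names made up for this example, and it assumes that the missing-value code arrives as a literal string in the character vector x. It is applied here to a single, already split line, outside of lineByLine.

```r
## Hypothetical line function with extra arguments (missing.code, replacement).
## x is a character vector: one line of the file, already split into columns.
recodeMissing <- function(x, missing.code = "NA", replacement = "0") {
  .xnew <- x
  ## Replace every element equal to the missing-value code
  .xnew[.xnew %in% missing.code] <- replacement
  ## Return converted vector
  return(.xnew)
}

## Apply the function to one line, split at spaces
fields <- strsplit("1 NA 2 NA", split = " ", fixed = TRUE)[[1]]
recoded <- recodeMissing(fields, missing.code = "NA")
```

When such a function is used as linefunc, the extra arguments would presumably be supplied through the ... argument of lineByLine.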
lineByLine returns the number of lines read, invisibly. The main result is the modified file.
Miriam Gjerdevik,
with Hakon K. Gjessing
Professor of Biostatistics
Division of Epidemiology
Norwegian Institute of Public Health
Web Site: https://haplin.bitbucket.io
convertPed
## Not run:
## Extract the first ten columns from "myfile.txt",
## without reordering
lineByLine(infile = "myfile.txt", outfile = "myfile_modified.txt",
choose.columns = c(1:10))
## End(Not run)