rbindFiles | R Documentation |
Takes a sequence of files and combines them by rows, without reading the full files into memory. This is especially useful when dealing with large datasets, where the reading of entire files may be time consuming and require a large amount of memory.
rbindFiles(infiles, outfile, col.sep, header = FALSE, ask = TRUE,
verbose = FALSE, add.file.number = FALSE, blank.lines.skip = FALSE)
infiles |
A character vector of names (and paths) of the files to combine. |
outfile |
A character string giving the name of the combined output file. The name of the file is relative to the current working directory, unless the file name contains a definite path. |
col.sep |
Specifies the separator used to split the columns in the files. To split at all types of spaces or blank characters, set |
header |
A logical variable which indicates if the first line in each file contains the names of the variables. If "TRUE", |
ask |
Logical. Default is "TRUE". If set to "FALSE", an already existing outfile will be overwritten without asking. |
verbose |
Logical. Default is "FALSE". If "TRUE", the line number is displayed for each iteration, i.e. each combined line. |
add.file.number |
A logical variable which equals "FALSE" by default. If "TRUE", an extra first column will be added to the outfile, consisting of the file numbers for each line. |
blank.lines.skip |
Logical. Default is "FALSE". If "TRUE",
The function rbind combines R objects by rows. However, reading large data files may require a large amount of memory and be extremely time consuming. rbindFiles avoids reading the full files into memory. It reads the files line by line, possibly modifies each line, then writes to outfile.
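The line-by-line approach described above can be illustrated with a minimal base R sketch. This is not the rbindFiles implementation: the helper rbind_lines_sketch and its sep argument are hypothetical, and keeping only the header of the first file (and leaving it without a file number) is an assumption made for the illustration.

```r
# Hypothetical helper, not part of Haplin: combine files by rows while
# holding only one line in memory at a time.
rbind_lines_sketch <- function(infiles, outfile, sep = " ", header = FALSE,
                               add.file.number = FALSE,
                               blank.lines.skip = FALSE) {
  out <- file(outfile, open = "w")
  on.exit(close(out), add = TRUE)
  for (i in seq_along(infiles)) {
    con <- file(infiles[i], open = "r")
    first <- TRUE
    repeat {
      line <- readLines(con, n = 1, warn = FALSE)
      if (length(line) == 0) break          # end of this file
      is_header <- header && first
      first <- FALSE
      # Assumption: only the header of the first file is written
      if (is_header && i > 1) next
      if (blank.lines.skip && !nzchar(trimws(line))) next
      # Assumption: the header line is not prefixed with a file number
      if (add.file.number && !is_header) line <- paste(i, line, sep = sep)
      writeLines(line, out)
    }
    close(con)
  }
  invisible(NULL)
}

# Small demonstration on temporary files
f1 <- tempfile(); f2 <- tempfile(); out <- tempfile()
writeLines(c("id x", "1 a", "", "2 b"), f1)
writeLines(c("id x", "3 c"), f2)
rbind_lines_sketch(c(f1, f2), out, header = TRUE,
                   add.file.number = TRUE, blank.lines.skip = TRUE)
readLines(out)
```

Because every line passes through R, this approach allows per-line modifications but is slower than appending the files directly.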
If, however, header, verbose, add.file.number and blank.lines.skip are all set to "FALSE" (their default values), the files are appended directly, thus avoiding line-by-line modifications.
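Direct appending can be sketched in base R with file.append, which copies file contents without parsing individual lines. The helper append_files_sketch is hypothetical and only illustrates the idea; it is not taken from the rbindFiles source.

```r
# Hypothetical helper, not part of Haplin: append files without reading
# them line by line.
append_files_sketch <- function(infiles, outfile) {
  # Start from an empty outfile (an existing file is truncated)
  if (!file.create(outfile)) stop("could not create ", outfile)
  for (f in infiles) {
    # file.append() returns FALSE if the append failed
    if (!file.append(outfile, f)) stop("could not append ", f)
  }
  invisible(outfile)
}

# Small demonstration on temporary files
a1 <- tempfile(); a2 <- tempfile(); combined <- tempfile()
writeLines(c("a b", "c d"), a1)
writeLines("e f", a2)
append_files_sketch(c(a1, a2), combined)
readLines(combined)
```

Note that a plain append of this kind joins the last line of one file to the first line of the next if the former does not end with a newline character.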
In the case where infiles contains only one file and no output or modifications are requested (verbose, add.file.number and blank.lines.skip equal "FALSE"), an identical copy of this file is made.
There is no useful output; the objective of rbindFiles is to produce outfile.
Combining the files by reading each file line by line is less time efficient than appending the files directly. For this reason, if header = FALSE, changing the values of the logical variables verbose, add.file.number and blank.lines.skip from "FALSE" to "TRUE" should not be done unless absolutely necessary.
Miriam Gjerdevik,
with Hakon K. Gjessing
Professor of Biostatistics
Division of Epidemiology
Norwegian Institute of Public Health
hakon.gjessing@uib.no
Web Site: https://haplin.bitbucket.io
cbindFiles, lineByLine
## Not run:
# Combines the three infiles, by rows
rbindFiles(infiles = c("myfile1.txt", "myfile2.txt", "myfile3.txt"),
outfile = "myfile_combined_by_rows.txt", col.sep = " ", header = TRUE, verbose = TRUE)
## End(Not run)