rbinder: rbinder

rbinder    R Documentation

rbinder

Description

Batch read and unite multiple data files into a single data.frame.

Usage

rbinder(
  file.pattern,
  readf = read.csv2,
  path = ".",
  unique.field.name,
  result = c("default", "summary", "debug"),
  ...
)

Arguments

file.pattern

character; regular expression matching the names of the files to be processed.

readf

function; the function used to read each file; it must return a data.frame (defaults to read.csv2).

path

character; specifies directory from which to collect files corresponding to file.pattern (defaults to ".").

unique.field.name

character; names of the columns that identify a unique entry. Entries that share the same values in these columns are not duplicated when the files are united; use this to avoid duplicates introduced by reading files with the same content.

result

character; one of "default", "summary" or "debug". "summary" increases verbosity; "debug" additionally persists intermediary results to the file debug.rds.

...

arguments passed to function readf.

Value

A united data.frame in which each unique entry appears only once, even if it occurs in several files.
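
The read, bind and de-duplicate behaviour described above can be illustrated with a minimal base R sketch. This is not the kungfu implementation: the helper name rbinder_sketch is hypothetical, the result argument is left out, and the sketch assumes that all files share the same column names.

```r
# Hypothetical helper; NOT the kungfu implementation, only an illustration
# of the read / bind / de-duplicate idea.
rbinder_sketch <- function(file.pattern, readf = read.csv2, path = ".",
                           unique.field.name = NULL, ...) {
  # Collect all files in 'path' whose names match the regex
  files <- list.files(path, pattern = file.pattern, full.names = TRUE)
  if (length(files) == 0) stop("no files match pattern: ", file.pattern)

  # Read each file with the supplied reader, forwarding '...' to it
  pieces <- lapply(files, readf, ...)

  # Stack the pieces row-wise (assumes identical column names)
  combined <- do.call(rbind, pieces)

  # Drop rows whose key columns repeat an earlier row
  if (!is.null(unique.field.name)) {
    keys <- combined[, unique.field.name, drop = FALSE]
    combined <- combined[!duplicated(keys), , drop = FALSE]
  }
  rownames(combined) <- NULL
  combined
}

# Demo: two temporary files that overlap on id 2
tmp <- tempfile("rbinder_demo_")
dir.create(tmp)
write.csv(data.frame(id = 1:2, size = c(10, 20)),
          file.path(tmp, "data01.csv"), row.names = FALSE)
write.csv(data.frame(id = 2:3, size = c(20, 30)),
          file.path(tmp, "data02.csv"), row.names = FALSE)

demo <- rbinder_sketch("^data", read.csv, path = tmp, unique.field.name = "id")
```

In this sketch the entry with id 2 is present in both files but is kept only once, which is the effect unique.field.name is documented to have.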

See Also

df_pattern_subset() for subsetting a data.frame, useful for creating custom readers, see examples

Examples

# - Folder 'extdata' (system.file("extdata", package = "kungfu")) contains three csv files:
#> dir(system.file("extdata", package = "kungfu"), pattern = "csv")
#[1] "data01.csv" "data02.csv" "data03.csv"
#
# - To join them, run:
data_combined <- rbinder("^data", read.csv, path = system.file("extdata", package = "kungfu"), unique.field.name = "id")
#
# - You can also create your own custom "dirty reader" and supply that to "rbinder"
# - (Please see "?df_pattern_subset" for information regarding that function)
# - e.g. (requires packages 'readxl' and 'dplyr'):
# my_dirty_excel_reader <- function(path) {
#    read_xlsx(path) %>%
#    df_pattern_subset("^mySpecial.*Pattern$", ignore_columns = TRUE) %>%
#    select(id, size, -timestamp, etc, anothercolumn)
# }
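
A custom reader does not have to depend on extra packages. The following base R sketch shows the same idea for semicolon-separated files; the name my_dirty_csv_reader and the column names id, size and timestamp are assumptions made for illustration only.

```r
# Hypothetical "dirty reader": reads a semicolon-separated file,
# keeps only the columns of interest and drops incomplete rows.
my_dirty_csv_reader <- function(path, ...) {
  raw <- read.csv2(path, stringsAsFactors = FALSE, ...)
  cleaned <- raw[, c("id", "size"), drop = FALSE]
  cleaned[stats::complete.cases(cleaned), , drop = FALSE]
}

# Try the reader on a small temporary file; the second row lacks a size
f <- tempfile(fileext = ".csv")
writeLines(c("id;size;timestamp", "1;10;2024-01-01", "2;;2024-01-02"), f)
cleaned_demo <- my_dirty_csv_reader(f)
```

Because the reader returns a data.frame, a function of this kind could be supplied as the readf argument, e.g. rbinder("^data", my_dirty_csv_reader, path = "some/folder", unique.field.name = "id").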

joheli/kungfu documentation built on March 25, 2024, 10:10 a.m.