adf: Create an abstract data frame

Description

Create an abstract data frame

Usage

adf(description = "", conMethod = "file",
  encoding = getOption("encoding"), expand.path = FALSE,
  colClasses = NULL, colNames = NULL, skip = 0L,
  chunkProcessor = identity, chunkFormatter = NULL, sep = "|",
  strict = TRUE, header = FALSE, levels = list(), nrowsClasses = 250L)

Arguments

description

character string. A description of the connection; the path to a file for file connections.

conMethod

string indicating the connection method.

encoding

encoding to use in the connections

expand.path

logical. Should 'description' be normalized and wildcard-expanded?

colClasses

an optional character vector of column classes. If named and colNames is missing, the names will be used for colNames. If missing, will be automatically determined.

colNames

an optional character vector of column names. If missing and header = TRUE, these will be determined by the first row of data. Otherwise, names will be constructed by pasting the character 'V' with the column number.

skip

number of lines to skip at the start of the file or connection before parsing.

chunkProcessor

a function that post-processes each formatted data frame. Usually only used without a chunkFormatter.

chunkFormatter

an optional function turning the raw connection into a data frame. It must accept four parameters: data, colNames, colClasses, levels. If missing, this will be constructed automatically.

sep

character separating the data columns. Ignored when chunkFormatter is given.

strict

logical. Whether the parser should run in strict mode. Ignored when chunkFormatter is given.

header

logical indicating whether the first line of data, after skip if provided, contains variable names. Ignored if chunkFormatter is also provided.

levels

a named list, with names corresponding to the colNames of class character (or factor). Each element gives the levels for the corresponding variable. These will be automatically determined when missing.

nrowsClasses

number of rows to pull for determining colClasses and factor levels when needed; will only be grabbed from the first file or connection if multiple are passed.
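
As a rough illustration of the chunkFormatter argument described above, the following base R sketch defines a function with the four required parameters (data, colNames, colClasses, levels). The function name `pipe_formatter` is hypothetical, and the form in which `data` arrives is an assumption: the sketch accepts either a raw vector or a character vector of lines, which may not match what adf actually passes. It is called directly here, outside of adf(), so it shows only the shape of such a function.

```r
# Hypothetical formatter with the four parameters named in the argument list.
# Assumption: `data` is one chunk, given as a raw vector or as a character
# vector of lines; the real input type used by adf() may differ.
pipe_formatter <- function(data, colNames, colClasses, levels) {
  if (is.raw(data)) data <- rawToChar(data)
  df <- read.table(text = data, sep = "|", header = FALSE,
                   col.names = colNames, colClasses = colClasses,
                   quote = "", comment.char = "",
                   stringsAsFactors = FALSE)
  # Convert every column that has supplied levels into a factor.
  for (nm in intersect(names(levels), names(df))) {
    df[[nm]] <- factor(df[[nm]], levels = levels[[nm]])
  }
  df
}

# Direct call on two in-memory, pipe-delimited lines.
chunk <- c("NY|3|0.25", "CA|7|0.50")
out <- pipe_formatter(chunk,
                      colNames = c("state", "count", "score"),
                      colClasses = c("character", "integer", "numeric"),
                      levels = list(state = c("CA", "NY")))
str(out)
```

Because sep, strict and header are ignored when a chunkFormatter is given, a function like this has to handle the delimiter and any header line itself.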

Value

An abstract data frame object.

Examples

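    library(adf)

    # Build a small test data frame with mixed column types.
    n <- 100
    test_df <- data.frame(col1 = sample(state.abb, n, TRUE),
                          col2 = sample(1:10, n, TRUE),
                          col3 = runif(n),
                          col4 = complex(n, runif(n), runif(n)),
                          stringsAsFactors = FALSE)

    # Write the same data to two pipe-delimited temporary files (no header row).
    write.table(test_df, tf <- tempfile(), sep = "|",
                quote = FALSE, row.names = FALSE, col.names = FALSE)
    write.table(test_df, tf2 <- tempfile(), sep = "|",
                quote = FALSE, row.names = FALSE, col.names = FALSE)

    # Create an abstract data frame spanning both files.
    adfObj <- adf(c(tf, tf2))

    # Remove the temporary files.
    unlink(tf)
    unlink(tf2)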

kaneplusplus/adf documentation built on May 28, 2019, 2:55 p.m.