adf.apply: Apply a Function to an Abstract Data Frame


Description

Low-level function for applying a function over the chunks of an abstract data frame.

Usage

adf.apply(x, FUN, args = list(), outDir = NULL, type = c("data.frame",
  "model", "sparse.model"), formula = NULL, contrasts = NULL,
  subset = NULL, weights = NULL, na.action = NULL, offset = NULL,
  params = list())

Arguments

x

an abstract data frame object

FUN

function to apply over each chunk; its first argument must accept the abstract data frame, and the second (optional) argument accepts the args parameter

args

Optional list of arguments, passed as the second argument to FUN

outDir

if 'NULL', the default, results are passed back to R; otherwise this gives the output location (a new directory) for storing the results

type

type of data to give as input to FUN. If "model" or "sparse.model", the input is a list giving the response (y), model matrix (x), weights (w), and offset (offset) derived from the input formula.

formula

a formula to be used with type equal to model or sparse.model

contrasts

contrasts to be used with type equal to model or sparse.model

subset

a string to be used with type equal to model or sparse.model. Will be evaluated in the environment of the data frame (e.g. subset = "V2 + V3 > V4")

weights

a string to be used with type equal to model or sparse.model. Will be evaluated in the environment of the data frame.

na.action

a function which indicates what should happen when the data contain 'NA's. See lm.fit for more details.

offset

a string to be used with type equal to model or sparse.model. Will be evaluated in the environment of the data frame.

params

a named list of additional parameters that depend on the type of abstract data frame that was created
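
To make the type = "model" input more concrete, the following base R sketch emulates the kind of list described under the type argument, including a subset string evaluated in the environment of the data frame. It is an illustration only, not the adf implementation: the helper make_model_input and the sample chunk are hypothetical, and the sketch does no NA handling.

```r
# Hypothetical helper (not part of adf): builds a list with elements
# y, x, w and offset from a plain data frame, mimicking the documented
# shape of the input FUN receives when type = "model".
make_model_input <- function(df, formula, subset = NULL, weights = NULL,
                             offset = NULL) {
  # The subset string is parsed and evaluated with df as the environment
  if (!is.null(subset)) {
    keep <- eval(parse(text = subset), envir = df)
    df <- df[keep, , drop = FALSE]
  }
  mf <- model.frame(as.formula(formula), data = df)
  list(
    y = model.response(mf),
    x = model.matrix(attr(mf, "terms"), mf),
    w = if (is.null(weights)) NULL else eval(parse(text = weights), envir = df),
    offset = if (is.null(offset)) NULL else eval(parse(text = offset), envir = df)
  )
}

# A small stand-in for one chunk, using the default V1, V2, ... names
chunk <- data.frame(V1 = c("a", "b", "a", "b"),
                    V2 = c(1, 2, 3, 4),
                    V3 = c(0.5, 0.1, 0.9, 0.4),
                    V4 = c(1, 1, 5, 5))

inp <- make_model_input(chunk, "V3 ~ V2 + V1", subset = "V2 + V3 > V4")
str(inp)
```

A function passed as FUN would then read u$x and u$y from such a list, as calcOLSmats does in the Examples section.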

Examples

n <- 100
test_df <- data.frame(col1 = sample(state.abb,n,TRUE),
                      col2 = sample(1:10,n,TRUE),
                      col3 = runif(n),
                      col4 = complex(n,runif(n),runif(n)),
                      stringsAsFactors = FALSE)
write.table(test_df, tf <- tempfile(), sep = "|",
            quote = FALSE, row.names = FALSE, col.names = FALSE)
write.table(test_df, tf2 <- tempfile(), sep = "|",
            quote = FALSE, row.names = FALSE, col.names = FALSE)

adfObj <- adf(c(tf,tf2))
adfObj <- allFactorLevels(adfObj)

# Construct OLS beta hat
adfObj <- adf(c(tf,tf2))
calcOLSmats <- function(u) list(XtX = t(u$x) %*% u$x, Xty = t(u$x) %*% u$y)
v <- adf.apply(adfObj, calcOLSmats, type = "model",
               formula = "V3 ~ V2 + V1")
XtX <- Reduce(`+`, Map(getElement, v, "XtX"))
Xty <- Reduce(`+`, Map(getElement, v, "Xty"))

# adfObj reads two copies of test_df, so compare against the stacked data
test_df2 <- rbind(test_df, test_df)
betaDF <- coef(lm(col3 ~ col2 + col1, data = test_df2))
betaADF <- qr.solve(XtX, Xty)
err <- max(abs(betaDF - betaADF))
err

unlink(tf)
unlink(tf2)
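
The OLS example works because the normal equations are additive over row chunks: the sum of the per-chunk t(X) %*% X and t(X) %*% y matrices equals the full-data matrices. The sketch below checks this in base R alone, without adf; the data, the variable names, and the two-way split are assumptions made for illustration.

```r
set.seed(1)
n <- 100
df <- data.frame(g = factor(sample(c("a", "b", "c"), n, TRUE)),
                 x = sample(1:10, n, TRUE),
                 y = runif(n))

# Split the rows into two chunks. Because g is already a factor, each
# chunk keeps the full set of levels, so the per-chunk model matrices
# have the same columns and can be summed.
chunks <- split(df, rep(1:2, each = n / 2))

# Per-chunk pieces of the normal equations
ols_mats <- function(d) {
  X <- model.matrix(y ~ x + g, data = d)
  list(XtX = crossprod(X), Xty = crossprod(X, d$y))
}
parts <- lapply(chunks, ols_mats)

# Sum the pieces across chunks, then solve once
XtX <- Reduce(`+`, lapply(parts, `[[`, "XtX"))
Xty <- Reduce(`+`, lapply(parts, `[[`, "Xty"))
beta_chunked <- drop(solve(XtX, Xty))

# Reference fit on the full data
beta_full <- coef(lm(y ~ x + g, data = df))
max(abs(beta_chunked - beta_full))
```

The requirement that every chunk produce the same model matrix columns is presumably why the example calls allFactorLevels before applying over chunks read from files.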

kaneplusplus/adf documentation built on May 28, 2019, 2:55 p.m.