combMeanCoef: Mean Coefficient Recombination


Description

Mean coefficient recombination: calculates the weighted average of the parameter estimates from a model fit to each subset.

Usage

combMeanCoef(...)

Arguments

...

additional attributes to define the combiner (currently only used internally)

Details

combMeanCoef is passed to the combine argument of recombine.

This method calculates the mean of each model coefficient when the same model has been fit to every subset via a transformation. The mean is a weighted average of each coefficient, where the weight for a subset is its number of observations. In particular, the drLM and drGLM functions should be used to add the transformation to the ddo that will be recombined using combMeanCoef.
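
The weighted average described above can be sketched in base R, without datadr. This is only an illustration of the arithmetic, not how combMeanCoef is implemented internally; the object names (irisIrr, subsets, coefs, n, weightedCoef) and the fixed row selection are assumptions made for the example.

```r
# Deterministic, unbalanced subset of iris: 40, 37 and 46 rows per species
# (hypothetical selection, chosen so the weights differ between subsets)
irisIrr <- iris[c(1:40, 51:87, 101:146), ]

# Split the data by species and fit the same linear model to each subset
subsets <- split(irisIrr, irisIrr$Species)
coefs <- t(sapply(subsets, function(d) {
  coef(lm(Sepal.Length ~ Sepal.Width, data = d))
}))

# Number of observations in each subset; these are the weights
n <- sapply(subsets, nrow)

# Weighted average of each coefficient across the subsets
weightedCoef <- apply(coefs, 2, weighted.mean, w = n)
weightedCoef
```

Each element of weightedCoef equals sum(coefficient * n) / sum(n) taken over the subsets, so larger subsets contribute proportionally more to the combined estimate.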

Author(s)

Ryan Hafen

See Also

divide, recombine, rrDiv, combCollect, combDdo, combDdf, combRbind, combMean

Examples

# Create an irregular number of observations for each species
indexes <- sort(c(sample(1:50, 40), sample(51:100, 37), sample(101:150, 46)))
irisIrr <- iris[indexes,]

# Create a distributed data frame using the irregular iris data set
bySpecies <- divide(irisIrr, by = "Species")

# Fit a linear model of Sepal.Length vs. Sepal.Width for each species
# using 'drLM()' (or we could have used 'drGLM()' for a generalized linear model)
lmTrans <- function(x) drLM(Sepal.Length ~ Sepal.Width, data = x)
bySpeciesFit <- addTransform(bySpecies, lmTrans)

# Average the coefficients from the linear model fits of each species, weighted
# by the number of observations in each species
out1 <- recombine(bySpeciesFit, combine = combMeanCoef)
out1

# A more concise (and readable) way to do it
bySpecies %>%
  addTransform(lmTrans) %>%
  recombine(combMeanCoef)

# The following illustrates an equivalent, but more tedious approach
lmTrans2 <- function(x) t(c(coef(lm(Sepal.Length ~ Sepal.Width, data = x)), n = nrow(x)))
res <- recombine(addTransform(bySpecies, lmTrans2), combine = combRbind)
colnames(res) <- c("Species", "Intercept", "Sepal.Width", "n")
res
out2 <- c("(Intercept)" = with(res, sum(Intercept * n) / sum(n)),
          "Sepal.Width" = with(res, sum(Sepal.Width * n) / sum(n)))

# These are the same
identical(out1, out2)

datadr documentation built on May 1, 2019, 8:06 p.m.