View source: R/mrfrequentist.R
mrfrequentist is used to conduct frequentist linear regression on very large data sets using Merge and Reduce, as described in Geppert et al. (2020).
formula |
fileMr |
dataMr |
obsPerBlock |
approach |
sep | See documentation of
dec | See documentation of
header |
naStrings |
colNames |
naAction |
Returns an object of class "mrfrequentist", which is a list containing the following components for both approaches "1" and "3":
approach | The approach used for merging the models. Either "1" or "3".
formula | The model's
level | Number of levels of the final model in Merge and Reduce. This is equal to log2(ceiling(numberObs/obsPerBlock))+1 and corresponds to the number of buckets in Figure 1 of Geppert et al. (2020).
numberObs | The total number of observations.
summaryStats | Summary statistics reporting the estimated regression coefficients and their unbiased standard errors. Estimates are based on the merge technique as specified in the argument
dataHead | First six rows of the data in the first block. This serves as a sanity check, especially when using the argument
terms | Terms object.
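As a rough worked example, the following sketch evaluates the level formula quoted above, log2(ceiling(numberObs/obsPerBlock))+1, for two made-up input sizes. The helper name mrLevel is hypothetical and not part of the package. Note that the expression as written yields a whole number only when the number of blocks is a power of two.

```r
# Hypothetical helper (not part of mrregression): evaluates the level
# formula exactly as it is stated in the documentation.
mrLevel <- function(numberObs, obsPerBlock) {
  numberBlocks <- ceiling(numberObs / obsPerBlock)  # number of blocks read
  log2(numberBlocks) + 1
}

level8 <- mrLevel(numberObs = 800, obsPerBlock = 100)   # 8 blocks -> 4
level4 <- mrLevel(numberObs = 1000, obsPerBlock = 300)  # 4 blocks -> 3
print(c(level8, level4))
```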
Additionally for approach "3" only:
XTX | The final model's
yTX | The final model's
yTy | The final model's
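To illustrate why these three components are enough to describe a linear model, the sketch below recovers coefficients and unbiased standard errors from X'X, y'X and y'y alone and compares them with lm(). It runs on simulated data and uses base R only. The variable names mirror the component names above, but the code is an illustration under these assumptions, not the package's internal implementation.

```r
set.seed(1)
n <- 200
X <- cbind(1, matrix(rnorm(n * 2), ncol = 2))  # design matrix with intercept
y <- drop(X %*% c(1, 2, -1)) + rnorm(n)

# Sufficient statistics (names chosen to mirror the listed components)
XTX <- crossprod(X)     # X'X, p x p
yTX <- crossprod(y, X)  # y'X, 1 x p
yTy <- sum(y^2)         # y'y, scalar

# Coefficients via a QR decomposition of X'X
qrXTX <- qr(XTX)
beta <- drop(qr.coef(qrXTX, t(yTX)))

# Unbiased residual variance and standard errors
p <- ncol(X)
rss <- yTy - sum(yTX * beta)  # y'y - y'X beta
sigma2 <- rss / (n - p)
se <- sqrt(sigma2 * diag(qr.solve(qrXTX)))

# Reference fit on the full data
ref <- summary(lm(y ~ X - 1))$coefficients
print(all.equal(unname(beta), unname(ref[, "Estimate"])))
print(all.equal(unname(se), unname(ref[, "Std. Error"])))
```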
In approach "3" the estimated regression coefficients and their unbiased standard errors are calculated via QR decompositions on X'X (as in speedlm with argument method = "qr"). Moreover, the merge step uses the same idea of blockwise addition for X'X, y'y and y'X as speedglm's updating procedure updateWithMoreData. Conceptually though, Merge and Reduce is not an updating algorithm, as it merges models based on a comparable amount of data along a tree structure to obtain a final model.
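The difference between sequential updating and merging along a tree can be sketched as follows. The code assumes the common bucket scheme in which each level holds at most one model, and two models of the same level are merged and moved one level up. The helpers blockStats and mergeStats and the buckets list are hypothetical names for illustration; the package's actual implementation may differ.

```r
# Sufficient statistics of one block of data
blockStats <- function(X, y) {
  list(XTX = crossprod(X), yTX = crossprod(y, X), yTy = sum(y^2), n = nrow(X))
}

# Merge step: blockwise addition of the statistics of two models
mergeStats <- function(a, b) {
  list(XTX = a$XTX + b$XTX, yTX = a$yTX + b$yTX,
       yTy = a$yTy + b$yTy, n = a$n + b$n)
}

set.seed(2)
n <- 1000
obsPerBlock <- 100
X <- cbind(1, rnorm(n))
y <- drop(X %*% c(0.5, 3)) + rnorm(n)

buckets <- list()  # buckets[[l]] holds at most one model of level l
for (s in seq(1, n, by = obsPerBlock)) {
  idx <- s:min(s + obsPerBlock - 1, n)
  model <- blockStats(X[idx, , drop = FALSE], y[idx])
  level <- 1
  # Carry upwards while the bucket on this level is occupied, so that
  # only models based on a comparable amount of data are merged
  while (level <= length(buckets) && !is.null(buckets[[level]])) {
    model <- mergeStats(buckets[[level]], model)
    buckets[level] <- list(NULL)  # empty the bucket but keep its slot
    level <- level + 1
  }
  buckets[[level]] <- model
}

# Combine whatever is left in the buckets into the final model
final <- Reduce(mergeStats, Filter(Negate(is.null), buckets))
beta <- drop(qr.coef(qr(final$XTX), t(final$yTX)))
print(all.equal(unname(beta), unname(coef(lm(y ~ X - 1)))))
```

Because the merge is plain addition, the final statistics equal those computed on the full data; the tree only determines the order in which blocks are combined.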
Geppert, L.N., Ickstadt, K., Munteanu, A., & Sohler, C. (2020).
Streaming statistical models via Merge & Reduce. International Journal
of Data Science and Analytics, 1-17,
doi: https://doi.org/10.1007/s41060-020-00226-0
library(mrregression)

## run mrfrequentist() with dataMr
data(exampleData)
fit1 = mrfrequentist(dataMr = exampleData, approach = "1", obsPerBlock = 300,
                     formula = V11 ~ .)

## run mrfrequentist() with fileMr
filepath = system.file("extdata", "exampleFile.txt", package = "mrregression")
fit2 = mrfrequentist(fileMr = filepath, approach = "3", header = TRUE,
                     obsPerBlock = 100, formula = y ~ .)