This package calculates sketches of a data set that can be used to perform approximate classical or Bayesian linear regression. A sketch is a substitute data set of the same dimension but with a much smaller number of observations. Inference based on the sketch is much faster and is provably close to the exact inference. The calculation is done time- and space-efficiently in C. The two main functions are sketch, for data sets that fit into working memory and can be processed at once, and readinandsketch, for data sets that (potentially) do not fit into working memory and are read and sketched sequentially, block by block.
LN Geppert, K. Ickstadt, A. Munteanu, J. Quedenfeld, L. Sandig, C. Sohler
Geppert, L., Ickstadt, K., Munteanu, A., Quedenfeld, J., Sohler, C. (2017). Random projections for Bayesian regression. Statistics and Computing, 27(1), 79-101. doi:10.1007/s11222-015-9608-z
# create a small simulated data set
# with 400 observations and
# 4 variables
set.seed(23)
x1 = rnorm(400, 10, 2)
x2 = rnorm(400, 5, 3)
x3 = rnorm(400, -2, 1)
x4 = rnorm(400, 0, 5)
y = 2.4 - 0.6 * x1 + 5.5 * x2 - 7.2 * x3 + 5.7 * x4 + rnorm(400)
# all in one data.frame
data = data.frame(x1, x2, x3, x4, y)
# linear model based on original data set
lm(y ~ ., data = data)
# Calculate an RAD/"R"-sketch with epsilon = 0.2
s1 = sketch(data, epsilon = 0.2, method = 'R', affine = TRUE)
dim(s1)
# very similar results, intercept should be omitted
lm(y ~ . - 1, data = s1)
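
To show what "same dimension, fewer observations" means, here is a minimal base-R illustration of a dense Rademacher (+1/-1) random projection. It is only a sketch of the idea and not the package's C implementation. The sketch size k is chosen by hand rather than derived from an epsilon, and the names A, S, k and sk are hypothetical.

```r
# Illustration only: a dense Rademacher projection written in base R.
# Assumption: k is picked by hand; the package instead derives it from epsilon.
set.seed(1)
n  <- 400
z1 <- rnorm(n, 10, 2)
z2 <- rnorm(n, 5, 3)
yy <- 2.4 - 0.6 * z1 + 5.5 * z2 + rnorm(n)

# Stack an explicit intercept column, the predictors and the response.
A <- cbind(intercept = 1, z1 = z1, z2 = z2, yy = yy)

# k x n matrix with entries +1/sqrt(k) or -1/sqrt(k), with k much smaller than n.
k <- 100
S <- matrix(sample(c(-1, 1), k * n, replace = TRUE), nrow = k) / sqrt(k)

# The sketch keeps every column of A but has only k rows.
sk <- as.data.frame(S %*% A)
dim(sk)

# The intercept is already a column, so the formula intercept is dropped
# in both fits, as in the package example above.
full_fit   <- lm(yy ~ . - 1, data = as.data.frame(A))
sketch_fit <- lm(yy ~ . - 1, data = sk)
cbind(full = coef(full_fit), sketch = coef(sketch_fit))
```

The two coefficient columns should be close but not identical. Because the projection mixes the intercept column along with everything else, the model on the sketch has to be fitted without an additional intercept term.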