View source: R/sketch_leverage.R
sketch_leverage | R Documentation |
Provides a subsample of data using sketches
sketch_leverage(data, m, method = "leverage")
data |
(n times d)-dimensional matrix of data. The first column must contain the dependent variable (Y); the remaining columns are the covariates (X). |
m |
subsample size that is less than n |
method |
method for sketching: "leverage" leverage score sampling using X (default); "root_leverage" square-root leverage score sampling using X. |
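To make the two `method` options concrete, here is a minimal base-R sketch of how leverage-score and square-root leverage-score sampling probabilities are usually defined. It assumes the textbook definition (leverage scores are the diagonal of the hat matrix of X, normalised to sum to one); `leverage_probs` is a hypothetical helper written for illustration and is not taken from the package source, so the package's internals may differ.

```r
# Hypothetical helper (not part of the package): sampling probabilities
# proportional to leverage scores or to their square roots.
leverage_probs <- function(X, method = c("leverage", "root_leverage")) {
  method <- match.arg(method)
  Q <- qr.Q(qr(X))        # thin Q factor of X (n x d), assuming full column rank
  h <- rowSums(Q^2)       # leverage scores = diagonal of the hat matrix
  w <- if (method == "leverage") h else sqrt(h)
  w / sum(w)              # normalise so the probabilities sum to one
}

set.seed(1)
X <- matrix(stats::rnorm(200 * 3), nrow = 200, ncol = 3)
p_lev  <- leverage_probs(X, "leverage")
p_root <- leverage_probs(X, "root_leverage")

# draw m = 20 row indices with replacement according to the leverage probabilities
idx <- sample.int(nrow(X), size = 20, replace = TRUE, prob = p_lev)
```

Taking square roots flattens the distribution, so `"root_leverage"` sits between leverage-score sampling and uniform sampling.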
An S3 object with the following elements:
subsample |
(m times d)-dimensional matrix of data |
prob |
m-dimensional vector of probabilities |
Ma, P., Zhang, X., Xing, X., Ma, J. and Mahoney, M. (2020). Asymptotic Analysis of Sampling Estimators for Randomized Numerical Linear Algebra Algorithms. Proceedings of the Twenty Third International Conference on Artificial Intelligence and Statistics, PMLR 108:1026-1035.
## Least squares: sketch and solve
# setup
n <- 1e+6 # full sample size
d <- 5    # dimension of covariates
m <- 1e+3 # sketch size
# generate pseudo-data
X <- matrix(stats::rnorm(n*d), nrow = n, ncol = d)
beta <- matrix(rep(1,d), nrow = d, ncol = 1)
eps <- matrix(stats::rnorm(n), nrow = n, ncol = 1)
Y <- X %*% beta + eps
intercept <- matrix(rep(1,n), nrow = n, ncol = 1)
# full sample including the intercept term
fullsample <- cbind(Y,intercept,X)
# generate a sketch using leverage score sampling
s_lev <- sketch_leverage(fullsample, m, "leverage")
# weighted least squares on the sketch; `- 1` drops lm()'s automatic intercept,
# so the only regressor here is column 2 (the explicit intercept column)
ls_lev <- lm(s_lev$subsample[,1] ~ s_lev$subsample[,2] - 1, weights = s_lev$prob)
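For comparison, the following base-R sketch runs the same sketch-and-solve idea end to end without calling the package, regressing on all covariates and checking the result against the full-sample fit. It rests on two assumptions that are not taken from this page: rows are drawn with replacement with probability proportional to their leverage scores, and each sampled row is weighted by the inverse of its sampling probability. The example above passes `s_lev$prob` to `weights` directly, so the package's convention may differ from the one used here.

```r
set.seed(42)
n <- 1e4   # full sample size (kept small so the example runs quickly)
d <- 5     # number of covariates
m <- 500   # sketch size

# design matrix with an explicit intercept column; all true coefficients equal 1
X <- cbind(1, matrix(stats::rnorm(n * d), nrow = n, ncol = d))
Y <- drop(X %*% rep(1, d + 1) + stats::rnorm(n))

# leverage scores via the thin QR factor, normalised to sampling probabilities
h <- rowSums(qr.Q(qr(X))^2)
p <- h / sum(h)

# draw the sketch and solve weighted least squares on it
# (inverse-probability weights are an assumption of this illustration)
idx <- sample.int(n, size = m, replace = TRUE, prob = p)
fit_sketch <- stats::lm.wfit(X[idx, ], Y[idx], w = 1 / p[idx])

# full-sample least squares for reference
fit_full <- stats::lm.fit(X, Y)

round(cbind(full = fit_full$coefficients, sketch = fit_sketch$coefficients), 3)
```

The sketched coefficients should land near the full-sample ones, with the gap shrinking as `m` grows.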