# lm.fit.sparse: Fitter Function for Sparse Linear Models In MatrixModels: Modelling with Sparse and Dense Matrices

 lm.fit.sparse R Documentation

## Fitter Function for Sparse Linear Models

### Description

A basic computing engine for sparse linear least squares regression.

Note that the exact interface (arguments, return value) currently is experimental, and is bound to change. Use at your own risk!

### Usage

``````lm.fit.sparse(x, y, w = NULL, offset = NULL,
method = c("qr", "cholesky"),
tol = 1e-7, singular.ok = TRUE, order = NULL,
transpose = FALSE)
``````

### Arguments

 `x` sparse design matrix of dimension `n * p`, i.e., an R object of a `class` extending `dsparseMatrix`; typically the result of `sparse.model.matrix`. `y` vector of observations of length `n`, or a matrix with `n` rows. `w` vector of weights (length `n`) to be used in the fitting process. Weighted least squares is used with weights `w`, i.e., `sum(w * e^2)` is minimized. Not yet implemented ! `offset` numeric of length `n`). This can be used to specify an a priori known component to be included in the linear predictor during fitting. `method` a character string specifying the (factorization) method. Currently, `"qr"` or `"cholesky"`. `tol` [for back-compatibility only; unused:] tolerance for the `qr` decomposition. Default is 1e-7. `singular.ok` [for back-compatibility only; unused:] logical. If `FALSE`, a singular model is an error. `order` integer or `NULL`, for `method == "qr"`, will determine how the fill-reducing ordering (aka permutation) for the “symbolic” part is determined (in `cs_amd()`), with the options 0:natural, 1:Chol, 2:LU, and 3:QR, where `3` is the default. `transpose` logical; if true, use the transposed matrix `t(x)` instead of `x`.

### Value

Either a single numeric vector or a list of four numeric vectors.

### See Also

`glm4` is an alternative (much) more general fitting function.

`sparse.model.matrix` from the Matrix package; the non-sparse function in standard R's package stats: `lm.fit()`.

### Examples

``````dd <- expand.grid(a = as.factor(1:3),
b = as.factor(1:4),
c = as.factor(1:2),
d= as.factor(1:8))
n <- nrow(dd <- dd[rep(seq_len(nrow(dd)), each = 10), ])
set.seed(17)
dM <- cbind(dd, x = round(rnorm(n), 1))
## randomly drop some
n <- nrow(dM <- dM[- sample(n, 50),])
dM <- within(dM, { A <- c(2,5,10)[a]
B <- c(-10,-1, 3:4)[b]
C <- c(-8,8)[c]
D <- c(10*(-5:-2), 20*c(0, 3:5))[d]
Y <- A + B + A*B + C + D + A*D + C*x + rnorm(n)/10
wts <- sample(1:10, n, replace=TRUE)
rm(A,B,C,D)
})
str(dM) # 1870 x 7

X <- Matrix::sparse.model.matrix( ~ (a+b+c+d)^2 + c*x, data = dM)
dim(X) # 1870 x 69
X[1:10, 1:20]

## For now, use  'MatrixModels:::'  --- TODO : export once interface is clear!

Xd <- as(X,"matrix")
system.time(fmDense <- lm.fit(Xd, y = dM[,"Y"])) # {base} functionality
system.time( r1 <- MatrixModels:::lm.fit.sparse(X, y = dM[,"Y"]) ) # *is* faster
stopifnot(all.equal(r1, unname(fmDense\$coeff), tolerance = 1e-12))
system.time(
r2 <- MatrixModels:::lm.fit.sparse(X, y = dM[,"Y"], method = "chol") )
stopifnot(all.equal(r1, r2\$coef, tolerance = 1e-12),
all.equal(fmDense\$residuals, r2\$residuals, tolerance=1e-9)
)
## with weights:
system.time(fmD.w <- with(dM, lm.wfit(Xd, Y, w = wts)))
system.time(fm.w1 <- with(dM, MatrixModels:::lm.fit.sparse(X, Y, w = wts)))
system.time(fm.w2 <- with(dM, MatrixModels:::lm.fit.sparse(X, Y, w = wts,
method = "chol") ))
stopifnot(all.equal(fm.w1, unname(fmD.w\$coeff), tolerance = 1e-12),
all.equal(fm.w2\$coef, fm.w1, tolerance = 1e-12),
all.equal(fmD.w\$residuals, fm.w2\$residuals, tolerance=1e-9)
)
``````

MatrixModels documentation built on Nov. 6, 2023, 5:09 p.m.