lm.fit: Fitter for Linear Models

Description

Fits a real-valued linear model by QR decomposition, using the "limited pivoting strategy" of R's DQRDC2 Fortran routine.

Usage

## S4 method for signature 'ddmatrix,ddmatrix'
lm.fit(x, y, tol = 1e-07, singular.ok = TRUE)

Arguments

x, y

numeric distributed matrices

tol

tolerance for numerical rank estimation in QR decomposition.

singular.ok

logical. If FALSE then a singular model (rank-deficient x) produces an error.

Details

Solves the linear least squares problem: find an x (possibly non-unique) that minimizes || Ax - b ||^2, where A is a given n-by-p model matrix, b is an n-by-1 "right hand side" vector, and "||" denotes the l2 norm. Several right hand sides can be solved at once, but the solutions are independent of one another, i.e. not simultaneous. In terms of the arguments, A is the argument x and b is the argument y; the x in the formula is the vector of coefficients, not the argument.
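For reference, the problem can be written in standard notation as below, together with the textbook reduction to a triangular system when A has full column rank. This is general linear algebra, not a description of the package's modified PDGELS routine; the rank-deficient case is handled differently.

```latex
\hat{x} \in \arg\min_{x \in \mathbb{R}^{p}} \; \lVert A x - b \rVert_2^2,
\qquad A \in \mathbb{R}^{n \times p}, \; b \in \mathbb{R}^{n}.

% Full column rank (n >= p): with the thin QR factorization A = QR,
% where Q^T Q = I_p and R is p-by-p, upper triangular and nonsingular,
\lVert A x - b \rVert_2^2
  = \lVert R x - Q^{\top} b \rVert_2^2
  + \lVert (I - Q Q^{\top})\, b \rVert_2^2,
\qquad\text{so}\qquad R \hat{x} = Q^{\top} b .
```

The second term does not depend on x, so minimizing the first term alone gives the solution; its square root is the norm of the residual vector.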

Uses level 3 PBLAS and ScaLAPACK routines (a modified PDGELS) to compute a linear least squares solution, applying the "limited pivoting strategy" from R's DQRDC2 routine (used in DQRLS) to deal with (possibly) rank-deficient model matrices.

A model matrix with many dependent columns will likely experience poor performance, especially at scale, due to all the data swapping that must occur to handle rank deficiency.
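The effect of limited pivoting can be seen in serial with base R, whose default qr() and stats::lm.fit() call DQRDC2 and DQRLS respectively. The following is a minimal sketch that does not use pbdDMAT or MPI at all; the toy data and the column name dup are made up for illustration, and the distributed routine is only assumed to behave analogously.

```r
# Serial base-R illustration of limited pivoting; no pbdDMAT or MPI involved.
set.seed(1)
a <- rnorm(6)
b <- rnorm(6)
x <- cbind(a, dup = 2 * a, b)   # 'dup' is an exact multiple of 'a'
y <- rnorm(6)

decomp <- qr(x, tol = 1e-07)    # default LINPACK path, which calls DQRDC2
decomp$rank                     # numerical column rank
decomp$pivot                    # the dependent column is moved to the end

fit <- stats::lm.fit(x, y, tol = 1e-07, singular.ok = TRUE)
fit$coefficients                # the aliased column's coefficient is NA
fit$df.residual                 # n minus the numerical rank
```

Only columns judged numerically dependent are moved, which is why a model matrix with few or no dependent columns incurs little extra data movement.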

Value

Returns a list of values similar to R's lm.fit(). Namely, the list contains:

coefficients: (distributed matrix) solution to the linear least squares problem
residuals: (distributed matrix) difference between the observed values and the numerical fit
effects: (distributed matrix) t(Q) %*% y
rank: (global numeric) numerical column rank
fitted.values: (distributed matrix) numerical fit A %*% x
assign: NULL if lm.fit() is called directly
qr: list, same as the return value of qr()
df.residual: (global numeric) degrees of freedom of the residuals
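Since the return value is described as similar to that of R's lm.fit(), the relationships among the components can be checked in serial with stats::lm.fit(). This sketch uses base R only, with made-up toy data; in the distributed version the matrix-valued components are distributed matrices rather than ordinary vectors, so the same checks would need the distributed objects gathered first.

```r
# Serial base-R check of how the returned components relate to each other.
set.seed(42)
x <- cbind(intercept = 1, slope = rnorm(8))
y <- rnorm(8)

fit <- stats::lm.fit(x, y)

# fitted.values is the numerical fit A %*% x
fitted_ok <- all.equal(fit$fitted.values, drop(x %*% fit$coefficients))

# residuals are observed minus fitted
resid_ok <- all.equal(fit$residuals, y - fit$fitted.values)

# effects are t(Q) %*% y, computed here from the stored QR decomposition
effects_ok <- all.equal(unname(fit$effects), qr.qty(fit$qr, y))

# df.residual is the number of observations minus the numerical rank
df_ok <- fit$df.residual == nrow(x) - fit$rank

# assign is NULL because lm.fit() was called directly on a plain matrix
assign_ok <- is.null(fit$assign)
```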

Examples

spmd.code = "
  library(pbdDMAT, quietly = TRUE)
  init.grid()
  
  # don't do this in production code
  x <- matrix(rnorm(9), 3)
  y <- matrix(rnorm(3))
  
  dx <- as.ddmatrix(x)
  dy <- as.ddmatrix(y)
  
  fit <- lm.fit(x=dx, y=dy)
  fit
  
  finalize()
"

pbdMPI::execmpi(spmd.code = spmd.code, nranks=2L)

RBigData/pbdDMAT documentation built on Oct. 29, 2021, 6:20 p.m.