bigLm: fast and memory efficient linear model fitting

View source: R/fast_big_lm.R

Description

Fast and memory-efficient linear model fitting for big.matrix objects.

Default method for bigLm.

Usage

bigLm(X, ...)

## Default S3 method:
bigLm(X, y, method = 0L, gigs = 2, nslices = NULL, ...)

Arguments

X

Input model matrix. Must be a big.matrix object of type double (type = 8).

...

not used

y

numeric response vector of length nobs.

method

an integer scalar with value 0 for the LLT Cholesky or 1 for the LDLT Cholesky

gigs

Double scalar. Maximum amount of memory available, in gigabytes. Used to decide how to break up calculations involving the design matrix X.

nslices

integer scalar, defaults to NULL, which defers to the gigs argument to determine the number of slices required. If specified, nslices determines the number of slices to break up computation of X'X into.
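
The slicing controlled by gigs and nslices can be pictured as accumulating X'X and X'y over blocks of rows, then solving the normal equations with a Cholesky factorization. The following is a minimal sketch in base R on an ordinary dense matrix. The helper name sliced_normal_eq and its row-blocking scheme are assumptions made for illustration; this is not the package's internal implementation, and it covers only the LLT case (method = 0).

```r
# Hypothetical helper (not part of bigFastlm): accumulate X'X and X'y over
# row slices, then solve the normal equations with an LLT Cholesky factor.
sliced_normal_eq <- function(X, y, nslices = 4L) {
  n <- nrow(X)
  p <- ncol(X)
  XtX <- matrix(0, p, p)
  Xty <- numeric(p)
  # Assign each row to one of nslices contiguous blocks
  block <- ceiling(seq_len(n) * nslices / n)
  for (k in seq_len(nslices)) {
    rows <- which(block == k)
    Xk <- X[rows, , drop = FALSE]   # only this slice is materialized
    XtX <- XtX + crossprod(Xk)      # running sum of Xk'Xk
    Xty <- Xty + drop(crossprod(Xk, y[rows]))
  }
  R <- chol(XtX)                    # upper triangular, XtX = R'R
  z <- forwardsolve(t(R), Xty)      # solve R'z = X'y
  backsolve(R, z)                   # solve R beta = z
}

# Usage on simulated data; the result should agree with lm.fit
set.seed(1)
X <- cbind(1, matrix(rnorm(200), 100, 2))
y <- rnorm(100)
beta <- sliced_normal_eq(X, y, nslices = 3L)
```

Because each slice contributes an independent p-by-p term to X'X, peak memory depends on the slice size rather than on the total number of rows.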

Value

A list object with S3 class "bigLm" with the elements

coefficients

a vector of coefficients

se

a vector of the standard errors of the coefficient estimates

rank

a scalar denoting the computed rank of the model matrix

df.residual

a scalar denoting the residual degrees of freedom of the model

residuals

the vector of residuals

s

a numeric scalar: the root mean square of the residuals

fitted.values

the vector of fitted values
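
Under the usual least squares definitions, the returned elements relate to one another as in the sketch below. It is written in base R for a small dense matrix and assumes that the package follows these standard formulas (in particular, that s divides the residual sum of squares by df.residual). The function name ols_summary is hypothetical.

```r
# Hypothetical helper (not part of bigFastlm): compute the same named
# quantities with textbook OLS formulas on an ordinary matrix.
ols_summary <- function(X, y) {
  XtX <- crossprod(X)
  beta <- drop(solve(XtX, crossprod(X, y)))  # coefficient estimates
  fitted <- drop(X %*% beta)
  res <- y - fitted
  rank <- qr(X)$rank                         # numerical rank of X
  df_res <- nrow(X) - rank                   # residual degrees of freedom
  s <- sqrt(sum(res^2) / df_res)             # assumed definition of s
  se <- s * sqrt(diag(solve(XtX)))           # standard errors
  list(coefficients = beta, se = se, rank = rank, df.residual = df_res,
       residuals = res, s = s, fitted.values = fitted)
}

# Usage on simulated data
set.seed(2)
X <- cbind(1, matrix(rnorm(300), 100, 3))
y <- rnorm(100)
out <- ols_summary(X, y)
```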

Examples

library(bigmemory)
library(bigFastlm)  # provides bigLm

nrows <- 50000
ncols <- 50
bkFile <- "bigmat.bk"
descFile <- "bigmatk.desc"
bigmat <- filebacked.big.matrix(nrow=nrows, ncol=ncols, type="double",
                                backingfile=bkFile, backingpath=".",
                                descriptorfile=descFile,
                                dimnames=c(NULL,NULL))

# Each column's values will be the column number multiplied by
# samples from a standard normal distribution.
set.seed(123)
for (i in 1:ncols) bigmat[,i] = rnorm(nrows)*i

y <- rnorm(nrows) + bigmat[,1]

system.time(lmr1 <- bigLm(bigmat, y))

system.time(lmr2 <- lm.fit(x = bigmat[,], y = y))

max(abs(coef(lmr1) - coef(lmr2)))

bigFastlm documentation built on May 30, 2017, 7:20 a.m.