res_cal: Linear Regression Residuals Calculation

View source: R/variance_function.R

Description

res_cal calculates linear regression residuals efficiently: it handles several dependent variables at a time, uses Matrix::TsparseMatrix capabilities and allows for pre-calculation of the matrix inverse.

Usage

res_cal(y = NULL, x, w = NULL, by = NULL, precalc = NULL, id = NULL)

Arguments

y

A (sparse) numerical matrix of dependent variable(s).

x

A (sparse) numerical matrix of independent variable(s).

w

An optional numerical vector of row weights.

by

An optional categorical vector (factor or character), used when residuals are to be calculated within by-groups (see Details).

precalc

A list of pre-calculated results (see Details).

id

A vector of identifiers of the units used in the calculation. Useful when precalc is supplied, in order to check whether the ordering of the y data matrix matches the one used at the pre-calculation step.

Details

In the context of the gustave package, linear regression residual calculation is solely used to take into account the effect of calibration on variance estimation. Independent variables are therefore most likely to be the same from one variance estimation to another, hence the inversion of the matrix t(x) %*% Diagonal(x = w) %*% x can be done once and for all at a pre-calculation step.
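The two-step idea can be illustrated with a minimal base R sketch. The helpers precalc_inverse and residuals_from_precalc are hypothetical names used for illustration only; they are not part of gustave and do not claim to reproduce its internals. The sketch assumes that t(x) %*% diag(w) %*% x is invertible and that no intercept is added to x.

```r
# Hypothetical helpers (not part of gustave) illustrating the algebra only.
# Step 1: invert t(x) %*% diag(w) %*% x once and store it.
precalc_inverse <- function(x, w = rep(1, nrow(x))) {
  # w * x multiplies row i of x by w[i], so crossprod() gives t(x) W x
  list(x = x, w = w, inv = solve(crossprod(x, w * x)))
}

# Step 2: reuse the stored inverse for any matrix of dependent variables y.
residuals_from_precalc <- function(y, p) {
  beta <- p$inv %*% crossprod(p$x, p$w * y)
  y - p$x %*% beta
}

set.seed(1)
n <- 100
x <- matrix(rnorm(3 * n), nrow = n)
w <- runif(n, 1, 5)
y1 <- matrix(rnorm(2 * n), nrow = n)
y2 <- matrix(rnorm(4 * n), nrow = n)

p <- precalc_inverse(x, w)            # matrix inversion done once
e1 <- residuals_from_precalc(y1, p)   # reused for a first set of variables
e2 <- residuals_from_precalc(y2, p)   # and for a second one

# Cross-check against base R weighted least squares
all.equal(e1, lm.wfit(x, y1, w)$residuals, check.attributes = FALSE)
```

Only the second step depends on y, which is why storing the inverse pays off when the same calibration variables are reused across many variance estimations.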

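The parameters y and precalc determine whether a list of pre-calculated data should be used in order to speed up the regression residuals computation at execution time. As the Examples illustrate, calling res_cal with y and x (and no precalc) computes the residuals directly; calling it with y = NULL and x returns the list of pre-calculated data; calling it with y and precalc computes the residuals from that list.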

The by parameter allows for calculation within by-groups: all calculations are made separately for each by-group (for instance when calibration was conducted separately on several subsamples), but in an efficient way using Matrix::TsparseMatrix capabilities (especially when the matrix inverse is pre-calculated).
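One way to picture the by-group calculation is as a single regression on a block-structured design matrix, in which the columns of x are repeated once per group and set to zero outside that group. The base R sketch below builds such a matrix with a hypothetical helper, block_design, and checks the result against a plain group-by-group loop. Whether gustave builds exactly this matrix internally is an assumption; the sketch uses dense matrices for simplicity, whereas a sparse representation would avoid storing the zeros.

```r
# Hypothetical helper (not part of gustave): one block of columns per by-group,
# with the rows that fall outside the group set to zero.
block_design <- function(x, by) {
  groups <- sort(unique(by))
  do.call(cbind, lapply(groups, function(g) x * (by == g)))
}

set.seed(1)
n <- 100
x <- matrix(rnorm(3 * n), nrow = n)
y <- matrix(rnorm(2 * n), nrow = n)
by <- letters[sample(1:5, n, replace = TRUE)]

# A single regression on the block-structured design matrix
xb <- block_design(x, by)
e_block <- y - xb %*% solve(crossprod(xb), crossprod(xb, y))

# Reference: separate regressions within each by-group
e_loop <- y
for (g in unique(by)) {
  i <- by == g
  e_loop[i, ] <- lm.fit(x[i, , drop = FALSE], y[i, , drop = FALSE])$residuals
}

all.equal(e_block, e_loop, check.attributes = FALSE)
```

Because the blocks do not overlap, the cross-product matrix of the block design is block-diagonal, so the single regression yields the same residuals as the separate ones.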

Value

Author(s)

Martin Chevalier

Examples

# Generating random data
set.seed(1)
n <- 100
H <- 5
y <- matrix(rnorm(2*n), nrow = n)
x <- matrix(rnorm(10*n), nrow = n)
by <- letters[sample(1:H, n, replace = TRUE)]

# Direct calculation
res_cal(y, x)

# Calculation with pre-calculated data
precalc <- res_cal(y = NULL, x)
res_cal(y, precalc = precalc)
identical(res_cal(y, x), res_cal(y, precalc = precalc))

# Matrix::TsparseMatrix capability
library(Matrix)
X <- as(x, "TsparseMatrix")
Y <- as(y, "TsparseMatrix")
identical(res_cal(y, x), as.matrix(res_cal(Y, X)))

# by parameter for within by-groups calculation
res_cal(Y, X, by = by)
all.equal(
  res_cal(Y, X, by = by)[by == "a", ],
  res_cal(Y[by == "a", ], X[by == "a", ])
)

gustave documentation built on Nov. 10, 2021, 5:08 p.m.