# res_cal: Linear Regression Residuals Calculation in gustave: A User-Oriented Statistical Toolkit for Analytical Variance Estimation

 res_cal R Documentation

## Linear Regression Residuals Calculation

### Description

`res_cal` calculates linear regression residuals efficiently: it handles several dependent variables at a time, uses Matrix::TsparseMatrix capabilities, and allows for pre-calculation of the matrix inverse.

### Usage

```
res_cal(y = NULL, x, w = NULL, by = NULL, precalc = NULL, id = NULL)
```

### Arguments

• `y`: A (sparse) numerical matrix of dependent variable(s).

• `x`: A (sparse) numerical matrix of independent variable(s).

• `w`: An optional numerical vector of row weights.

• `by`: An optional categorical vector (factor or character), used when the residuals calculation is to be conducted within by-groups (see Details).

• `precalc`: A list of pre-calculated results (see Details).

• `id`: A vector of identifiers of the units used in the calculation. Useful when `precalc` is not `NULL`, in order to check that the ordering of the `y` data matrix matches the one used at the pre-calculation step.

### Details

In the context of the `gustave` package, linear regression residual calculation is solely used to take into account the effect of calibration on variance estimation. Independent variables are therefore most likely to be the same from one variance estimation to another, hence the inversion of the matrix `t(x) %*% Diagonal(x = w) %*% x` can be done once and for all at a pre-calculation step.
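The algebra behind this pre-calculation can be illustrated with a minimal base R sketch. It is not the `gustave` implementation: the object names (`xtw`, `inv`, `wls_residuals`) are made up for illustration, and the residual formula `y - x %*% inv %*% t(x) %*% diag(w) %*% y` is the standard weighted least-squares one, assumed here to match what `res_cal` computes.

```r
# Base R sketch of weighted least-squares residuals with a reusable inverse.
# Illustrative only: names below are assumptions, not gustave internals.
set.seed(1)
n <- 50
x <- matrix(rnorm(3 * n), nrow = n)   # independent (calibration) variables
w <- runif(n, 1, 5)                   # row weights
y1 <- matrix(rnorm(2 * n), nrow = n)  # a first set of dependent variables
y2 <- matrix(rnorm(4 * n), nrow = n)  # a second set, same x and w

# Pre-calculation step: depends on x and w only
xtw <- t(x * w)            # equals t(x) %*% diag(w), without forming diag(w)
inv <- solve(xtw %*% x)    # inverse of t(x) %*% diag(w) %*% x

# Calculation step: reuse `inv` for any y with the same row ordering
wls_residuals <- function(y) y - x %*% (inv %*% (xtw %*% y))
e1 <- wls_residuals(y1)
e2 <- wls_residuals(y2)

# Cross-check against stats::lm.wfit (no intercept is added in either case)
ref <- lm.wfit(x, y1, w)$residuals
all.equal(e1, ref, check.attributes = FALSE)
```

The inverse is computed once and then applied to both `y1` and `y2`, which is the situation described above when the same calibration variables serve several variance estimations.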

The parameters `y` and `precalc` determine whether a list of pre-calculated data should be used in order to speed up the regression residuals computation at execution time:

• if `y` is not `NULL` and `precalc` is `NULL`: on-the-fly calculation of the matrix inverse and the regression residuals (no pre-calculation).

• if `y` is `NULL` and `precalc` is `NULL`: pre-calculation of the matrix inverse, which is stored in a list of pre-calculated data.

• if `y` is not `NULL` and `precalc` is not `NULL`: calculation of the regression residuals using the list of pre-calculated data.
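The three cases above can be sketched with a small stand-in function. `res_cal_sketch` is hypothetical: it handles dense matrices only, ignores `by` and `id`, and the contents of its pre-calculated list (`x`, `xtw`, `inv`) are an assumption, since the actual structure returned by `res_cal` is not documented here.

```r
# Hypothetical stand-in for the y / precalc logic (dense matrices, no `by`).
res_cal_sketch <- function(y = NULL, x, w = NULL, precalc = NULL) {
  if (is.null(precalc)) {
    # No pre-calculated data supplied: compute the inverse now
    if (is.null(w)) w <- rep(1, nrow(x))
    xtw <- t(x * w)
    precalc <- list(x = x, xtw = xtw, inv = solve(xtw %*% x))
  }
  # y is NULL: pre-calculation step, return the list
  if (is.null(y)) return(precalc)
  # y is not NULL: residuals, from fresh or supplied pre-calculated data
  y - precalc$x %*% (precalc$inv %*% (precalc$xtw %*% y))
}

set.seed(1)
x <- matrix(rnorm(300), nrow = 100)
y <- matrix(rnorm(200), nrow = 100)

direct <- res_cal_sketch(y, x)                  # on-the-fly calculation
pre    <- res_cal_sketch(y = NULL, x = x)       # pre-calculation only
reused <- res_cal_sketch(y, precalc = pre)      # residuals from the list
all.equal(direct, reused)
```

Because R evaluates arguments lazily, `x` can be omitted in the last call: it is never touched once `precalc` is supplied.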

The `by` parameter allows for calculation within by-groups: all calculations are made separately for each by-group (for instance when calibration was conducted separately on several subsamples), but in an efficient way using Matrix::TsparseMatrix capabilities (especially when the matrix inverse is pre-calculated).
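One way to see why sparse matrices help with by-groups: running a separate regression in each group gives the same residuals as a single regression on a block design matrix, which is mostly zeros. The base R sketch below checks this equivalence on dense matrices. Whether `res_cal` builds such a block matrix internally is an assumption, and `ols_res` and `x_block` are illustrative names.

```r
set.seed(1)
n <- 60
x <- matrix(rnorm(2 * n), nrow = n)
y <- matrix(rnorm(n), nrow = n)
by <- rep(c("a", "b", "c"), length.out = n)

# Residuals of an unweighted regression of y on x (no intercept added)
ols_res <- function(y, x) y - x %*% solve(crossprod(x), crossprod(x, y))

# Option 1: loop over by-groups
res_loop <- y
for (g in unique(by)) {
  rows <- by == g
  res_loop[rows, ] <- ols_res(y[rows, , drop = FALSE], x[rows, , drop = FALSE])
}

# Option 2: one regression on a block design matrix
# (one copy of the columns of x per group, zeroed outside that group)
x_block <- do.call(cbind, lapply(unique(by), function(g) x * (by == g)))
res_block <- ols_res(y, x_block)

all.equal(res_loop, res_block)
```

With many groups, most entries of `x_block` are zero, so a sparse representation avoids storing and multiplying them.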

### Value

• if `y` is not `NULL` (calculation step): a numerical matrix with the same structure (regular base::matrix or Matrix::TsparseMatrix) and dimensions as `y`.

• if `y` is `NULL` (pre-calculation step): a list containing pre-calculated data.

### Author(s)

Martin Chevalier

### Examples

```
# Generating random data
set.seed(1)
n <- 100
H <- 5
y <- matrix(rnorm(2*n), nrow = n)
x <- matrix(rnorm(10*n), nrow = n)
by <- letters[sample(1:H, n, replace = TRUE)]

# Direct calculation
res_cal(y, x)

# Calculation with pre-calculated data
precalc <- res_cal(y = NULL, x)
res_cal(y, precalc = precalc)
identical(res_cal(y, x), res_cal(y, precalc = precalc))

# Matrix::TsparseMatrix capability
library(Matrix)
X <- as(x, "TsparseMatrix")
Y <- as(y, "TsparseMatrix")
identical(res_cal(y, x), as.matrix(res_cal(Y, X)))

# by parameter for within by-groups calculation
res_cal(Y, X, by = by)
all.equal(
  res_cal(Y, X, by = by)[by == "a", ],
  res_cal(Y[by == "a", ], X[by == "a", ])
)

```

gustave documentation built on Sept. 19, 2022, 9:06 a.m.