# pcsslm: Approximate a linear model using PCSS

In jackmwolf/pcsstools: Tools for Regression Using Pre-Computed Summary Statistics

 pcsslm R Documentation

## Approximate a linear model using PCSS

### Description

`pcsslm` approximates a linear model of a combination of variables using precomputed summary statistics.

### Usage

``````
pcsslm(formula, pcss = list(), ...)
``````

### Arguments

• `formula`: an object of class `formula` whose dependent variable is a combination of variables joined by operators such as `+`, `*`, `|`, or `&` (see Details). All model terms must have appropriate PCSS in `pcss`.

• `pcss`: a list of precomputed summary statistics. In all cases, this should include `n`: the sample size, `means`: a named vector of predictor and response means, and `covs`: a named covariance matrix including all predictors and responses. See Details for more information.

• `...`: additional arguments. See Details for more information.

### Details

`pcsslm` parses the dependent variable of the input `formula` for functions such as sums (`+`), products (`*`), or logical operators (`|` and `&`). It then models the combination of variables using one of `model_combo`, `model_product`, `model_or`, `model_and`, or `model_prcomp`.
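As a rough illustration of what parsing the dependent variable involves, the base R sketch below pulls the combining operator and the variable names out of a formula. It is not the parser used by `pcsslm`; the variable names are taken from the Examples section and the formula is never evaluated.

```r
# A formula is a call to `~`; element 2 is the left-hand side, element 3 the right
f <- y4 | y5 ~ g1 + x1

lhs <- f[[2]]                     # the unevaluated expression y4 | y5
op <- as.character(lhs[[1]])      # the outermost operator combining the responses
responses <- all.vars(lhs)        # response variable names
predictors <- all.vars(f[[3]])    # predictor variable names

op          # "|"
responses   # "y4" "y5"
predictors  # "g1" "x1"
```

Because `~` binds more loosely than `|`, `+`, and `*`, the whole combination ends up on the left-hand side without extra parentheses.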

Different precomputed summary statistics are needed inside `pcss` depending on the function used to combine the dependent variables.

• For linear combinations (and principal component analysis), only `n`, `means`, and `covs` are required.

• For products and logical combinations, the additional items `predictors` and `responses` are required. These are named lists of objects of class `predictor` generated by `new_predictor`, with a `predictor` object for each independent variable in `predictors` and each dependent variable in `responses`. However, when modeling the product or logical combination of only two variables, `responses` can be `NULL` without consequence.
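The three items that are always required can be computed with base R. The sketch below uses simulated data (the data frame `dat` and its columns are made up for illustration) and only builds `n`, `means`, and `covs`; the `predictors` and `responses` items needed for products and logical combinations are not covered here.

```r
set.seed(1)

# Simulated individual-level data: a SNP-like predictor, a covariate, one response
dat <- data.frame(
  g1 = rbinom(100, size = 2, prob = 0.3),
  x1 = rnorm(100)
)
dat$y1 <- 0.5 * dat$g1 + dat$x1 + rnorm(100)

# The minimal PCSS list: sample size, named means, named covariance matrix
pcss <- list(
  n = nrow(dat),
  means = colMeans(dat),
  covs = cov(dat)
)

# The names of `means` should match the dimnames of `covs`
names(pcss$means)
rownames(pcss$covs)
```

Keeping the names consistent matters because the model terms in `formula` are matched to the summary statistics by name.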

If modeling a principal component score of a set of variables, include the argument `comp` where `comp` is an integer indicating which principal component score to analyze. Optional logical arguments `center` and `standardize` determine if responses should be centered and standardized before principal components are calculated.
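To see why `n`, `means`, and `covs` can be enough for a principal component score, note that the component weights are eigenvectors of the response covariance matrix, so the score is a linear combination of the responses. The base R sketch below checks this on simulated data for a centered, unstandardized first component. It illustrates the underlying identity and is not a description of how `model_prcomp` is implemented; all object names are hypothetical.

```r
set.seed(3)
n <- 300
dat <- data.frame(g1 = rbinom(n, size = 2, prob = 0.4))
dat$y1 <- 0.5 * dat$g1 + rnorm(n)
dat$y2 <- 0.3 * dat$g1 + rnorm(n)
dat$y3 <- rnorm(n)

covs <- cov(dat)
ys <- c("y1", "y2", "y3")
comp <- 1

# Principal component weights from the covariance matrix alone
w <- eigen(covs[ys, ys], symmetric = TRUE)$vectors[, comp]

# Slope of the score on g1 using only covariances:
# cov(g1, score) = sum_j w_j * cov(g1, y_j)
slope_pcss <- sum(covs["g1", ys] * w) / covs["g1", "g1"]

# Reference: compute the score from individual-level data and fit lm()
score <- prcomp(dat[ys], center = TRUE, scale. = FALSE)$x[, comp]
slope_lm <- unname(coef(lm(score ~ g1, data = dat))[2])

# Eigenvectors are only defined up to sign, so compare absolute values
all.equal(abs(slope_pcss), abs(slope_lm))
```

For standardized responses, the same idea would apply to the correlation matrix (for example via `cov2cor`) instead of the covariance matrix.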

If modeling a linear combination, include the argument `phi`, a named vector of linear weights for each variable in the dependent variable of `formula`.
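The reason a linear combination needs no individual-level data is that its mean and its covariances with the predictors are linear in `means` and `covs`. The base R sketch below recovers the least squares coefficients for `y1 - y2` from summary statistics and compares them with `lm`. It is a minimal illustration on simulated data, not the code inside `model_combo`.

```r
set.seed(42)
n <- 200
dat <- data.frame(
  g1 = rbinom(n, size = 2, prob = 0.3),
  g2 = rbinom(n, size = 2, prob = 0.5)
)
dat$y1 <- 1 + 0.4 * dat$g1 + rnorm(n)
dat$y2 <- 2 - 0.3 * dat$g2 + rnorm(n)

means <- colMeans(dat)
covs <- cov(dat)

phi <- c(y1 = 1, y2 = -1)   # weights defining the response y1 - y2
xs <- c("g1", "g2")
ys <- names(phi)

# Slopes: solve Sxx %*% beta = Sxy %*% phi
beta <- solve(covs[xs, xs], covs[xs, ys] %*% phi)

# Intercept: mean of the combined response minus the fitted mean contribution
intercept <- sum(phi * means[ys]) - sum(beta * means[xs])

est <- c(intercept, beta)
fit <- lm(y1 - y2 ~ g1 + g2, data = dat)

all.equal(unname(est), unname(coef(fit)))
```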

If modeling a product, include the argument `response`, a character string equal to either `"continuous"` or `"binary"`. If `"binary"`, specialized approximations are performed to estimate means and variances.
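For two binary variables, the mean of the product (logical AND) and of the logical OR follow exactly from `n`, `means`, and `covs`. The sketch below verifies these two identities on simulated data. It only covers the two-variable case and the means; it does not reproduce the approximations that `pcsslm` applies for longer products, and the object names are hypothetical.

```r
set.seed(11)
n <- 500
y4 <- rbinom(n, size = 1, prob = 0.4)
y5 <- rbinom(n, size = 1, prob = 0.6)

means <- c(y4 = mean(y4), y5 = mean(y5))
covs <- cov(cbind(y4, y5))

# mean(y4 * y5) = mean(y4) * mean(y5) + (n - 1) / n * cov(y4, y5)
mean_and <- means[["y4"]] * means[["y5"]] + (n - 1) / n * covs["y4", "y5"]

# For 0/1 variables, (y4 | y5) = y4 + y5 - y4 * y5
mean_or <- means[["y4"]] + means[["y5"]] - mean_and

all.equal(mean_and, mean(y4 * y5))
all.equal(mean_or, mean(y4 | y5))
```

The factor `(n - 1) / n` converts the sample covariance returned by `cov` to the population form used in the identity.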

### Value

an object of class `"pcsslm"`.

An object of class `"pcsslm"` is a list containing at least the following components:

• `call`: the matched call.

• `terms`: the `terms` object used.

• `coefficients`: a `p x 4` matrix with columns for the estimated coefficient, its standard error, t-statistic and corresponding (two-sided) p-value.

• `sigma`: the square root of the estimated variance of the random error.

• `df`: degrees of freedom, a 3-vector `p, n-p, p*`, the first being the number of non-aliased coefficients, the last being the total number of coefficients.

• `fstatistic`: a 3-vector with the value of the F-statistic with its numerator and denominator degrees of freedom.

• `r.squared`: `R^2`, the 'fraction of variance explained by the model'.

• `adj.r.squared`: the above `R^2` statistic 'adjusted', penalizing for higher `p`.

• `cov.unscaled`: a `p x p` matrix of (unscaled) covariances of the `coef[j], j=1,...p`.

• `Sum Sq`: a 3-vector with the model's Sum of Squares Regression (SSR), Sum of Squares Error (SSE), and Sum of Squares Total (SST).
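Several of these components have closed forms in terms of `n` and `covs` for an ordinary linear model. The base R sketch below computes the sums of squares, `sigma`, `r.squared`, the F-statistic, and the slope standard errors from summary statistics and checks them against `summary(lm())`. It is an illustration on simulated data with a single continuous response, not the package's implementation, and all object names are hypothetical.

```r
set.seed(7)
n <- 150
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
dat$y <- 1 + 2 * dat$x1 - dat$x2 + rnorm(n)

covs <- cov(dat)
xs <- c("x1", "x2")
p <- length(xs) + 1               # number of coefficients, including the intercept

beta <- solve(covs[xs, xs], covs[xs, "y"])   # slope estimates

sst <- (n - 1) * covs["y", "y"]              # Sum of Squares Total
ssr <- (n - 1) * sum(beta * covs[xs, "y"])   # Sum of Squares Regression
sse <- sst - ssr                             # Sum of Squares Error

sigma <- sqrt(sse / (n - p))
r2 <- ssr / sst
fstat <- (ssr / (p - 1)) / (sse / (n - p))

# Slope standard errors: sigma^2 times the inverse centered cross-product matrix
se <- sqrt(sigma^2 * diag(solve((n - 1) * covs[xs, xs])))

fit <- summary(lm(y ~ x1 + x2, data = dat))
all.equal(sigma, fit$sigma)
all.equal(r2, fit$r.squared)
all.equal(fstat, unname(fit$fstatistic["value"]))
all.equal(unname(se), unname(fit$coefficients[xs, "Std. Error"]))
```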

### References

\insertRef{wolf_using_2021}{pcsstools}

\insertRef{wolf_computationally_2020}{pcsstools}

\insertRef

### See Also

`model_combo`, `model_product`, `model_or`, `model_and`, and `model_prcomp`.

### Examples

``````
library(pcsstools)

## Principal Component Analysis
ex_data <- pcsstools_example[c("g1", "x1", "y1", "y2", "y3")]
pcss <- list(
  means = colMeans(ex_data),
  covs = cov(ex_data),
  n = nrow(ex_data)
)

# Model the first principal component score of y1, y2, and y3
pcsslm(y1 + y2 + y3 ~ g1 + x1, pcss = pcss, comp = 1)

## Linear combination of variables
ex_data <- pcsstools_example[c("g1", "g2", "y1", "y2")]
pcss <- list(
  means = colMeans(ex_data),
  covs = cov(ex_data),
  n = nrow(ex_data)
)

# phi = c(1, -1) models y1 - y2; compare with lm() on individual-level data
pcsslm(y1 + y2 ~ g1 + g2, pcss = pcss, phi = c(1, -1))
summary(lm(y1 - y2 ~ g1 + g2, data = ex_data))

## Product of variables
ex_data <- pcsstools_example[c("g1", "x1", "y4", "y5", "y6")]

# Products also need a predictor object per predictor and per response
pcss <- list(
  means = colMeans(ex_data),
  covs = cov(ex_data),
  n = nrow(ex_data),
  predictors = list(
    g1 = new_predictor_snp(maf = mean(ex_data$g1) / 2),
    x1 = new_predictor_normal(mean = mean(ex_data$x1), sd = sd(ex_data$x1))
  ),
  responses = lapply(
    colMeans(ex_data)[3:length(colMeans(ex_data))],
    new_predictor_binary
  )
)

pcsslm(y4 * y5 * y6 ~ g1 + x1, pcss = pcss, response = "binary")
summary(lm(y4 * y5 * y6 ~ g1 + x1, data = ex_data))

## Disjunct (OR statement) of variables
ex_data <- pcsstools_example[c("g1", "x1", "y4", "y5")]

# With only two responses, `responses` can be omitted
pcss <- list(
  means = colMeans(ex_data),
  covs = cov(ex_data),
  n = nrow(ex_data),
  predictors = list(
    g1 = new_predictor_snp(maf = mean(ex_data$g1) / 2),
    x1 = new_predictor_normal(mean = mean(ex_data$x1), sd = sd(ex_data$x1))
  )
)

pcsslm(y4 | y5 ~ g1 + x1, pcss = pcss)
summary(lm(y4 | y5 ~ g1 + x1, data = ex_data))
``````

jackmwolf/pcsstools documentation built on July 7, 2024, 7:46 p.m.