run_lm: lm() for pipes - data as first argument
In LukasWallrich/rNuggets: Useful Helper Functions for (Academic) Data Analysis

run_lm

R Documentation

lm() for pipes - data as first argument

Description

Within a dplyr pipe, running lm() is often complicated by the placement of the data argument, which is not lm()'s first argument. This wrapper takes the data first and can optionally fit standardised models.
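To illustrate the problem described above, here is a minimal sketch in base R. The function `lm_data_first()` is a hypothetical stand-in written for this example; it is not the package's implementation and it deliberately omits the `...` and `std` arguments. The native pipe and the `\(d)` shorthand assume R >= 4.1.

```r
# Requires R >= 4.1 for the native pipe |> and the \(x) lambda syntax.
dat <- subset(mtcars, cyl > 4)

# Without a wrapper, piped data would land in lm()'s first argument
# (formula), so an anonymous function is needed to route it to `data`.
fit_lambda <- dat |> (\(d) lm(mpg ~ wt, data = d))()

# Hypothetical data-first stand-in (NOT the package's code).
lm_data_first <- function(df, formula) {
  stats::lm(formula, data = df)
}
fit_wrapped <- dat |> lm_data_first(mpg ~ wt)

# Both routes fit the same model.
all.equal(coef(fit_lambda), coef(fit_wrapped))
```

With a data-first function the pipeline reads left to right without the extra anonymous function.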

Usage

run_lm(df, formula, std = FALSE, rename_std = FALSE, ...)

Arguments

`df`	Data for modeling
`formula`	an object of class `"formula"` (or one that can be coerced to that class): a symbolic description of the model to be fitted. The details of model specification are given under ‘Details’ in the documentation of `stats::lm`.
`std`	Logical. Should variables be standardised? This is only applied to numeric variables, factors are left unchanged so that their coefficients remain interpretable.
`rename_std`	Logical. Should standardised variables be indicated by an _sd suffix?
`...`	Arguments passed on to `stats::lm`:
    `subset`	an optional vector specifying a subset of observations to be used in the fitting process.
    `weights`	an optional vector of weights to be used in the fitting process. Should be `NULL` or a numeric vector. If non-NULL, weighted least squares is used with weights `weights` (that is, minimizing `sum(w*e^2)`); otherwise ordinary least squares is used. See also ‘Details’ in the documentation of `stats::lm`.
    `na.action`	a function which indicates what should happen when the data contain `NA`s. The default is set by the `na.action` setting of `options`, and is `na.fail` if that is unset. The ‘factory-fresh’ default is `na.omit`. Another possible value is `NULL`, no action. Value `na.exclude` can be useful.
    `method`	the method to be used; for fitting, currently only `method = "qr"` is supported; `method = "model.frame"` returns the model frame (the same as with `model = TRUE`, see the `model` argument).
    `model`, `x`, `y`, `qr`	logicals. If `TRUE` the corresponding components of the fit (the model frame, the model matrix, the response, the QR decomposition) are returned.
    `singular.ok`	logical. If `FALSE` (the default in S but not in R) a singular fit is an error.
    `contrasts`	an optional list. See the `contrasts.arg` of `model.matrix.default`.
    `offset`	this can be used to specify an a priori known component to be included in the linear predictor during fitting. This should be `NULL` or a numeric vector or matrix of extents matching those of the response. One or more `offset` terms can be included in the formula instead or as well, and if more than one are specified their sum is used. See `model.offset`.
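The effect of `std = TRUE` can be approximated by hand. The sketch below is an assumption about the general idea (z-scoring every numeric variable in the model, including the outcome, and leaving factors alone); it is not taken from the package source, and the object names are made up for illustration.

```r
# Illustrative only: standardise numeric model variables, keep factors as-is.
model_formula <- Sepal.Length ~ Petal.Length + Species

# Keep just the variables that appear in the formula.
std_dat <- iris[all.vars(model_formula)]

# Identify numeric columns; Species is a factor and is skipped.
is_num <- vapply(std_dat, is.numeric, logical(1))

# z-score each numeric column (mean 0, sd 1); as.numeric() drops the
# matrix attributes that scale() adds.
std_dat[is_num] <- lapply(std_dat[is_num], function(x) as.numeric(scale(x)))

fit_std <- lm(model_formula, data = std_dat)
coef(fit_std)
```

In such a fit the coefficient for `Petal.Length` is on a standard-deviation scale, while the `Species` coefficients remain differences between groups, which is the interpretability point made for the `std` argument.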

Details

Note that the model call stored in the lm object is replaced by the call to this function, which means that update() cannot be used on the result.
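Since update() is not available on the fitted object, one workaround is to keep the formula in a variable, update the formula itself, and refit. The sketch below uses a hypothetical stand-in, `lm_data_first()`, in place of run_lm so that it runs without the package installed; the same pattern should carry over, but that is an assumption. It needs R >= 4.1 for the native pipe.

```r
# Hypothetical data-first stand-in (NOT the package's code).
lm_data_first <- function(df, formula) stats::lm(formula, data = df)

# Store the formula separately so it can be modified later.
base_formula <- mpg ~ wt
fit1 <- mtcars |> lm_data_first(base_formula)

# update() works on formulas (update.formula), so extend the formula
# and refit rather than calling update() on the model object.
bigger_formula <- update(base_formula, . ~ . + hp)
fit2 <- mtcars |> lm_data_first(bigger_formula)

names(coef(fit2))
```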

Source

After experiencing an issue with passing weights, I rewrote this based on the code suggested by "Vandenman" here https://stackoverflow.com/questions/38683076/ellipsis-trouble-passing-to-lm
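The difficulty with weights arises because lm() evaluates arguments such as `weights` and `subset` non-standardly, so simply forwarding `...` from a wrapper can fail. One common pattern for avoiding this is to rewrite the matched call and evaluate it in the caller's frame. The sketch below shows that pattern; it is not necessarily the code used in the package or in the linked answer, and `lm_data_first()` is a hypothetical name. Unlike run_lm as documented here, this stand-in leaves a plain lm() call in the fitted object.

```r
# Hypothetical wrapper: rewrite the call into an lm() call and evaluate it
# where the user called the wrapper, so `weights = hp` is resolved by lm()
# itself inside the data.
lm_data_first <- function(df, formula, ...) {
  cl <- match.call()                           # lm_data_first(df = , formula = , ...)
  cl[[1]] <- quote(stats::lm)                  # swap in lm as the function
  names(cl)[names(cl) == "df"] <- "data"       # rename df -> data
  eval(cl, parent.frame())
}

# Requires R >= 4.1 for the native pipe.
fit_w <- mtcars |> lm_data_first(mpg ~ wt, weights = hp)

# Reference fit with lm() called directly.
fit_ref <- lm(mpg ~ wt, data = mtcars, weights = hp)
all.equal(coef(fit_w), coef(fit_ref))
```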

References

See Fox (2015) for an argument why dummy variables should never be standardised. If you want to run a model with all variables standardised, one option is `QuantPsyc::lm.beta()`.

LukasWallrich/rNuggets documentation built on Aug. 26, 2022, 11:03 a.m.