influence.lmw: Regression Diagnostics for 'lmw' and 'lmw_est' objects


influence.lmw    R Documentation

Regression Diagnostics for lmw and lmw_est objects

Description

influence() produces influence measures for lmw and lmw_est objects that can be used as regression diagnostics to identify influential cases. These methods produce output similar to that of lm.influence() but also include the sample influence curve (SIC) values, which combine information about the hat values, residuals, and implied regression weights.

Usage

## S3 method for class 'lmw'
influence(model, outcome, data = NULL, ...)

## S3 method for class 'lmw_est'
influence(model, ...)

Arguments

model

an lmw or lmw_est object; the output of a call to lmw() or lmw_est().

outcome

the name of the outcome variable. Can be supplied as a string containing the name of the outcome variable or as the outcome variable itself. If not supplied, the outcome variable in the formula supplied to lmw(), if any, will be used.

data

an optional data frame containing the outcome variable named in outcome.

...

ignored.

Details

influence() computes the hat values, (weighted) residuals, and sample influence curve (SIC) values for each unit, which can be used as regression diagnostics to assess influence. The weighted residuals are weighted by the sampling weights (if supplied), not the implied regression weights. The SIC values are computed as SIC = (N-1) * w * r / (1 - h), where N is the sample size, w are the units' implied regression weights, r are the (weighted) residuals, and h are the hat values. SIC values are scaled to have a maximum of 1. Higher values indicate greater relative influence.
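The SIC formula can be illustrated on plain vectors. The following is a minimal sketch in base R, not the package's implementation: sic_sketch() is a hypothetical helper, the inputs are made-up numbers, and scaling by the largest absolute value is an assumption about how "a maximum of 1" is achieved.

```r
# Hypothetical helper (not part of lmw): applies the SIC formula from the
# Details text to plain numeric vectors of equal length.
sic_sketch <- function(w, r, h) {
  stopifnot(length(w) == length(r), length(r) == length(h), all(h < 1))
  N <- length(r)                      # sample size
  raw <- (N - 1) * w * r / (1 - h)    # unscaled SIC values
  # Assumption: divide absolute values by the largest one so the maximum is 1.
  abs(raw) / max(abs(raw))
}

# Toy inputs, invented for illustration only.
w <- c(0.8, 1.2, 1.0, 1.5, 0.5)       # implied regression weights
r <- c(-1.0, 0.5, 2.0, -0.2, 0.1)     # (weighted) residuals
h <- c(0.20, 0.10, 0.50, 0.30, 0.40)  # hat values

sic <- sic_sketch(w, r, h)
which.max(sic)  # the unit with the greatest relative influence
```

In this toy input the third unit combines the largest residual with the largest hat value, so it receives the highest scaled SIC value.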

Value

A list with the following components:

hat

a vector containing the diagonal of the hat matrix.

wt.res

a vector of (weighted) residuals.

sic

a vector containing the scaled SIC values.
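One way to work with the returned list is to rank units by their scaled SIC values and inspect the other components for the top few. This is a sketch assuming the lmw package is installed and that each component has one entry per unit; the model specification mirrors the one in the Examples section, and the choice of five units is arbitrary.

```r
library(lmw)
data("lalonde", package = "lmw")

lmw.out <- lmw(~ treat + age + education + race + married +
                 nodegree + re74 + re75,
               data = lalonde, estimand = "ATT",
               method = "URI", treat = "treat")
infl <- influence(lmw.out, outcome = "re78")

# Positions of the five units with the largest scaled SIC values
top <- order(infl$sic, decreasing = TRUE)[1:5]

# Show the three components side by side for those units
top_units <- data.frame(unit = top,
                        hat = infl$hat[top],
                        wt.res = infl$wt.res[top],
                        sic = infl$sic[top])
top_units
```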

Note

influence.lmw() uses non-standard evaluation to interpret its outcome argument. For programmers who wish to use influence.lmw() inside other functions, an effective way to pass the name of an arbitrary outcome (e.g., y passed as a string) is to use do.call(), for example:

fun <- function(m, y, d) {
  do.call("influence", list(m, y, d))
}

When using influence.lmw() inside lapply() or purrr::map() to loop over outcomes, the same do.call() syntax must be used.
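A loop over outcomes with lapply() might look like the sketch below. It assumes the lmw package is installed. Because lalonde provides only one natural outcome (re78), the second outcome, re78_pos, is a hypothetical variable created here purely for illustration and added to a copy of the data that is passed through the data argument.

```r
library(lmw)
data("lalonde", package = "lmw")

# Hypothetical second outcome, added only to have something to loop over
d <- lalonde
d$re78_pos <- as.numeric(d$re78 > 0)

lmw.out <- lmw(~ treat + age + education + race + married +
                 nodegree + re74 + re75,
               data = lalonde, estimand = "ATT",
               method = "URI", treat = "treat")

outcomes <- c("re78", "re78_pos")

# do.call() passes the outcome name as a string value, so the
# non-standard evaluation in influence.lmw() sees the actual name
# rather than the loop variable `y`
infl_list <- lapply(outcomes, function(y) {
  do.call("influence", list(lmw.out, y, d))
})
names(infl_list) <- outcomes

str(infl_list, max.level = 1)
```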

See Also

plot.lmw() for plotting the SIC values; lm.influence() for influence measures for lm objects, which do not include SIC values; hatvalues() for hat values for lm objects (note that lmw_est objects also have a hatvalues() method).

Examples


library("lmw")
data("lalonde")

# URI regression for ATT
lmw.out1 <- lmw(~ treat + age + education + race + married +
                     nodegree + re74 + re75,
                data = lalonde, estimand = "ATT",
                method = "URI", treat = "treat")

# Influence for re78 outcome
infl <- influence(lmw.out1, outcome = "re78")
str(infl)

# Can also be used after lmw_est():
lmw.est1 <- lmw_est(lmw.out1, outcome = "re78")
all.equal(infl,
          influence(lmw.est1))


ngreifer/lmw documentation built on Feb. 14, 2024, 10:53 p.m.