plot_fit_residuals: plot_fit_residuals

View source: R/plot_fit_residuals.R

plot_fit_residuals    R Documentation

plot_fit_residuals

Description

Creates a scatter plot of residuals (y axis) against fitted values (x axis).

Returns a ggplot2 scatter plot object that can be further modified.

When standardized residuals are requested, each raw residual is divided by the standard deviation of the residuals. The square root of the absolute value of each standardized residual is then plotted against its corresponding fitted value.
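The standardization described above can be sketched in base R. This is a minimal illustration, not the package's internal code: the residual values are made up, and stats::sd() is assumed as a stand-in for whatever standard deviation estimate the function uses.

```r
# Hypothetical residuals from some fitted model (illustrative values only)
residual_v <- c(-2.5, 1.0, 0.5, -0.8, 3.1, -1.3)

# Divide each raw residual by the standard deviation of the residuals.
# Assumption: stats::sd() stands in for the estimator used by the package.
standardized_v <- residual_v / stats::sd(residual_v)

# Square root of the absolute standardized residual: the quantity placed on
# the y axis when residual_standardized = TRUE
sqrt_abs_std_v <- sqrt(abs(standardized_v))
```

Taking the square root of the absolute value folds negative residuals onto the positive side, which makes a change in spread across the fitted values easier to see.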

Usage

plot_fit_residuals(
  fitted_v,
  residual_v,
  id_v = NULL,
  residual_standardized = FALSE,
  label_threshold = NULL,
  label_color = "red",
  label_sd = NULL,
  title = NULL,
  subtitle = NULL,
  x_title = NULL,
  y_title = NULL,
  rot_y_tic_label = FALSE,
  x_limits = NULL,
  x_major_breaks = waiver(),
  x_minor_breaks = waiver(),
  x_labels = waiver(),
  x_log10 = FALSE,
  y_limits = NULL,
  y_major_breaks = waiver(),
  y_minor_breaks = waiver(),
  y_labels = waiver(),
  y_log10 = FALSE,
  axis_text_size = 11,
  pts_color = "black",
  pts_fill = "white",
  pts_shape = 21,
  pts_stroke = 1,
  pts_alpha = 1,
  pts_size = 1,
  trend_line = TRUE,
  trend_line_color = "red",
  trend_line_size = 1,
  zero_line_color = "blue",
  zero_line_width = 1.4,
  show_major_grids = TRUE,
  show_minor_grids = TRUE
)

Arguments

fitted_v

A required numeric vector of fitted values.

residual_v

A required numeric vector of corresponding residual values.

id_v

An optional numeric/string vector that labels the fit/residual pairs. If this argument is NULL then the fit/residual pairs are numbered for identification.

residual_standardized

A logical which, if TRUE, divides the raw residuals by their estimated standard deviation. The square root of the absolute value of each standardized residual is then plotted against its corresponding fitted value.

label_threshold

A numeric that sets the residual threshold beyond which observations will be labeled with their id.

label_color

A string that sets the label/point color for observations whose absolute residual is greater than the 'label_threshold'.

label_sd

A numeric that sets the number of residual standard deviations, above and below zero, at which a pair of horizontal dotted lines is drawn. Typical values are 1 or 2 standard deviations.

title

A string that sets the plot title.

subtitle

A string that sets the plot subtitle.

x_title

A string that sets the x axis title. If NULL (the default) then the x axis title does not appear.

y_title

A string that sets the y axis title. If NULL (the default) then the y axis title does not appear.

rot_y_tic_label

A logical which if TRUE rotates the y tic labels 90 degrees for enhanced readability.

x_limits

Depending on the class of 'fitted_v', a numeric/Date/POSIXct 2 element vector that sets the minimum and maximum for the x axis. Use NA to refer to the existing minimum and maximum.

x_major_breaks

Depending on the class of 'fitted_v', a numeric/Date/POSIXct vector or function that defines the exact major tic locations along the x axis.

x_minor_breaks

Depending on the class of 'fitted_v', a numeric/Date/POSIXct vector or function that defines the exact minor tic locations along the x axis.

x_labels

A character vector with the same length as 'x_major_breaks', that labels the major tics.

x_log10

A logical which if TRUE will use a log10 scale for the x axis.

y_limits

A numeric 2 element vector that sets the minimum and maximum for the y axis. Use NA to refer to the existing minimum and maximum.

y_major_breaks

A numeric vector or function that defines the exact major tic locations along the y axis.

y_minor_breaks

A numeric vector or function that defines the exact minor tic locations along the y axis.

y_labels

A character vector with the same length as 'y_major_breaks', that labels the major tics.

y_log10

A logical which if TRUE will use a log10 scale for the y axis.

axis_text_size

A numeric that sets the font size of the text along both axes. Default is 11.

pts_color

A string that sets the color of the points.

pts_fill

A string that sets the fill color of the points.

pts_shape

A numeric integer that sets the shape of the points. Typical values are 21 “circle”, 22 “square”, 23 “diamond”, 24 “up triangle”, 25 “down triangle”.

pts_stroke

A numeric that sets the drawing width for a point shape.

pts_alpha

A numeric value that sets the alpha level of 'pts_color'.

pts_size

A numeric value that sets the size of the points.

trend_line

A logical which, if TRUE, plots a polynomial-based trend line across the residuals.

trend_line_color

A string that sets the color of the trend line.

trend_line_size

A numeric that sets the width of the trend line.

zero_line_color

A string that sets the color of the zero horizontal reference line.

zero_line_width

A numeric that sets the width of the zero horizontal reference line.

show_major_grids

A logical that controls the appearance of major grids.

show_minor_grids

A logical that controls the appearance of minor grids.
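As a rough illustration of how 'label_threshold' and 'label_sd' relate to the data, the base R sketch below works out which observations would be labeled and where the pair of dotted lines would fall. The data values are hypothetical, and the use of stats::sd() for the residual standard deviation is an assumption rather than the package's confirmed method.

```r
# Hypothetical residuals and ids (illustrative values only)
residual_v <- c(-35, 120, 15, -140, 60, -20)
id_v       <- seq_along(residual_v)  # default numbering used when id_v is NULL

label_threshold <- 100
label_sd        <- 1.0

# Observations whose absolute residual exceeds the threshold are labeled
labeled_ids <- id_v[abs(residual_v) > label_threshold]
labeled_ids
# → 2 4

# y positions of the pair of horizontal dotted lines (assumes stats::sd())
sd_lines <- c(-1, 1) * label_sd * stats::sd(residual_v)
```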

Value

A ggplot2 object that plots residuals against fitted values.
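Because the return value is a ggplot2 object, it can be extended with the usual `+` operator. The sketch below uses a stand-in plot built directly with ggplot2 so that it runs without RregressPkg installed; the data frame 'resid_df' and its values are hypothetical, and the stand-in is not meant to reproduce the function's exact output.

```r
library(ggplot2)

# Stand-in for the object returned by plot_fit_residuals().
# Data values are illustrative only.
resid_df <- data.frame(
  fitted   = c(210, 305, 280, 415, 350, 260),
  residual = c(-35, 120, 15, -140, 60, -20)
)
a_plot <- ggplot(resid_df, aes(x = fitted, y = residual)) +
  geom_point(shape = 21, color = "black", fill = "white") +
  geom_hline(yintercept = 0, color = "blue")

# Further modification: add a caption and left-align it
b_plot <- a_plot +
  labs(caption = "Caption added after the plot was created") +
  theme(plot.caption = element_text(hjust = 0))
```

The same pattern should apply to the object returned by plot_fit_residuals(): assign it, then add ggplot2 layers, labels, or theme settings as needed.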

Examples

library(wooldridge)
library(ggplot2)
library(data.table)
library(RplotterPkg)
library(RregressPkg)

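# Keep only the columns used in the model.
# Note: the `_` placeholder as the head of an extraction requires R >= 4.3.
hprice1_dt <- data.table::as.data.table(wooldridge::hprice1) |>
  _[, .(price, lotsize, sqrft, bdrms)]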

housing_price_lm <- price ~ lotsize + sqrft + bdrms
housing_price_ols <- RregressPkg::ols_calc(
  df = hprice1_dt,
  formula_obj = housing_price_lm
)
a_plot <- RregressPkg::plot_fit_residuals(
  fitted_v = housing_price_ols$fitted_vals,
  residual_v = housing_price_ols$residual_vals,
  subtitle = "Data from housing prices",
  x_title = "Fitted",
  y_title = "Residuals",
  trend_line = FALSE,
  zero_line_color = "darkorange",
  zero_line_width = 0.8,
  label_threshold = 100,
  label_sd = 1.0
)


deandevl/RregressPkg documentation built on Feb. 5, 2025, 12:11 p.m.