eda_rline: Tukey's resistant line

View source: R/eda_rline.R

eda_rline    R Documentation

Tukey's resistant line

Description

eda_rline is an R implementation of Hoaglin, Mosteller and Tukey's resistant line technique outlined in chapter 5 of "Understanding Robust and Exploratory Data Analysis" (Wiley, 1983).

Usage

eda_rline(dat, x, y, px = 1, py = 1, tukey = FALSE, maxiter = 20)

Arguments

dat

Data frame.

x

Column assigned to the x axis.

y

Column assigned to the y axis.

px

Power transformation to apply to the x-variable.

py

Power transformation to apply to the y-variable.

tukey

Boolean determining if a Tukey transformation should be adopted (FALSE adopts a Box-Cox transformation).

maxiter

Maximum number of iterations to run.
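
To illustrate what the px, py and tukey arguments control, the following is a minimal sketch of the two power transformation families as they are commonly defined. The helper name power_sketch is hypothetical and is not part of the package; the package's own handling (for example, of a power of 0) may differ.

```r
# Hypothetical helper for illustration; not part of tukeyedar.
# Assumes x is strictly positive and that p = 0 means a log transformation.
power_sketch <- function(x, p = 1, tukey = FALSE) {
  if (p == 0) {
    return(log(x))
  }
  if (tukey) {
    # Tukey family: negate negative powers so the ordering of x is preserved
    if (p > 0) x^p else -(x^p)
  } else {
    # Box-Cox family: rescaled so the transformation is continuous at p = 0
    (x^p - 1) / p
  }
}

power_sketch(c(1, 4, 9), p = 0.5, tukey = TRUE)  # square root
power_sketch(c(1, 4, 9), p = 0.5)                # Box-Cox with p = 0.5
power_sketch(c(1, 4, 9), p = 0)                  # natural log in both families
```

Both families straighten or re-express the data in the same way for a given power; they differ only in location and scale, which affects the fitted intercept and slope but not the shape of the relationship.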

Details

This is an R implementation of the RLIN.F FORTRAN code in Velleman and Hoaglin's book. The function fits a robust line using a three-point summary strategy: the data are split along the x-axis into three groups of roughly equal size, and a line is fitted to the medians of each group via an iterative process. The fitting strategy should mirror that of the built-in stats::line function, but eda_rline returns additional parameters.


See the accompanying vignette Resistant Line for a detailed breakdown of the resistant line technique.
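
The three-point summary strategy described above can be sketched in base R as follows. This is a simplified illustration, not the package's code: the helper name rline_sketch, the tol argument, the simple slope-correction loop, the median-residual intercept and the handling of ties are all assumptions made for this sketch, and RLIN.F and eda_rline may differ in each of these details.

```r
# Hypothetical helper for illustration; not the tukeyedar implementation.
rline_sketch <- function(x, y, maxiter = 20, tol = 1e-8) {
  ord <- order(x)
  x <- x[ord]
  y <- y[ord]
  n <- length(x)

  # Split the sorted positions into three groups of roughly equal size.
  # Assumption: ties in x are ignored when forming the groups.
  grp <- split(seq_len(n), cut(seq_len(n), breaks = 3, labels = FALSE))
  left <- grp[[1]]
  right <- grp[[3]]
  dx <- median(x[right]) - median(x[left])

  b <- 0
  iter <- 0
  for (i in seq_len(maxiter)) {
    iter <- i
    res <- y - b * x
    # Slope of the current residuals between the outer-third medians
    delta <- (median(res[right]) - median(res[left])) / dx
    b <- b + delta
    if (abs(delta) < tol) break
  }

  # Intercept taken as the median of y - b * x (an assumption of this sketch)
  a <- median(y - b * x)
  list(a = a, b = b, res = y - (a + b * x), x = x, y = y, iter = iter)
}

# Exactly linear data (y = 2 + 3x) with one gross outlier in the last point
x <- 1:9
y <- 2 + 3 * x
y[9] <- 100
fit <- rline_sketch(x, y)
fit$a
fit$b

# Built-in counterpart, for comparison
coef(stats::line(x, y))
```

Because only the medians of the outer thirds drive the slope, the single outlier leaves the sketch's fitted line unchanged and shows up instead as one large residual.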

Value

Returns a list of class eda_rline with the following named components:

  • a: Intercept

  • b: Slope

  • res: Residuals sorted on x-values

  • x: Sorted x values

  • y: y values following sorted x-values

  • xmed: Median x values for each third

  • ymed: Median y values for each third

  • index: Index of sorted x values defining the upper boundary of each third

  • xlab: X label name

  • ylab: Y label name

  • iter: Number of iterations

References

  • Velleman, P. F., and D. C. Hoaglin. 1981. Applications, Basics and Computing of Exploratory Data Analysis. Boston: Duxbury Press.

  • D. C. Hoaglin, F. Mosteller, and J. W. Tukey. 1983. Understanding Robust and Exploratory Data Analysis. Wiley.

Examples


# This first example uses breast cancer data from "ABC's of EDA", page 127.
# The output model's parameters should closely match: Y = -46.19 + 2.89X
# The plots show the original data with a fitted resistant line (red)
# and a regular lm fitted line (dashed line), and the modeled residuals.
# The 3-point summary dots are shown in red.

M <- eda_rline(neoplasms, Temp, Mortality)
M

# Plot the output (red line is the resistant line)
plot(M)

# Add a traditional OLS regression line (dashed line)
abline(lm(Mortality ~ Temp, neoplasms), lty = 3)

# Plot the residuals
plot(M, type = "residuals")

# This next example uses Andrew Siegel's pathological 9-point dataset to test
# for model stability when convergence cannot be reached.
M <- eda_rline(nine_point, X, Y)
plot(M)


mgimond/tukeyedar documentation built on July 29, 2024, 9:16 a.m.