sequentialOR: Sequential outlier rejection

View source: R/sequentialOR.R

Description

Iteratively fit a model to data and remove the observation(s) with the most extreme residual(s), refitting after each removal. Three model types are implemented: ordinary least squares (lm), generalized linear model (glm), and generalized additive model (gam).
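The loop described above can be sketched in base R for the 'lm' case. This is a minimal illustration under stated assumptions, not the package's implementation: the helper name `sketch_sor`, the column name `reject.order`, and the names of the returned list elements are hypothetical, one observation is rejected per iteration from both tails, and the data are assumed to contain no missing values.

```r
# Hypothetical sketch of sequential outlier rejection with lm().
# Not the TLUtilities code; assumes no NA values in 'data'.
sketch_sor <- function(data, formula, n.stop) {
  reject.order <- rep(NA_integer_, nrow(data))
  keep <- seq_len(nrow(data))  # row indices still used for fitting
  trace <- data.frame(n = integer(0), rmse = numeric(0))
  step <- 0L
  while (length(keep) > n.stop) {
    fit <- lm(formula, data = data[keep, , drop = FALSE])
    res <- residuals(fit)
    # Record the RMSE for the current set of observations
    trace <- rbind(trace,
                   data.frame(n = length(keep), rmse = sqrt(mean(res^2))))
    # Reject the observation with the largest absolute residual
    worst <- which.max(abs(res))
    step <- step + 1L
    reject.order[keep[worst]] <- step
    keep <- keep[-worst]
  }
  data$reject.order <- reject.order
  list(data = data, rmse = trace)
}

# Simulated data with two planted outliers (rows 5 and 20)
set.seed(1)
d <- data.frame(x = 1:30)
d$y <- 2 * d$x + rnorm(30)
d$y[c(5, 20)] <- d$y[c(5, 20)] + c(40, -60)
out <- sketch_sor(d, y ~ x, n.stop = 25)
```

In this sketch the RMSE cannot increase from one iteration to the next, because each step removes the largest squared residual before refitting. A sharp drop followed by a plateau in `out$rmse` is the kind of break point the plot of observations versus RMSE is meant to reveal.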

Usage

sequentialOR(
  data,
  method = "lm",
  formula,
  n.reject = 1,
  n.stop,
  threshold.stop = NULL,
  tail = "both",
  plot = T,
  progress.plot = F,
  ...
)

Arguments

data

Data frame containing predictor and response variables.

method

Character vector of length one indicating which function should be used to fit models. Options are 'lm', 'glm', or 'gam'.

n.reject

Numerical vector of length one. Number of observations to be rejected in each iteration. Larger values speed the function up, but may impede precise detection of break points.

n.stop

Numerical vector of length one indicating either the number of observations or the proportion of initial observations that should remain when outlier rejection stops. For example, with 400 initial observations, n.stop set to 40 or 0.1 (400 * 0.1 = 40) would stop the algorithm when 40 observations remain.
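The two ways of specifying n.stop can be illustrated with a small helper. This is only a sketch: the name `resolve_n_stop` is hypothetical, and the rule that values below 1 are treated as proportions is an assumption, since the documentation does not state how the function tells the two forms apart.

```r
# Hypothetical helper: convert n.stop to a count of remaining observations.
# Assumption: values below 1 are proportions of the initial observations.
resolve_n_stop <- function(n.stop, n.obs) {
  if (n.stop < 1) round(n.stop * n.obs) else n.stop
}

count_form <- resolve_n_stop(40, 400)   # given as a count
prop_form  <- resolve_n_stop(0.1, 400)  # given as a proportion
```

Both calls resolve to the same stopping point of 40 remaining observations, matching the example above.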

tail

Character vector of length one indicating whether to reject from the lower tail, upper tail, or both tails. Default = "both".

plot

Logical vector indicating whether the function should print a plot of observations versus RMSE.
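How the tail argument might change which residual is rejected can be sketched as follows. The helper `pick_reject` is hypothetical, and the option strings "upper" and "lower" are assumptions; only "both" is confirmed as a valid value by the documentation.

```r
# Hypothetical sketch: choose which residual to reject for a given tail.
# "upper" and "lower" are assumed option names, not confirmed by the docs.
pick_reject <- function(res, tail = "both") {
  switch(tail,
         both  = which.max(abs(res)),  # largest absolute residual
         upper = which.max(res),       # largest positive residual
         lower = which.min(res),       # most negative residual
         stop("unknown tail: ", tail))
}

res <- c(-3, 0.5, 2, -0.2)
i_both  <- pick_reject(res, "both")
i_upper <- pick_reject(res, "upper")
i_lower <- pick_reject(res, "lower")
```

With these residuals, rejecting from both tails or from the lower tail selects the first observation, while rejecting from the upper tail selects the third.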

...

Additional arguments passed to the model-fitting function (e.g., family, offset).

formula

Formula to be passed to the model. The formula must be valid for the selected method. Variables must be included in 'data', and the extraction operator ($) should not be used in the formula.

Value

The function returns a list with two elements: (1) the input data frame with an additional column indicating the order in which observations were rejected, and (2) a data frame containing the number of observations used for model fitting and the associated RMSE. If 'plot = T', the function also prints a plot of the number of remaining observations versus RMSE.

References

Kotwicki, S., M. H. Martin, and E. A. Laman. 2011. Improving area swept estimates from bottom trawl surveys. Fisheries Research 110(1):198–206. Elsevier B.V.


sean-rohan/TLUtilities documentation built on Sept. 30, 2021, 2:34 a.m.