| pyinit | R Documentation | 
Computes the PY initial estimates for S-estimates of regression.
pyinit(
  x,
  y,
  intercept = TRUE,
  delta = 0.5,
  cc,
  maxit = 10,
  psc_keep,
  resid_keep_method = c("threshold", "proportion"),
  resid_keep_prop,
  resid_keep_thresh,
  eps = 1e-08,
  mscale_maxit = 200,
  mscale_tol = eps,
  mscale_rho_fun = c("bisquare", "huber", "gauss")
)
| x | a matrix with the data, each observation in a row. | 
| y | the response vector. | 
| intercept | logical, should an intercept be included in the model? Defaults to
 | 
| delta, cc | parameters for the M-scale estimator equation. If  | 
| maxit | the maximum number of iterations to perform. | 
| psc_keep | proportion of observations to keep based on PSCs. | 
| resid_keep_method | how to clean the data based on large residuals.
If  | 
| resid_keep_prop, resid_keep_thresh | see parameter
 | 
| eps | the relative tolerance for convergence. Defaults to  | 
| mscale_maxit | maximum number of iterations allowed for the M-scale
algorithm. Defaults to  | 
| mscale_tol | convergence threshold for the m-scale | 
| mscale_rho_fun | A string containing the name of the rho
function to use for the M-scale. Valid options
are  | 
| coefficients | numeric matrix with coefficient vectors in columns. These
are regression estimators based on "cleaned" subsets of the data. The
M-scales of the corresponding residuals are returned in the entry
 | 
| objective | vector of values of the M-scale estimate of the residuals
associated with each vector of regression coefficients in the columns of
 | 
Pena, D., & Yohai, V.. (1999). A Fast Procedure for Outlier Diagnostics in Large Regression Problems. Journal of the American Statistical Association, 94(446), 434-445. <doi:10.2307/2670164>
# generate a simple synthetic data set for a linear regression model # with true regression coefficients all equal to one "(1, 1, 1, 1, 1)" set.seed(123) x <- matrix(rnorm(100*4), 100, 4) y <- rnorm(100) + rowSums(x) + 1 # add masked outliers a <- svd(var(x))$v[,4] x <- rbind(x, t(outer(a, rnorm(20, mean=4, sd=1)))) y <- c(y, rnorm(20, mean=-2, sd=.2)) # these outliers are difficult to find plot(lm(y~x), ask=FALSE) # use pyinit to obtain estimated regression coefficients tmp <- pyinit(x=x, y=y, resid_keep_method='proportion', psc_keep = .5, resid_keep_prop=.5) # the vector of regression coefficients with smallest residuals scale # is returned in the first column of the "coefficients" element tmp$coefficients[,1] # compare that with the LS estimator on the clean data coef(lm(y~x, subset=1:100)) # compare it with the LS estimator on the full data coef(lm(y~x))
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.