fitTsfm.control: List of control parameters for 'fitTsfm'

fitTsfm.control    R Documentation

Description

Creates a list of control parameters for fitTsfm. Any control parameter that is not passed to this function is set to its default value. This function is meant for internal use only.

Usage

fitTsfm.control(
  decay = 0.95,
  weights,
  model = TRUE,
  x = FALSE,
  y = FALSE,
  qr = TRUE,
  nrep = NULL,
  bb = 0.5,
  efficiency = 0.95,
  family = "mopt",
  tuning.psi,
  tuning.chi,
  compute.rd = FALSE,
  corr.b = TRUE,
  split.type = "f",
  initial = "S",
  max.it = 100,
  refine.tol = 1e-07,
  rel.tol = 1e-07,
  refine.PY = 10,
  solve.tol = 1e-07,
  trace.lev = 0,
  psc_keep = 0.5,
  resid_keep_method = "threshold",
  resid_keep_thresh = 2,
  resid_keep_prop = 0.2,
  py_maxit = 20,
  py_eps = 1e-05,
  mscale_maxit = 50,
  mscale_tol = 1e-06,
  mscale_rho_fun = "bisquare",
  scope,
  scale,
  direction,
  steps = 1000,
  k = 2,
  nvmin = 1,
  nvmax = 8,
  force.in = NULL,
  force.out = NULL,
  method,
  really.big = FALSE,
  type,
  normalize = TRUE,
  eps = .Machine$double.eps,
  max.steps,
  plot.it = FALSE,
  lars.criterion = "Cp",
  K = 10
)

Arguments

decay

a scalar in (0, 1] to specify the decay factor for "DLS". Default is 0.95.

weights

an optional vector of weights to be used in the fitting process for fit.method="LS","Robust", or variable.selection="subsets". Should be NULL or a numeric vector. The length of weights must be the same as the number of observations. The weights must be nonnegative and it is strongly recommended that they be strictly positive.

model, x, y, qr

logicals passed to lm for fit.method="LS". If TRUE the corresponding components of the fit (the model frame, the model matrix, the response, the QR decomposition) are returned.

nrep

the number of random subsamples to be drawn for fit.method="Robust". If the data set is small and "Exhaustive" resampling is being used, the value of nrep is ignored.

bb

tuning constant (between 0 and 1/2) for the M-scale used to compute the initial S-estimator. It determines the robustness (breakdown point) of the resulting MM-estimator, which is bb. Defaults to 0.5.

efficiency

desired asymptotic efficiency of the final regression M-estimator. Defaults to 0.95.

family

string specifying the name of the family of loss functions to be used (current valid options are "bisquare", "opt" and "mopt" from the RobStatTM package). Incomplete entries will be matched to the current valid options. Defaults to "mopt".

tuning.psi

tuning parameters for the regression M-estimator computed with a rho function as specified with argument family. If missing, it is computed inside lmrobdet.control to match the value of efficiency according to the family of rho functions specified in family. Appropriate values for tuning.psi for a given desired efficiency for Gaussian errors can be constructed using the functions bisquare, mopt and opt.

tuning.chi

tuning constant for the function used to compute the M-scale used for the initial S-estimator. If missing, it is computed inside lmrobdet.control to match the value of bb according to the family of rho functions specified in family.

compute.rd

logical value indicating whether robust leverage distances need to be computed.

corr.b

logical value indicating whether a finite-sample correction should be applied to the M-scale parameter bb.

split.type

determines how categorical and continuous variables are split. See splitFrame.

initial

string specifying the initial value for the M-step of the MM-estimator. Valid options are 'S', for an S-estimator and 'MS' for an M-S estimator which is appropriate when there are categorical explanatory variables in the model.

max.it

maximum number of IRWLS iterations for the MM-estimator

refine.tol

relative convergence tolerance for the S-estimator

rel.tol

relative convergence tolerance for the IRWLS iterations for the MM-estimator

refine.PY

number of refinement steps for the Peña-Yohai candidates

solve.tol

relative tolerance for inversion

trace.lev

positive values (increasingly) provide details on the progress of the MM-algorithm

psc_keep

For pyinit, proportion of observations to remove based on PSCs. The effective proportion of removed observations is adjusted according to the sample size to be prosac*(1-p/n). See pyinit.

resid_keep_method

For pyinit, how to clean the data based on large residuals. If "threshold", all observations with scaled residuals larger than resid_keep_thresh (C.res in pyinit) will be removed; if "proportion", the proportion resid_keep_prop (prop in pyinit) of observations with the largest residuals will be removed. See pyinit.

resid_keep_thresh

See parameter resid_keep_method above. See pyinit.

resid_keep_prop

See parameter resid_keep_method above. See pyinit.

py_maxit

Maximum number of iterations. See pyinit.

py_eps

Relative tolerance for convergence. See pyinit.

mscale_maxit

Maximum number of iterations for the M-scale algorithm. See pyinit.

mscale_tol

Convergence tolerance for the M-scale algorithm. See pyinit.

mscale_rho_fun

String indicating the loss function used for the M-scale. See pyinit.

scope

defines the range of models examined in the "stepwise" search. This should be either a single formula, or a list containing components upper and lower, both formulae. See step for how to specify the formulae and usage.

scale

optional parameter for variable.selection="stepwise". The argument is passed to step or step.lmrobdetMM as appropriate.

direction

the mode of "stepwise" search, can be one of "both", "backward", or "forward", with a default of "both". If the scope argument is missing the default for direction is "backward".

steps

the maximum number of steps to be considered for "stepwise". Default is 1000 (essentially as many as required). It is typically used to stop the process early.

k

the multiple of the number of degrees of freedom used for the penalty in "stepwise". Only k = 2 gives the genuine AIC. k = log(n) is sometimes referred to as BIC or SBC. Default is 2.

nvmin

minimum size of subsets to examine for "subsets". Default is 1.

nvmax

maximum size of subsets to examine for "subsets". Default is 8.

force.in

index to columns of design matrix that should be in all models for "subsets". Default is NULL.

force.out

index to columns of design matrix that should be in no models for "subsets". Default is NULL.

method

one of "exhaustive", "forward", "backward" or "seqrep" (sequential replacement) to specify the type of subset search/selection. Required if variable.selection="subsets" is chosen. Default is "exhaustive".

really.big

option for "subsets"; must be TRUE to perform an exhaustive search on more than 50 variables.

type

option for "lars". One of "lasso", "lar", "forward.stagewise" or "stepwise". The names can be abbreviated to any unique substring. Default is "lasso".

normalize

option for "lars". If TRUE, each variable is standardized to have unit L2 norm, otherwise they are left alone. Default is TRUE.

eps

option for "lars"; an effective zero. Default is .Machine$double.eps.

max.steps

Limit on the number of steps taken for "lars"; the default is 8 * min(m, n-intercept), with m the number of variables and n the number of samples. For type="lar" or type="stepwise", the maximum number of steps is min(m, n-intercept). For type="lasso" and especially type="forward.stagewise", there can be many more terms, because although no more than min(m, n-intercept) variables can be active during any step, variables are frequently dropped and added as the algorithm proceeds. Although the default usually guarantees that the algorithm has proceeded to the saturated fit, users should check.

plot.it

option to plot the output for cv.lars. Default is FALSE.

lars.criterion

an option to assess model selection for the "lars" method; one of "Cp" or "cv". See details. Default is "Cp".

K

number of folds for computing the K-fold cross-validated mean squared prediction error for "lars". Default is 10.

trace

If positive (or, not FALSE), info is printed during the running of step, lars or cv.lars as relevant. Larger values may give more detailed information. Default is FALSE.
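
The effect of the penalty multiplier k on a "stepwise" search can be sketched with base R alone. The snippet below is only an illustration: it calls stats::step directly on the built-in mtcars data (a stand-in for factor-model data, not something fitTsfm uses) and compares k = 2 (AIC) with k = log(n) (BIC/SBC).

```r
# Illustration with base R only; mtcars is a stand-in data set (assumption).
full <- lm(mpg ~ wt + hp + qsec + drat, data = mtcars)
n <- nrow(mtcars)

# k = 2 gives the genuine AIC; k = log(n) gives the BIC/SBC penalty.
fit.aic <- step(full, direction = "backward", k = 2, trace = 0)
fit.bic <- step(full, direction = "backward", k = log(n), trace = 0)

# Compare the models retained under each penalty.
formula(fit.aic)
formula(fit.bic)
```

Because log(n) exceeds 2 for n of 8 or more, the BIC-style penalty charges more per parameter and tends to retain fewer terms.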

Details

This control function is used to process optional arguments passed via ... to fitTsfm. These arguments are validated and defaults are set if necessary before being passed internally to one of the following functions: lm, lmrobdetMM, step, regsubsets, lars and cv.lars. See their respective help files for more details. The arguments to each of these functions are listed above in approximately the same order for user convenience.
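
The validate-then-default pattern described above can be sketched as follows. toy_control is a hypothetical, heavily simplified stand-in written for this illustration; it is not the factorAnalytics implementation and covers only a handful of the arguments.

```r
# Hypothetical stand-in for a control function (NOT the package's code).
toy_control <- function(decay = 0.95, nvmin = 1, nvmax = 8,
                        lars.criterion = "Cp", K = 10) {
  # decay must be a single number in (0, 1]
  if (!is.numeric(decay) || length(decay) != 1 || decay <= 0 || decay > 1) {
    stop("'decay' must be a scalar in (0, 1]")
  }
  # subset sizes must be ordered
  if (nvmin > nvmax) {
    stop("'nvmin' must not exceed 'nvmax'")
  }
  # restrict the criterion to the documented choices
  lars.criterion <- match.arg(lars.criterion, c("Cp", "cv"))
  list(decay = decay, nvmin = nvmin, nvmax = nvmax,
       lars.criterion = lars.criterion, K = K)
}

# Arguments that are not supplied fall back to their defaults.
ctrl <- toy_control(nvmin = 2, lars.criterion = "cv")
str(ctrl)
```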

The scalar decay is used by fitTsfm to compute exponentially decaying weights for fit.method="DLS". Alternatively, one can directly specify weights, a weights vector, to be used with "LS" or "Robust". Especially when fitting multiple assets, care should be taken to ensure that the length of the weights vector matches the number of observations (excluding cases ignored due to NAs).
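
As a rough sketch of exponentially decaying weights, the snippet below assumes that observation t of n receives weight decay^(n - t), rescaled to sum to one. This form, the simulated data and the direct call to lm are assumptions made for illustration; the exact weighting used inside fitTsfm may differ.

```r
# Assumption: weight for observation t of n is decay^(n - t), normalized.
set.seed(1)
n <- 60
decay <- 0.95
x <- rnorm(n)
y <- 0.5 * x + rnorm(n, sd = 0.1)

w <- decay^((n - 1):0)   # oldest observation gets the smallest weight
w <- w / sum(w)          # rescale so the weights sum to one

# Weighted least squares; length(w) must equal the number of observations used.
fit.w <- lm(y ~ x, weights = w)
coef(fit.w)
```

With decay = 1 all weights are equal and the fit reduces to ordinary least squares.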

lars.criterion selects the criterion (one of "Cp" or "cv") used to determine the best fitted model for variable.selection="lars". The "Cp" statistic (defined on page 17 of Efron et al. (2004)) is calculated using summary.lars, whereas "cv" computes the K-fold cross-validated mean squared prediction error using cv.lars.
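
The two criteria can be compared directly with the lars package, as in the sketch below. It uses simulated data and calls lars, summary and cv.lars itself; how fitTsfm turns these results into a final model is not shown here, and the minimum-value selection rules are assumptions made for illustration.

```r
# Sketch only; runs when the 'lars' package is installed. Data are simulated.
if (require("lars", quietly = TRUE)) {
  set.seed(42)
  n <- 100
  m <- 5
  x <- matrix(rnorm(n * m), n, m)
  y <- drop(x %*% c(1, 0.5, 0, 0, 0)) + rnorm(n, sd = 0.5)

  fit.lars <- lars(x, y, type = "lasso", normalize = TRUE)

  # "Cp": take the step with the smallest Cp in the summary table
  cp.tab <- summary(fit.lars)
  best.cp.step <- which.min(cp.tab$Cp)

  # "cv": K-fold cross-validated mean squared prediction error
  cv.out <- cv.lars(x, y, K = 10, type = "lasso", plot.it = FALSE)
  best.cv.index <- cv.out$index[which.min(cv.out$cv)]
}
```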

Value

A list of the above components. This is only meant to be used by fitTsfm.

Author(s)

Sangeetha Srinivasan

References

Efron, B., Hastie, T., Johnstone, I., & Tibshirani, R. (2004). Least angle regression. The Annals of Statistics, 32(2), 407-499.

See Also

fitTsfm, lm, lmrobdetMM, step, regsubsets, lars and cv.lars

Examples

## Not run: 
# check argument list passed by fitTsfm.control
tsfm.ctrl <- fitTsfm.control(method="exhaustive", nvmin=2)
print(tsfm.ctrl)

## End(Not run)

# used internally by fitTsfm in the example below
# load data
data(managers, package = 'PerformanceAnalytics')
# make syntactically valid column names
colnames(managers)
colnames(managers) <- make.names(colnames(managers))
colnames(managers)

# method and nvmin are passed via ... to fitTsfm.control
fit <- fitTsfm(asset.names = colnames(managers[, 1:6]),
               factor.names = colnames(managers[, 7:9]),
               data = managers, variable.selection = "subsets",
               method = "exhaustive", nvmin = 2)


braverock/factorAnalytics documentation built on Dec. 16, 2024, 1:05 p.m.