select_ic: Function selection procedure based on information criteria

View source: R/mfp_step.R

select_ic R Documentation

Function selection procedure based on information criteria

Description

Used in find_best_fp_step() when criterion = "aic" or "bic". For parameter explanations, see find_best_fp_step(). All parameters captured by ... are passed on to fit_model().

Usage

select_ic(
  x,
  xi,
  keep,
  degree,
  acdx,
  y,
  powers_current,
  powers,
  criterion,
  ftest,
  select,
  alpha,
  ...
)

select_ic_acd(
  x,
  xi,
  keep,
  degree,
  acdx,
  y,
  powers_current,
  powers,
  criterion,
  ftest,
  select,
  alpha,
  ...
)

Arguments

x

an input matrix of dimensions nobs x nvars. Does not contain intercept, but columns are already expanded into dummy variables as necessary. Data are assumed to be shifted and scaled.

xi

a character string indicating the name of the current variable of interest, for which the best fractional polynomial transformation is to be estimated in the current step.

keep

a character vector with names of variables to be kept in the model.

degree

integer > 0 giving the degree for the FP transformation.

acdx

a logical vector of length nvars indicating continuous variables to undergo the approximate cumulative distribution (ACD) transformation.

y

a vector for the response variable or a Surv object.

powers_current

a list of length equal to the number of variables, indicating the fp powers to be used in the current step for all variables (except xi).

powers

a named list of numeric values that sets the permitted FP powers for each covariate.

criterion

a character string defining the criterion used to select variables and FP models of different degrees.

ftest

a logical indicating the use of the F-test for Gaussian models.

select

a numeric value indicating the significance level for backward elimination of xi.

alpha

a numeric value indicating the significance level for tests between FP models of different degrees for xi.

...

passed to fitting functions.

Details

When an information criterion is used to select the best model, the selection procedure simply fits all relevant models and selects the best one according to the given criterion.

"Relevant" models for a given degree are the null model excluding the variable of interest, the linear model and all best FP models up to the specified degree.

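The procedure described above can be sketched in base R. This is a minimal illustration on simulated data, not the mfp2 implementation: the data, the helper names fp1() and fit_fp1(), the power set, and the choice to charge an estimated FP1 power one extra degree of freedom are all assumptions made for the example.

```r
# Illustrative sketch only: base R on simulated data, not mfp2 internals.
set.seed(1)
n <- 200
x <- runif(n, 0.5, 3)                      # positive covariate (simulated)
y <- 1 + 2 * log(x) + rnorm(n, sd = 0.3)   # simulated response

# Candidate FP powers (assumed set); power 0 denotes log(x) by convention.
fp_powers <- c(-2, -1, -0.5, 0, 0.5, 1, 2, 3)
fp1 <- function(x, p) if (p == 0) log(x) else x^p   # hypothetical helper

# Hypothetical helper: fit a Gaussian model with one FP1 term in x.
fit_fp1 <- function(p) {
  z <- fp1(x, p)
  lm(y ~ z)
}

# Best FP1 model: highest log-likelihood among all candidate powers.
fp1_fits <- lapply(fp_powers, fit_fp1)
loglik_fp1 <- vapply(fp1_fits, function(f) as.numeric(logLik(f)), numeric(1))
best_p <- fp_powers[which.max(loglik_fp1)]

# Relevant models for degree 1: null, linear, best FP1.
candidates <- list(
  null   = lm(y ~ 1),
  linear = fit_fp1(1),
  fp1    = fp1_fits[[which.max(loglik_fp1)]]
)

# Degrees of freedom spent on x. Assumption of this sketch: the estimated
# power costs one df. Parameters shared by all models (intercept, residual
# variance) are omitted because they do not affect the comparison.
df_x <- c(null = 0, linear = 1, fp1 = 2)

loglik <- vapply(candidates, function(f) as.numeric(logLik(f)), numeric(1))
aic <- -2 * loglik + 2 * df_x

# The selected model is the candidate with the smallest criterion value.
model_best <- names(which.min(aic))
```

Here model_best names the candidate with the smallest AIC. Replacing the penalty multiplier 2 with log(n) would give a BIC-based choice under the same assumptions.
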
If an ACD transformation is requested, the models assessed are the null model, the linear model in x and A(x), the best FP1 models in x and A(x), and the best FP1(x, A(x)) model.

Note that the "best" FPx models used in this function are the models that use an FPx transformation for the variable of interest and have the highest likelihood among all such models, given the current powers for all other variables, as outlined in Section 4.8 of Royston and Sauerbrei (2008). These best FPx models are computed in find_best_fpm_step(). Keep in mind that for a fixed number of degrees of freedom (i.e. fixed m), the model with the highest likelihood is also the model with the best value of any information criterion, since all such models share the same penalty term.
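
The equivalence between the highest likelihood and the best information criterion at fixed degree can be checked numerically. The sketch below uses simulated data and base R only (neither is taken from mfp2, and the power set is an assumption). It fits every FP1 candidate for one variable and compares the power with the highest log-likelihood to the powers with the lowest AIC and BIC.

```r
# Illustrative sketch only: simulated data, base R, not mfp2 internals.
set.seed(42)
n <- 150
x <- runif(n, 0.2, 4)
y <- 0.5 + 1.5 * sqrt(x) + rnorm(n, sd = 0.4)

# Candidate FP1 powers (assumed set); power 0 denotes log(x).
fp_powers <- c(-2, -1, -0.5, 0, 0.5, 1, 2, 3)

# One Gaussian model per candidate power; all have the same parameter count.
fits <- lapply(fp_powers, function(p) {
  z <- if (p == 0) log(x) else x^p
  lm(y ~ z)
})

loglik <- vapply(fits, function(f) as.numeric(logLik(f)), numeric(1))
aic <- vapply(fits, AIC, numeric(1))
bic <- vapply(fits, BIC, numeric(1))

# The penalty (criterion + 2 * logLik) is identical across the candidates,
# so ranking by likelihood and ranking by AIC or BIC give the same winner.
penalty_aic <- aic + 2 * loglik
same_winner <- which.max(loglik) == which.min(aic) &&
  which.max(loglik) == which.min(bic)
```

Because every candidate spends the same number of parameters, penalty_aic is constant and same_winner evaluates to TRUE.
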

When a variable is forced into the model by including it in keep, this function will not exclude it from the model (by setting its power to NA), but will only choose its functional form.
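
The effect of keep on the choice can be illustrated with a small hypothetical helper. pick_best(), the variable name "age", and the criterion values are invented for this sketch and are not part of mfp2. The idea is simply that the null model is removed from the candidate set when xi is listed in keep.

```r
# Hypothetical helper, not part of mfp2: choose the model with the smallest
# criterion value, never returning the null model for a forced variable.
pick_best <- function(ic, xi, keep) {
  if (xi %in% keep) {
    ic <- ic[names(ic) != "null"]   # forced variable: null is not a candidate
  }
  names(ic)[which.min(ic)]
}

# Made-up criterion values for three candidate models of a variable "age".
ic <- c(null = 100.2, linear = 101.5, fp1 = 102.1)

free_choice   <- pick_best(ic, "age", keep = character(0))  # "null"
forced_choice <- pick_best(ic, "age", keep = "age")         # "linear"
```

With no forced variables the null model wins and the variable would be dropped. With "age" in keep, the best remaining functional form is chosen instead.
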

Value

A list with several components:

  • keep: logical indicating if xi is forced into the model.

  • acd: logical indicating if an ACD transformation was applied for xi, i.e. FALSE in this case.

  • powers: (best) FP powers investigated in this step, indexing metrics. Ordered by increasing complexity, i.e. null, linear, FP1, FP2 and so on. For the ACD transformation, the order is null, linear, linear(., A(x)), FP1(x, .), FP1(., A(x)) and FP1(x, A(x)).

  • power_best: a numeric vector with the best power found. The returned best power may be NA, indicating the variable has been removed from the model.

  • metrics: a matrix with performance indices for all best models investigated. Same number of rows as, and indexed by, powers.

  • model_best: row index of best model in metrics.

  • pvalue: p-value for comparison of linear and null model, NA in this case.

  • statistic: test statistic used, depends on ftest, NA in this case.

Functions

  • select_ic_acd(): Function to select an ACD-based transformation.

See Also

select_ra2()


mfp2 documentation built on Nov. 15, 2023, 1:06 a.m.