Reduction.Phase: Reduction by successive traversal of hypercubes proposed by...

View source: R/Reduction.Phase.R

Reduction.PhaseR Documentation

Reduction by successive traversal of hypercubes proposed by Cox, D. R. & Battey, H. S. (2017)

Description

This function traverses successively lower dimensional hypercubes, discarding variables according to the appropriate decision rules. It provides the number and indices of variables selected at each stage.

Usage

Reduction.Phase(X,Y,family=gaussian,
                dmHC=NULL,vector.signif=NULL,seed.HC = NULL, Cox.Hazard = FALSE)

Arguments

X

Design matrix.

Y

Response vector.

family

A description of the error distribution and link function to be used in the model. For glm this can be a character string naming a family function, a family function or the result of a call to a family function. See family for more details.

dmHC

Dimension of the hypercube to be used in the first-stage reduction. This version supports dimensions 2,3,4 and 5. If not specified a sensible value is calculated and used.

vector.signif

Vector of decision rules to be used at each stage of the reduction. The first value makes reference to the decision rule for the highest dimensional hypercube and so on. If values are less than 1, this specifies a significance level of a test. All variables significant at this level in at least half the analyses in which they appear will be retained. If the value is 1 or 2, variables are retained if they are among the 1 or 2 most significant in at least half the analyses in which they appear. If unspecified a default rule is used.

seed.HC

Seed for randomization of the variable indices in the hypercube. If not provided, the variables are arranged according to their original order.

Cox.Hazard

If TRUE fits proportional hazards regression model. The family argument will be ignored if Cox.Hazard=TRUE.

Value

Matrix.Selection

The number of variables selected at each reduction of the hypercube.

List.Selection

The indices of the variables retained through each stage of the reduction phase.

Acknowledgement

The work was supported by the UK Engineering and Physical Sciences Research Council under grant number EP/P002757/1.

Author(s)

Hoeltgebaum, H. H.

References

Cox, D. R. and Battey, H. S. (2017). Large numbers of explanatory variables, a semi-descriptive analysis. Proceedings of the National Academy of Sciences, 114(32), 8592-8595.

Battey, H. S. and Cox, D. R. (2018). Large numbers of explanatory variables: a probabilistic assessment. Proceedings of the Royal Society of London, A., 474(2215), 20170631.

Hoeltgebaum, H., & Battey, H. S. (2019). HCmodelSets: An R Package for Specifying Sets of Well-fitting Models in High Dimensions. The R Journal, 11(2), 370-379.

Examples


## Generates a random DGP
dgp = DGP(s=5, a=3, sigStrength=1, rho=0.9, n=100, intercept=5, noise=1,
          var=1, d=1000, DGP.seed = 2018)

#Reduction Phase using only the first 70 observations
outcome.Reduction.Phase =  Reduction.Phase(X=dgp$X[1:70,],Y=dgp$Y[1:70],
                                           family=gaussian, seed.HC = 1012)

# Not run, using vector.signif argument
# Fixing a decision rule of getting the 2 most significant in the first reduction
# and in the subsequent reduction, only those variables significant at 0.001 level
# outcome.Reduction.Phase =  Reduction.Phase(X=dgp$X[1:70,],Y=dgp$Y[1:70],
#                           vector.signif = c(2,0.001), family=gaussian, dmHC = 3)





HCmodelSets documentation built on March 31, 2023, 7:02 p.m.