ase_seq_logit: variable selection and stopping criterion

View source: R/ase_seq_logit.R

Description

ase_seq_logit determines the effective variables and whether to stop selecting samples.

Usage

ase_seq_logit(X, Y, intercept = FALSE, criterion = "BIC", d = 0.5,
  alpha = 0.95, gamma = 1, eta = 0.75, upper = 2, lower = 0.1,
  divid.num = 10)

Arguments

X

A data frame in which each row is a sample and each column is an independent variable.

Y

A numeric vector consisting of 0s and 1s. The length of Y must equal the number of rows of X.

intercept

A logical value indicating whether to add an intercept to the model. The default value is FALSE.

criterion

For the "chosfun" methods, a character string that determines the model selection criterion to be used, matching one of 'BIC' or 'AIC'. The default value is 'BIC'.

d

A numeric value specifying the length of the fixed-size confidence set for the model. Note that the smaller d is, the larger the required sample size and the longer the computation time. The default value is 0.5.

alpha

A numeric value setting the probability level for the chi-square quantile ak (see Value). The default value is 0.95.

gamma

A numeric value used together with eta to determine the effective variables. The default value is 1.

eta

A numeric value used together with gamma to determine the effective variables. The default value is 0.75.

upper

A numeric value used together with lower and divid.num to choose the right epsilon. The value of upper should be larger than lower. The default value is 2.

lower

A numeric value used together with upper and divid.num to choose the right epsilon. The default value is 0.1.

divid.num

A numeric value used together with upper and lower to choose the right epsilon. Note that it should be an integer. The default value is 10.
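
The arguments upper, lower and divid.num jointly define a set of candidate epsilons. The following is a minimal sketch of one way such a set could be built and searched. It assumes the candidates are evenly spaced between lower and upper and that the candidate with the lowest criterion value wins; neither is stated on this page. The names eps_grid, score and best_eps are hypothetical, and the score values are placeholders rather than package output.

```r
# Assumption: candidate epsilons are evenly spaced on [lower, upper].
# The spacing actually used by the package is not documented here.
lower <- 0.1
upper <- 2
divid.num <- 10

# Basic checks mirroring the documented constraints on the arguments.
stopifnot(upper > lower, divid.num == as.integer(divid.num))

eps_grid <- seq(lower, upper, length.out = divid.num)

# Hypothetical criterion values, one per candidate epsilon (illustrative only).
score <- c(5, 3, 3, 4, 6, 7, 8, 9, 9, 10)

# which.min() returns the index of the first minimum. Because eps_grid is
# increasing, a tie is resolved in favour of the smallest epsilon.
best_eps <- eps_grid[which.min(score)]
print(best_eps)
```

Here the second and third candidates tie, and the sketch keeps the second, which is the smaller epsilon.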

Details

ase_seq_logit estimates the logistic regression coefficients, determines the effective variables, and decides whether to stop selecting samples based on the current samples and their corresponding labels. The parameters 'upper', 'lower' and 'divid.num' are used to generate different epsilons. If different epsilons yield the same value, the smallest epsilon is chosen.
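
A call on simulated data might look like the sketch below. The data-generating setup (n, p, beta) is hypothetical and chosen only for illustration, and the call assumes that the seqest package is installed and exports ase_seq_logit; the code skips the call otherwise. The arguments passed are the documented defaults.

```r
set.seed(1)

# Hypothetical simulated data: 200 samples, 5 predictors, 2 with nonzero effect.
n <- 200
p <- 5
X <- as.data.frame(matrix(rnorm(n * p), nrow = n))
beta <- c(1.5, -2, 0, 0, 0)
prob <- as.vector(1 / (1 + exp(-as.matrix(X) %*% beta)))
Y <- rbinom(n, size = 1, prob = prob)

# Assumption: seqest is installed and exports ase_seq_logit.
# The call is skipped when that is not the case.
if (requireNamespace("seqest", quietly = TRUE) &&
    "ase_seq_logit" %in% getNamespaceExports("seqest")) {
  fit <- seqest::ase_seq_logit(X, Y, intercept = FALSE, criterion = "BIC",
                               d = 0.5, alpha = 0.95, gamma = 1, eta = 0.75,
                               upper = 2, lower = 0.1, divid.num = 10)
  str(fit)  # inspect the components of the returned list
}
```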

Value

a list containing the following components

N

current sample size

is_stopped

an indicator of whether the sequential procedure has stopped. A value of 1 means the iteration stops.

betahat

the estimated coefficients based on the current X and Y. Note that some coefficients will be zero; these correspond to non-effective variables and should be ignored.

cov

the covariance matrix between variables

phat

the number of effective variables.

ak

the 1-alpha quantile of the chi-square distribution with phat degrees of freedom

lamdmax

the maximum eigenvalue based on the covariance of data
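
The quantities ak and lamdmax can be illustrated with base R alone. The sketch below uses hypothetical inputs (alpha, phat and cov_mat are made-up values, not output of ase_seq_logit) and does not claim to reproduce the package's internal computation. In particular, the page does not say whether "1-alpha quantile" means the quantile at probability 1 - alpha or the point with upper-tail probability 1 - alpha, so both readings are shown.

```r
# Illustrative inputs (hypothetical values, not output of ase_seq_logit).
alpha <- 0.95
phat <- 2
cov_mat <- matrix(c(2.0, 0.5,
                    0.5, 1.0), nrow = 2, byrow = TRUE)

# Literal reading of the description of ak: quantile at probability 1 - alpha.
ak_literal <- qchisq(1 - alpha, df = phat)

# Alternative reading: quantile at probability alpha,
# i.e. the point with upper-tail probability 1 - alpha.
ak_upper <- qchisq(alpha, df = phat)

# Largest eigenvalue of a symmetric matrix.
# eigen() returns the eigenvalues sorted in decreasing order.
lamdmax <- eigen(cov_mat, symmetric = TRUE, only.values = TRUE)$values[1]

print(c(ak_literal = ak_literal, ak_upper = ak_upper, lamdmax = lamdmax))
```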


seqest documentation built on July 2, 2020, 2:28 a.m.
