# ase_seq_logit: variable selection and stopping criterion

In seqest: Sequential Method for Classification and Generalized Estimating Equations Problem

## Description

`ase_seq_logit` determines the effective variables and whether to stop selecting samples.

## Usage

```r
ase_seq_logit(X, Y, intercept = FALSE, criterion = "BIC", d = 0.5,
  alpha = 0.95, gamma = 1, eta = 0.75, upper = 2, lower = 0.1,
  divid.num = 10)
```

## Arguments

| Argument | Description |
| --- | --- |
| `X` | A data frame in which each row is a sample and each column is an independent variable. |
| `Y` | A numeric vector consisting of 0s and 1s. The length of `Y` must equal the number of rows of `X`. |
| `intercept` | A logical value indicating whether to add an intercept to the model. The default value is FALSE. |
| `criterion` | For the "chosfun" methods, a character string that determines the model selection criterion to be used, matching one of 'BIC' or 'AIC'. The default value is 'BIC'. |
| `d` | A numeric value specifying the length of the fixed-size confidence set for the model. The smaller `d` is, the larger the required sample size and the longer the running time. The default value is 0.5. |
| `alpha` | A numeric value used in the chi-square distribution. The default value is 0.95. |
| `gamma` | A numeric value used together with `eta` to determine the effective variables. The default value is 1. |
| `eta` | A numeric value used together with `gamma` to determine the effective variables. The default value is 0.75. |
| `upper` | A numeric value used together with `lower` and `divid.num` to choose the right epsilon. The value of `upper` should be larger than `lower`. The default value is 2. |
| `lower` | A numeric value used together with `upper` and `divid.num` to choose the right epsilon. The default value is 0.1. |
| `divid.num` | A numeric value used together with `upper` and `lower` to choose the right epsilon. It should be an integer. The default value is 10. |

## Details

`ase_seq_logit` estimates the logistic regression coefficients, determines the effective variables, and decides whether to stop selecting samples, based on the current samples and their corresponding labels. The parameters `upper`, `lower` and `divid.num` are used to generate different candidate epsilons. If different epsilons get the same value, we choose the smallest epsilon.
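As a minimal sketch of how the inputs described above might be prepared, the following simulates a small data set and calls the function with its documented defaults. The simulated data, the coefficient vector `beta_true` and the sample size are assumptions made for illustration only. The call is guarded and wrapped in `tryCatch` because this page does not confirm whether `ase_seq_logit` is exported or how strictly it checks its inputs.

```r
# Simulated data in the documented shape: X is a data frame, Y is a 0/1 vector.
set.seed(1)
n <- 200
p <- 5
X <- as.data.frame(matrix(rnorm(n * p), nrow = n, ncol = p))

# Hypothetical true coefficients: only the first two variables matter.
beta_true <- c(1.5, -2, 0, 0, 0)
prob <- 1 / (1 + exp(-as.matrix(X) %*% beta_true))
Y <- rbinom(n, size = 1, prob = as.vector(prob))

# Run only if seqest is installed; report (rather than stop on) any error.
if (requireNamespace("seqest", quietly = TRUE)) {
  fit <- tryCatch(
    seqest::ase_seq_logit(X, Y, intercept = FALSE, criterion = "BIC",
                          d = 0.5, alpha = 0.95, gamma = 1, eta = 0.75,
                          upper = 2, lower = 0.1, divid.num = 10),
    error = function(e) {
      message("ase_seq_logit call failed: ", conditionMessage(e))
      NULL
    }
  )
  if (!is.null(fit)) str(fit)
}
```

A smaller `d` asks for a tighter confidence set, so in a sequential setting it would typically take more samples before the stopping flag is raised.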

## Value

A list containing the following components:

| Component | Description |
| --- | --- |
| `N` | The current sample size. |
| `is_stopped` | The flag indicating whether the sequential procedure has stopped. A value of 1 means the iteration stops. |
| `betahat` | The estimated coefficients based on the current `X` and `Y`. Some coefficients will be zero; these correspond to the non-effective variables and should be ignored. |
| `cov` | The covariance matrix between variables. |
| `phat` | The number of effective variables. |
| `ak` | The 1-alpha quantile of the chi-square distribution with `phat` degrees of freedom. |
| `lamdmax` | The maximum eigenvalue based on the covariance of the data. |
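The sketch below shows one way the returned list might be read. The object `res` is a hand-written, hypothetical stand-in that mirrors a few of the documented components; its numbers are made up for illustration and are not output from the package. It assumes that `phat` counts exactly the non-zero entries of `betahat`, which may not hold in every configuration (for example when an intercept is included).

```r
# Hypothetical stand-in for a returned list; values are invented for illustration.
res <- list(
  N = 150,
  is_stopped = 1,
  betahat = c(1.42, -1.87, 0, 0, 0.63),
  phat = 3
)

# Effective variables are those with non-zero estimated coefficients.
effective <- which(res$betahat != 0)

# Assumed consistency check: phat should match the number of non-zero coefficients.
stopifnot(length(effective) == res$phat)

# is_stopped == 1 means the sequential procedure has stopped.
if (res$is_stopped == 1) {
  message("Stopped at N = ", res$N, "; effective variables: ",
          paste(effective, collapse = ", "))
} else {
  message("Not stopped at N = ", res$N, "; more samples are needed.")
}
```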

seqest documentation built on July 2, 2020, 2:28 a.m.