glm.bsreg: Variable selection in generalised linear regression models...

Backward selection with generalised linear regression models (R Documentation)

Variable selection in generalised linear regression models with backward selection

Description

Performs variable selection in generalised linear regression models using backward selection.

Usage

glm.bsreg(target, dataset, threshold = 0.05, wei = NULL, test = NULL)
glm.bsreg2(target, dataset, threshold = 0.05, wei = NULL, test = NULL)

Arguments

target

The class variable. Provide either an integer, a numeric value, or a factor. It can also be a matrix with two columns for the case of binomial regression. In this case, the first column is the number of successes and the second column is the number of trials. See also the Details.

dataset

The dataset; provide either a data frame or a matrix (columns = variables, rows = observations). In either case, only two cases are available: either all data are continuous or all data are categorical.

threshold

Threshold (suitable values are in (0, 1)) for assessing the significance of p-values. The default value is 0.05.

wei

A vector of weights to be used for weighted regression. The default value is NULL. An example where weights are used is a survey in which stratified sampling has occurred.

test

For "glm.bsreg" this can be "testIndLogistic", "testIndPois", "testIndBinom", "testIndReg" or "testIndMMReg". For "glm.bsreg2" this can be "testIndGamma", "testIndNormLog", "testIndQPois" or "testIndQBinom".
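
The two-column form of target described above can be built with base R as in the sketch below. The data and the variable names (trials, successes, failures) are hypothetical and serve only as an illustration. Note that base R's glm expects cbind(successes, failures) for a binomial response, which differs from the (successes, trials) layout documented here.

```r
set.seed(1)

# Hypothetical binomial data: 50 observations
trials    <- rpois(50, 20) + 1           # number of trials, at least 1
successes <- rbinom(50, trials, 0.4)     # number of successes, at most trials

# Layout documented for glm.bsreg: column 1 = successes, column 2 = trials
target <- cbind(successes, trials)

# Layout expected by base R's glm(): successes and failures
failures    <- trials - successes
target_base <- cbind(successes, failures)
```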

Details

This function currently implements only linear, binomial, binary logistic and Poisson regression. If the sample size is less than the number of variables, a message will appear and no backward regression is performed.
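
To make the idea of backward selection concrete, here is a minimal sketch using base R only. It is not the MXM source code: the function name backward_sketch is hypothetical, and the likelihood-ratio test from drop1 is a stand-in for whichever conditional independence test the package applies. The sketch starts from the full model, repeatedly removes the variable with the largest p-value, and stops once every remaining variable has a p-value at or below the threshold.

```r
# Illustrative backward elimination with base R; an assumption-laden sketch,
# not the algorithm as implemented in MXM.
backward_sketch <- function(y, x, threshold = 0.05, family = poisson()) {
  dat  <- data.frame(y = y, x)           # unnamed matrix columns become X1, X2, ...
  vars <- setdiff(names(dat), "y")

  # Mirror the documented behaviour: no selection when n < number of variables
  if (nrow(dat) < length(vars)) {
    message("Fewer observations than variables; no backward selection performed.")
    return(NULL)
  }

  while (length(vars) > 0) {
    fit <- glm(reformulate(vars, response = "y"), data = dat, family = family)
    # Likelihood-ratio test for dropping each variable in turn
    tab   <- drop1(fit, test = "Chisq")
    pvals <- tab[vars, "Pr(>Chi)"]
    worst <- which.max(pvals)
    if (pvals[worst] <= threshold) break  # all remaining variables are significant
    vars <- vars[-worst]                  # remove the least significant variable
  }
  vars
}

# Usage on simulated data in which only the first variable affects the target
set.seed(123)
x <- matrix(runif(200 * 5, 1, 100), ncol = 5)
y <- rpois(200, exp(0.5 + 0.02 * x[, 1]))
kept <- backward_sketch(y, x)
```

Refitting the model after every removal keeps each p-value conditional on the variables still in the model, which is the defining feature of backward selection.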

Value

The output of the algorithm is an S3 object including:

runtime

The run time of the algorithm. A numeric vector. The first element is the user time, the second element is the system time and the third element is the elapsed time.

info

A matrix with the variables and their latest test statistics and logged p-values.

mat

A matrix with the selected variables and their latest test statistic and logged p-value.

ci_test

The conditional independence test used.

final

The final regression model.
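
Because info and mat report logged p-values, they are compared with the logarithm of the threshold rather than with the threshold itself. The sketch below assumes the natural logarithm is used (an assumption, not stated in this page) and works on a hypothetical vector log_p rather than on real output.

```r
threshold <- 0.05

# Hypothetical logged p-values for three variables (assumed natural log)
log_p <- c(-12.3, -0.4, -3.5)

# A variable is significant when its logged p-value is below log(threshold)
significant <- log_p < log(threshold)

# Back-transform to ordinary p-values when needed
p <- exp(log_p)
```

Working on the log scale avoids numerical underflow for very small p-values.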

Author(s)

Michail Tsagris

R implementation and documentation: Michail Tsagris <mtsagris@uoc.gr>

See Also

fs.reg, lm.fsreg, bic.fsreg, bic.glm.fsreg, CondIndTests, MMPC, SES

Examples

library(MXM)
set.seed(123)

# simulate a dataset with continuous data (200 observations, 10 variables)
dataset <- matrix( runif(200 * 10, 1, 100), ncol = 10 )

# simulated count target
target <- rpois(200, 10)
a <- glm.bsreg(target, dataset, threshold = 0.05)

# simulated binary target
target <- rbinom(200, 1, 0.6)
b <- glm.bsreg(target, dataset, threshold = 0.05)

# simulated positive continuous target, with the test given explicitly
target <- rgamma(200, 1, 2)
b1 <- glm.bsreg2(target, dataset, threshold = 0.05, test = "testIndGamma")
b2 <- glm.bsreg2(target, dataset, threshold = 0.05, test = "testIndNormLog")

MXM documentation built on Aug. 25, 2022, 9:05 a.m.