mlogreg: Multinomial Logic Regression

In logicFS: Identification of SNP Interactions

Description

Performs a multinomial logic regression for a nominal response by fitting a logic regression model (with the logit as link function) for each level of the response except the level with the smallest value, which is used as the reference category.
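The reference-category idea can be illustrated in base R without fitting any model. This is only a sketch: the response `y` and the helper list `binary_tasks` are hypothetical, and restricting each fit to the observations from the reference level and one other level is an assumption about how such a baseline-category setup is typically arranged, not a statement about the internals of `mlogreg`.

```r
# Hypothetical nominal response with three levels.
y <- factor(c("a", "b", "c", "a", "c", "b", "a"))

lev <- sort(levels(y))
ref <- lev[1]  # the level with the smallest value serves as reference

# For every non-reference level, keep the observations that belong either to
# that level or to the reference, and code the response as 0 (reference) or
# 1 (non-reference level). Each entry describes one binary logit problem.
binary_tasks <- lapply(lev[-1], function(l) {
  keep <- y %in% c(ref, l)
  list(level = l,
       rows = which(keep),
       response = as.integer(y[keep] == l))
})
names(binary_tasks) <- lev[-1]

# One binary problem per non-reference level.
length(binary_tasks)
```

With three response levels this yields two binary problems, which matches the description of one logic regression model per level other than the reference category.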

Usage

```r
## S3 method for class 'formula'
mlogreg(formula, data, recdom = TRUE, ...)

## Default S3 method:
mlogreg(x, y, ntrees = 1, nleaves = 8, anneal.control = logreg.anneal.control(),
    select = 1, rand = NA, ...)
```
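A call to the default method might look like the following sketch. The data are simulated noise and the object names (`x`, `y`, `fit`) are made up for illustration; only the argument names come from the usage above. The fit is wrapped in a check so that the snippet still runs when `logicFS` is not installed.

```r
set.seed(1)
n <- 200

# Binary predictor matrix: one row per observation, one column per variable.
x <- matrix(rbinom(n * 10, size = 1, prob = 0.5), nrow = n,
            dimnames = list(NULL, paste0("SNP", 1:10)))

# Nominal response with three levels (purely random, so no real signal).
y <- sample(c("A", "B", "C"), n, replace = TRUE)

# Fit only if the package is available.
if (requireNamespace("logicFS", quietly = TRUE)) {
  library(logicFS)
  fit <- mlogreg(x, y, ntrees = 1, nleaves = 8, select = 1, rand = 123)
}
```

Setting `rand` makes the simulated annealing search reproducible across runs.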

Arguments

`formula` an object of class `formula` describing the model that should be fitted.

`data` a data frame containing the variables in the model. Each row must correspond to an observation. Except for the column comprising the response, each column of `data` must correspond to a binary variable (coded by 0 and 1) or a factor (for details on factors, see `recdom`). The response must be a categorical variable with fewer than 10 levels. It can be either a factor or of type `numeric` or `character`.

`recdom` a logical value or a logical vector of length `ncol(data)` specifying whether a SNP should be transformed into two binary dummy variables coding for a recessive and a dominant effect. If `TRUE` (logical value), then all factors (variables) with three levels will be coded by two dummy variables as described in `make.snp.dummy`. Each level of each of the other factors (including factors specifying a SNP that shows only two genotypes) is coded by one indicator variable. If `FALSE` (logical value), each level of each factor is coded by an indicator variable. If `recdom` is a logical vector, all factors corresponding to an entry of `recdom` that is `TRUE` are assumed to be SNPs and are transformed into the two binary variables described above. Each variable that corresponds to an entry of `recdom` that is `TRUE` (no matter whether `recdom` is a vector or a single value) must be coded by the integers 1 (coding for the homozygous reference genotype), 2 (heterozygous), and 3 (homozygous variant).

`x` a matrix consisting of 0's and 1's. Each column must correspond to a binary variable and each row to an observation.

`y` either a factor or a numeric or character vector specifying the values of the response. The length of `y` must be equal to the number of rows of `x`.

`ntrees` an integer indicating how many trees should be used in the logic regression models. For details, see `logreg` in the `LogicReg` package.

`nleaves` a numeric value specifying the maximum number of leaves used in all trees combined. See the help page of the function `logreg` in the `LogicReg` package for details.

`anneal.control` a list containing the parameters for simulated annealing. For details, see the help page of `logreg.anneal.control` in the `LogicReg` package.

`select` a numeric value. Either 0 for a stepwise greedy selection (corresponds to `select = 6` in `logreg`) or 1 for simulated annealing.

`rand` a numeric value. If specified, the random number generator will be set into a reproducible state.

`...` for the `formula` method, optional parameters to be passed to the low-level function `mlogreg.default`. Otherwise, ignored.
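The recessive/dominant coding referred to under `recdom` can be sketched in base R. This is an illustration of the idea only, not the code of `make.snp.dummy`: the vector `snp`, the matrix `snp_dummies`, and its column names and order are hypothetical. It assumes the usual convention that a dominant effect means at least one variant allele (genotype 2 or 3) and a recessive effect means two variant alleles (genotype 3).

```r
# Hypothetical SNP coded as required above:
# 1 = homozygous reference, 2 = heterozygous, 3 = homozygous variant.
snp <- c(1, 2, 3, 2, 1, 3)

# Two binary dummy variables per SNP.
snp_dummies <- cbind(
  dominant  = as.integer(snp >= 2),  # at least one variant allele
  recessive = as.integer(snp == 3)   # two variant alleles
)

snp_dummies
```

Each of the three genotypes maps to a distinct pair of 0/1 values, so no information about the SNP is lost by this coding.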

Value

An object of class `mlogreg` composed of

`model` a list containing the logic regression models,

`data` a matrix containing the binary predictors,

`cl` a vector comprising the class labels,

`ntrees` a numeric value naming the maximum number of trees used in the logic regressions,

`nleaves` a numeric value comprising the maximum number of leaves used in the logic regressions,

`fast` a logical value specifying whether the faster search algorithm, i.e. the greedy search, has been used.

Author(s)

Holger Schwender, holger.schwender@hhu.de

References

Schwender, H., Ruczinski, I., Ickstadt, K. (2011). Testing SNPs and Sets of SNPs for Importance in Association Studies. Biostatistics, 12, 18-32.

See Also

`predict.mlogreg`, `logic.bagging`, `logicFS`