View source: R/logic.bagging.R
A bagging and subsampling version of logic regression. Currently available for the classification, the linear regression, and the logistic regression approach of logreg. Additionally, an approach based on multinomial logistic regressions as implemented in mlogreg can be used if the response is categorical.
## Default S3 method:
logic.bagging(x, y, B = 100, useN = TRUE, ntrees = 1, nleaves = 8,
glm.if.1tree = FALSE, replace = TRUE, sub.frac = 0.632,
anneal.control = logreg.anneal.control(), oob = TRUE,
onlyRemove = FALSE, prob.case = 0.5, importance = TRUE,
score = c("DPO", "Conc", "Brier", "PL"), addMatImp = FALSE, fast = FALSE,
neighbor = NULL, adjusted = FALSE, ensemble = FALSE, rand = NULL, ...)
## S3 method for class 'formula'
logic.bagging(formula, data, recdom = TRUE, ...)
x |
a matrix consisting of 0's and 1's. Each column must correspond to a binary variable and each row to an observation. Missing values are not allowed. |
y |
a numeric vector, a factor, or a vector of class |
B |
an integer specifying the number of iterations. |
useN |
logical specifying if the number of correctly classified out-of-bag observations should
be used in the computation of the importance measure. If |
ntrees |
an integer indicating how many trees should be used. For a binary response: If For a continuous response: A linear regression model with For a categorical response: n.lev-1 logic regression models with For a response of class |
nleaves |
a numeric value specifying the maximum number of leaves used
in all trees combined. See the help page of the function |
glm.if.1tree |
if |
replace |
should sampling of the cases be done with replacement? If
|
sub.frac |
a proportion specifying the fraction of the observations that
are used in each iteration to build a classification rule if |
anneal.control |
a list containing the parameters for simulated annealing.
See the help page of |
oob |
should the out-of-bag error rate (classification and logistic regression) or the out-of-bag root mean square prediction error (linear regression), respectively, be computed? |
onlyRemove |
should the multiple tree measure be used in the single tree case? If |
prob.case |
a numeric value between 0 and 1. If the outcome of the
logistic regression, i.e. the class probability, for an observation is
larger than |
importance |
should the measure of importance be computed? |
score |
a character string naming the score that should be used in the computation of the importance measure for a survival time analysis. By default, the distance between predicted outcomes ( |
addMatImp |
should the matrix containing the improvements due to the prime implicants
in each of the iterations be added to the output? (For each of the prime implicants,
the importance is computed by the average over the |
fast |
should a greedy search (as implemented in |
neighbor |
a list consisting of character vectors specifying SNPs that are in LD. If specified, every SNP must occur exactly once in this list, and the importance measures are adjusted for LD by considering the SNPs within an LD block as exchangeable. |
adjusted |
logical specifying whether the measures should be adjusted for noise. Often, the interaction actually associated with the response is not found exactly in some iterations of logic bagging; instead, an interaction is identified that additionally contains one (or, rarely, more) noise SNPs. If |
ensemble |
in the case of a survival outcome, should |
rand |
numeric value. If specified, the random number generator will be set to a reproducible state. |
formula |
an object of class |
data |
a data frame containing the variables in the model. Each row of |
recdom |
a logical value or vector of length |
... |
for the |
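The roles of replace and sub.frac described above can be illustrated with a minimal base R sketch. It assumes that replace = TRUE means a bootstrap sample of all cases and that sub.frac only applies when replace = FALSE; draw_inbag() is a hypothetical helper and not part of logicFS, and rounding the subsample size up is an assumption of this sketch.

```r
# Hypothetical helper for illustration only; not the logicFS implementation.
draw_inbag <- function(n, replace = TRUE, sub.frac = 0.632) {
  if (replace) {
    # Bagging: bootstrap sample of size n drawn with replacement.
    inbag <- sample(n, n, replace = TRUE)
  } else {
    # Subsampling: a fraction sub.frac of the cases, drawn without replacement.
    # Rounding up is an assumption of this sketch.
    inbag <- sample(n, ceiling(sub.frac * n), replace = FALSE)
  }
  # Cases that were never drawn are out-of-bag in this iteration.
  list(inbag = inbag, oob = setdiff(seq_len(n), inbag))
}

set.seed(123)
boot <- draw_inbag(200, replace = TRUE)
sub <- draw_inbag(200, replace = FALSE, sub.frac = 0.632)

# Share of distinct cases in the bootstrap sample and size of the subsample.
length(unique(boot$inbag)) / 200
length(sub$inbag)
```

On average, a bootstrap sample contains about 63.2% of the distinct cases, which is presumably why 0.632 is the default for sub.frac: both schemes then leave a similar share of cases out-of-bag.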
logic.bagging
returns an object of class logicBagg
containing
logreg.model |
a list containing the |
inbagg |
a list specifying the |
vim |
an object of class |
oob.error |
the out-of-bag error (if |
... |
further parameters of the logic regression. |
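How prob.case and an out-of-bag error could interact is sketched below in base R. The probabilities are made-up toy values, and the aggregation rule (a majority vote over the iterations in which an observation was out-of-bag) is an assumption of this sketch, not a statement about how logicFS computes oob.error internally.

```r
# Toy class probabilities: rows = observations, columns = iterations.
# NA marks iterations in which the observation was in-bag (hypothetical data).
prob <- matrix(c(0.9,  NA, 0.7,
                  NA, 0.2, 0.4,
                 0.6, 0.7,  NA,
                 0.6,  NA, 0.8), nrow = 4, byrow = TRUE)
y <- c(1, 0, 1, 0)   # true classes (1 = case, 0 = control)
prob.case <- 0.5

# An observation is voted a case if its class probability exceeds prob.case.
votes <- prob > prob.case

# Assumed aggregation: majority vote over the out-of-bag iterations only.
frac_case <- rowMeans(votes, na.rm = TRUE)
oob_pred <- as.integer(frac_case > 0.5)

# Fraction of observations whose out-of-bag prediction is wrong.
oob_error <- mean(oob_pred != y)
oob_error
```

Raising prob.case makes the rule more conservative about calling an observation a case, which can be useful when the two classes are unbalanced.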
Holger Schwender, holger.schwender@hhu.de; Tobias Tietz, tobias.tietz@hhu.de
Ruczinski, I., Kooperberg, C., LeBlanc, M.L. (2003). Logic Regression. Journal of Computational and Graphical Statistics, 12, 475-511.
Schwender, H., Ickstadt, K. (2007). Identification of SNP Interactions Using Logic Regression. Biostatistics, 9(1), 187-198.
Tietz, T., Selinski, S., Golka, K., Hengstler, J.G., Gripp, S., Ickstadt, K., Ruczinski, I., Schwender, H. (2018). Identification of Interactions of Binary Variables Associated with Survival Time Using survivalFS. Submitted.
predict.logicBagg, plot.logicBagg, logicFS
## Not run:
# Load data.
data(data.logicfs)
# For logic regression and hence logic.bagging, the variables must
# be binary. data.logicfs, however, contains categorical data
# with realizations 1, 2 and 3. Such data can be transformed
# into binary data by
bin.snps<-make.snp.dummy(data.logicfs)
# To speed up the search for the best logic regression models
# only a small number of iterations is used in simulated annealing.
my.anneal<-logreg.anneal.control(start=2,end=-2,iter=10000)
# Bagged logic regression is then performed by
bagg.out<-logic.bagging(bin.snps,cl.logicfs,B=20,nleaves=10,
rand=123,anneal.control=my.anneal)
# The output of logic.bagging can be printed
bagg.out
# By default, also the importances of the interactions are
# computed
bagg.out$vim
# and can be plotted.
plot(bagg.out)
# The original variable names are displayed in
plot(bagg.out,coded=FALSE)
# New observations (here we assume that these observations are
# in bin.snps, i.e. coded in the same binary form as the data
# used for fitting) are assigned to one of the classes by
predict(bagg.out,bin.snps)
## End(Not run)
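The example above converts genotypes coded 1, 2 and 3 into binary variables before fitting. The base R sketch below shows one way such a conversion can work, assuming a dominant/recessive coding with two dummy variables per SNP. snp_to_dummies() is a hypothetical helper written for illustration; it is not make.snp.dummy(), and its column naming is an assumption.

```r
# Hypothetical helper for illustration only; not make.snp.dummy() from logicFS.
snp_to_dummies <- function(geno) {
  geno <- as.matrix(geno)
  stopifnot(all(geno %in% 1:3), !is.null(colnames(geno)))
  # Dominant coding: 1 if at least one variant allele (genotype 2 or 3).
  dominant <- (geno >= 2) * 1L
  # Recessive coding: 1 only for the homozygous variant genotype (3).
  recessive <- (geno == 3) * 1L
  colnames(dominant) <- paste0(colnames(geno), "_1")
  colnames(recessive) <- paste0(colnames(geno), "_2")
  # Interleave the columns so both dummies of a SNP sit next to each other.
  out <- cbind(dominant, recessive)
  out[, order(rep(seq_len(ncol(geno)), 2)), drop = FALSE]
}

# Toy genotype matrix: three observations, two SNPs.
geno <- cbind(SNP1 = c(1, 2, 3), SNP2 = c(3, 1, 2))
snp_to_dummies(geno)
```

Each three-level variable becomes two 0/1 columns, so the resulting matrix satisfies the requirement that x consists only of 0's and 1's.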