View source: R/PrInDTAllparts.R
PrInDTAllparts | R Documentation |
Conditional inference trees (ctrees) are built on the full sample of the smaller class and on consecutive parts of the larger class of the nesting variable 'nesvar'.
The variable 'nesvar' must be a column of the data frame 'datain'.
Interpretability is checked (see 'ctestv'); a probability threshold can be specified (see 'thres').
The parameters 'conf.level', 'minsplit', and 'minbucket' can be used to control the size of the trees.
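The idea of combining the full smaller class with consecutive parts of the larger class can be sketched in base R as follows. This is only one reading of the description above, not the package's implementation: the toy data frame, the class labels, and the way levels of the nesting variable are grouped are all assumptions made for illustration.

```r
# Toy data (hypothetical): 'class' is the class factor, 'SPEAKER' the nesting variable
toy <- data.frame(
  class   = factor(c(rep("small", 4), rep("large", 12))),
  SPEAKER = c(rep(c("s1", "s2"), each = 2), rep(paste0("l", 1:6), each = 2))
)
divt <- 3  # number of consecutive parts

# Identify the larger class and keep all rows of the smaller class
counts       <- table(toy$class)
larger       <- names(counts)[which.max(counts)]
smaller_rows <- toy[toy$class != larger, ]

# Levels of the nesting variable occurring in the larger class, in order of appearance
lev <- unique(toy$SPEAKER[toy$class == larger])

# Assumption: levels are assigned to 'divt' consecutive groups of (nearly) equal size
grp <- ceiling(seq_along(lev) * divt / length(lev))

# Each part = full smaller class + one consecutive group of the larger class
parts <- lapply(seq_len(divt), function(i) {
  rbind(smaller_rows, toy[toy$SPEAKER %in% lev[grp == i], ])
})
sapply(parts, nrow)
```

A tree would then be fitted to each element of 'parts'; the sketch stops before model fitting so that it runs without any add-on package.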
Reference
Weihs, C., Buschfeld, S. 2021b. NesPrInDT: Nested undersampling in PrInDT. arXiv:2103.14931
PrInDTAllparts(datain, classname, ctestv = NA, conf.level = 0.95, thres = 0.5,
               nesvar, divt, minsplit = NA, minbucket = NA)
datain | Input data frame with class factor variable 'classname' and the |
classname | Name of class variable (character) |
ctestv | Vector of character strings of forbidden split results; |
conf.level | (1 - significance level) in function |
thres | Probability threshold for prediction of smaller class (numerical, >= 0 and < 1); default = 0.5 |
nesvar | Name of nesting variable (character) |
divt | Number of parts of nesting variable 'nesvar' for which models should be determined individually |
minsplit | Minimum number of elements in a node to be split; |
minbucket | Minimum number of elements in a node; |
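The effect of 'thres' can be illustrated with a small base R sketch. The probabilities and class labels are made up, and the strict inequality is an assumption; how the package treats a probability exactly equal to the threshold is not stated on this page.

```r
# Hypothetical predicted probabilities of the smaller class for five observations
p_small <- c(0.10, 0.35, 0.50, 0.62, 0.90)

predict_class <- function(p, thres = 0.5) {
  # Assumption: the smaller class is predicted when its probability exceeds 'thres'
  ifelse(p > thres, "small", "large")
}

predict_class(p_small)               # default threshold 0.5
predict_class(p_small, thres = 0.3)  # a lower threshold favours the smaller class
```

Lowering the threshold trades more predictions of the smaller class against more false alarms in the larger class.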
Standard output can be produced by means of print(name) or just name, where 'name' is the output data frame of the function.
balanced accuracy of tree on full sample
name of nesting variable
number of consecutive parts of the sample
balanced accuracy of trees on 'divt' consecutive parts of the sample
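Balanced accuracy is the mean of the per-class recall values, so each class counts equally regardless of its size. The helper below is a minimal base R sketch with made-up labels; the function name 'balanced_accuracy' is hypothetical and the package's own computation may differ in details.

```r
# Balanced accuracy: mean over classes of (correctly predicted / observed) per class
balanced_accuracy <- function(truth, pred) {
  truth <- factor(truth)
  pred  <- factor(pred, levels = levels(truth))
  tab   <- table(truth, pred)      # rows: observed class, columns: predicted class
  mean(diag(tab) / rowSums(tab))   # average of the per-class recall values
}

# Hypothetical observed and predicted classes for 16 observations
truth <- c(rep("small", 4), rep("large", 12))
pred  <- c("small", "small", "small", "large", rep("large", 9), rep("small", 3))

balanced_accuracy(truth, pred)
```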
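library(PrInDT)
# Load the example data shipped with the package and drop incomplete rows
data <- PrInDT::data_speaker
data <- na.omit(data)
# Name of the nesting variable in 'data'
nesvar <- "SPEAKER"
# Trees on the full sample and on divt = 8 consecutive parts
outNesAll <- PrInDTAllparts(data, "class", ctestv = NA, conf.level = 0.95, thres = 0.5, nesvar, divt = 8)
outNesAll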