This translates the sampsize argument to gensemble into a form for internal use.
The response vector.
The desired sample size(s). Can be NULL, a single value, a vector or a list. See the details section for more information.
sampsize indicates how much of the underlying data should be used in the bagged model. For regression, it should either be NULL or a single value. If it is NULL, roughly 80% of the data will be used.
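A minimal sketch of the regression default described above. It assumes that "roughly 80%" means 0.8 times the number of observations, rounded down; the rounding rule and the helper name default_regression_sampsize are assumptions for illustration and are not part of gensemble.

```r
# Hypothetical helper, NOT part of gensemble: default sample size for a
# numeric response, assuming "roughly 80%" means floor(0.8 * n).
default_regression_sampsize <- function(Y, fraction = 0.8) {
  stopifnot(is.numeric(Y), fraction > 0, fraction <= 1)
  floor(fraction * length(Y))
}

# The built-in trees data has 31 rows, so the third column has 31 values.
Y <- trees[, 3]
n_sample <- default_regression_sampsize(Y)
print(n_sample)
```

The actual value returned by mksampsize(Y) may differ by one if the package rounds differently.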
For classification, the internals of gensemble require a list of each class and the size of the sample from each class. If sampsize is NULL, this list will be built using the levels present in Y, and roughly 80% of each class will be used.
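The per-class list can be pictured with the following sketch. It assumes the default takes 0.8 times each class count, rounded down, and returns a list named by the factor levels; the helper default_class_sampsize is hypothetical and only illustrates the shape of the result, not the package's actual code.

```r
# Hypothetical helper, NOT part of gensemble: build a named list of
# per-class sample sizes from a factor response.
default_class_sampsize <- function(Y, fraction = 0.8) {
  stopifnot(is.factor(Y), fraction > 0, fraction <= 1)
  counts <- table(Y)                            # observations per level
  sizes <- floor(fraction * as.vector(counts))  # assumed rounding rule
  setNames(as.list(sizes), names(counts))
}

# iris has three species with 50 observations each.
sizes <- default_class_sampsize(iris[, 5])
str(sizes)
```

With the built-in iris data this gives 40 for each of setosa, versicolor and virginica, which has the same shape as the explicit list passed to mksampsize in the examples.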
If Y is a factor, returns a list of each class and the number of data points to sample for that class. Otherwise, returns a single value.
Peter Werner <firstname.lastname@example.org>
#regression
Y <- trees[,3]
#use roughly 80% for each training iteration
mksampsize(Y)
#the same thing using proportion
mksampsize(Y, 0.8, TRUE)

#classification
Y <- iris[,5]
#use roughly 80% of each class
mksampsize(Y)
#specify the size of each class in absolute terms
mksampsize(Y, list(setosa=20, versicolor=30, virginica=40))
#use about 70% of each class
mksampsize(Y, 0.7, proportion=TRUE)
#specify the proportion for each class
mksampsize(Y, c(0.5, 0.6, 0.7), proportion=TRUE)