# mksampsize: Generate sample size information for use with 'gensemble'

## Description

Translates the `sampsize` argument of `gensemble` into the form used internally.

## Usage

```r
mksampsize(Y, sampsize = NULL, proportion = FALSE)
```

## Arguments

| Argument | Description |
| --- | --- |
| `Y` | The response vector. |
| `sampsize` | The desired sample size(s). Can be `NULL`, a single value, a vector, or a list. See the Details section for more information. |
| `proportion` | A `logical` indicating whether the values in `sampsize` represent proportions. |

## Details

For regression, `sampsize` indicates how much of the underlying data should be used in the bagged model. It should be either `NULL` or a single value. If it is `NULL`, roughly 80% of the data is used.
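As a rough illustration of the regression case, the sketch below turns a `sampsize` argument into an absolute number of observations using base R only. The helper name `sketch_regression_sampsize`, the 0.8 default, and the use of `round()` are assumptions made for illustration; the actual internals and rounding behaviour of `mksampsize` may differ.

```r
# Hypothetical helper, not part of gensemble: illustrates converting a
# sampsize argument into an absolute sample size for a numeric response.
sketch_regression_sampsize <- function(Y, sampsize = NULL, proportion = FALSE) {
  n <- length(Y)
  if (is.null(sampsize)) {
    # Assumed default: use roughly 80% of the data.
    sampsize <- 0.8
    proportion <- TRUE
  }
  # Regression accepts a single numeric value only.
  stopifnot(is.numeric(sampsize), length(sampsize) == 1)
  if (proportion) {
    stopifnot(sampsize > 0, sampsize <= 1)
    # The rounding rule here is an assumption.
    return(as.integer(round(sampsize * n)))
  }
  # Absolute size: must not exceed the number of observations.
  stopifnot(sampsize >= 1, sampsize <= n)
  as.integer(sampsize)
}

Y <- trees[, 3]  # third column of the built-in trees data
sketch_regression_sampsize(Y)                          # default, about 80%
sketch_regression_sampsize(Y, 0.8, proportion = TRUE)  # same thing, explicit
sketch_regression_sampsize(Y, 10)                      # absolute size
```

In this sketch, passing `NULL` and passing `0.8` with `proportion = TRUE` give the same result, which matches how the two regression calls are described in the examples for this function.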

For classification, the internals of `gensemble` require a list of each class and the size of the sample from each class. If `sampsize` is `NULL`, this list is built using the levels present in `Y`, and roughly 80% of each class is sampled.
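The per-class list described above can be sketched in base R as follows. This is a minimal illustration, not the package's implementation: the helper name `sketch_class_sampsize`, its `prop` parameter, and the rounding rule are all assumptions.

```r
# Hypothetical helper, not part of gensemble: builds a named list with one
# sample size per class from a factor response.
sketch_class_sampsize <- function(Y, prop = 0.8) {
  stopifnot(is.factor(Y), is.numeric(prop), length(prop) == 1,
            prop > 0, prop <= 1)
  Y <- droplevels(Y)   # keep only the levels actually present in Y
  counts <- table(Y)   # number of observations in each class
  # The rounding rule here is an assumption.
  sizes <- as.integer(round(prop * counts))
  # One named list entry per class level.
  setNames(as.list(sizes), names(counts))
}

Y <- iris[, 5]  # Species factor from the built-in iris data
str(sketch_class_sampsize(Y))
```

The result has the same shape as a list supplied by hand, such as `list(setosa=20, versicolor=30, virginica=40)`: one named entry per class.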

## Value

If `Y` is a factor, the function returns a list of each class and the number of data points to sample for that class. Otherwise it returns a single value.

## Author(s)

Peter Werner <gensemble.r@gmail.com>

## See Also

`gensemble`

## Examples

```r
library(gensemble)

# regression
Y <- trees[, 3]
# use roughly 80% for each training iteration
mksampsize(Y)
# the same thing using proportion
mksampsize(Y, 0.8, TRUE)

# classification
Y <- iris[, 5]
# use roughly 80% of each class
mksampsize(Y)
# specify the size of each class in absolute terms
mksampsize(Y, list(setosa = 20, versicolor = 30, virginica = 40))
# use about 70% of each class
mksampsize(Y, 0.7, proportion = TRUE)
# specify the proportion for each class
mksampsize(Y, c(0.5, 0.6, 0.7), proportion = TRUE)
```