Description Usage Arguments Details Value
A common use-case for this is when you have imbalanced classes in your training data for a classifier.
1 |
data |
A tbl or data.frame. Internally, this function uses dplyr verbs, so it will work for local tables and remote tables in the warehouse. |
var |
Unquoted variable name to use for balancing. |
Although you might most commonly use this function for binary outcomes, it will also work if 'var' has more than two values. In that case, the subset for each value of 'var' will be sampled down to match the number of rows in the least common value of 'var'.
Note, however, that the set of unique values of 'var' is pulled into local memory when working with remote tbls. As such, you probably shouldn't try to balance by a categorical variable with many, many values.
'data' balanced by 'var'
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.