generate_classification_data: Generate classification data.
In classifly: Explore Classification Models in High Dimensions

generate_classification_data

R Documentation

Description

Given a model, this function generates points within the range of the data, classifies them, and attempts to locate class boundaries by looking at the advantage: the margin by which the most likely class beats the next most likely.

Usage

generate_classification_data(model, data, n, method, advantage)

Arguments

`model`	classification model
`data`	data set used in model
`n`	number of points to generate
`method`	method to use, currently either grid (an evenly spaced grid), random (uniform random distribution across the cube), or nonaligned (grid plus some random perturbation)
`advantage`	if `TRUE`, compute advantage, otherwise don't
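
The three values of `method` can be illustrated with a minimal base R sketch. This is not the package's implementation: `generate_points()` is a hypothetical helper, it works in the unit square only, and it assumes that nonaligned means jittering each grid point within its own cell.

```r
# Hypothetical helper, not the package's code: generate about n points in the
# unit square [0, 1]^2 using one of the three strategies named above.
generate_points <- function(n, method = c("grid", "random", "nonaligned")) {
  method <- match.arg(method)
  if (method == "random") {
    # Independent uniform draws in each dimension
    return(data.frame(x = runif(n), y = runif(n)))
  }
  # Points per side, so the grid holds at least n points
  k <- ceiling(sqrt(n))
  # Cell centres of a k x k grid
  centres <- (seq_len(k) - 0.5) / k
  pts <- expand.grid(x = centres, y = centres)
  if (method == "nonaligned") {
    # Assumed perturbation: jitter each point by up to half a cell width
    pts$x <- pts$x + runif(nrow(pts), -0.5, 0.5) / k
    pts$y <- pts$y + runif(nrow(pts), -0.5, 0.5) / k
  }
  pts
}

set.seed(1)
grid_pts <- generate_points(100, "grid")
jitter_pts <- generate_points(100, "nonaligned")
random_pts <- generate_points(100, "random")
```

To cover the range of a real data set, each column would be rescaled from [0, 1] to that variable's minimum and maximum. Note that the grid-based variants in this sketch return `ceiling(sqrt(n))^2` points, which can exceed `n`.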

Details

If posterior probabilities of classification are available, the advantage is calculated directly with advantage(). If not, class::knn() is used to estimate the advantage from the number of neighbouring points that share the same classification. Because knn is O(n^2), this method is rather slow for large data sets (more than about 20,000 points).
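
Both routes can be sketched as follows. This is an illustration under assumptions, not the package's code: `advantage_from_posterior()` is a hypothetical stand-in for advantage(), and the choice of `k = 5` and the use of the `prob` attribute returned by class::knn() are assumptions about how a neighbour-based estimate might be obtained.

```r
# Hypothetical stand-in for advantage(): the gap between the largest and
# second-largest posterior probability in each row of a matrix.
advantage_from_posterior <- function(post) {
  apply(post, 1, function(p) {
    top <- sort(p, decreasing = TRUE)
    top[[1]] - top[[2]]
  })
}

post <- rbind(c(0.90, 0.05, 0.05),   # confident prediction, large advantage
              c(0.40, 0.35, 0.25))   # near a boundary, small advantage
adv <- advantage_from_posterior(post)

# Fallback when no posterior is available: classify each generated point by
# its neighbours and use the share of votes won by the majority class.
set.seed(1)
pts <- data.frame(x = runif(200), y = runif(200))
cls <- factor(ifelse(pts$x > 0.5, "a", "b"))   # toy labels for illustration
nn <- class::knn(train = pts, test = pts, cl = cls, k = 5, prob = TRUE)
knn_share <- attr(nn, "prob")   # proportion of votes for the winning class
```

Points deep inside a region have a vote share near 1, while points near the `x = 0.5` boundary have lower shares. Since every test point is compared against every training point, the cost grows quadratically when both sets are the generated points.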

By default, boundary points are identified as those below the 5th percentile of advantage.
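
The percentile rule can be sketched in base R. The advantage values here are simulated for illustration, and the variable names are hypothetical rather than taken from the package.

```r
set.seed(1)
# Simulated advantage values, one per generated point (illustration only)
adv_values <- runif(1000)

# 5th percentile of advantage across all generated points
cutoff <- quantile(adv_values, 0.05)

# Points strictly below the cutoff are flagged as boundary points
is_boundary <- adv_values < cutoff
boundary_values <- adv_values[is_boundary]
```

With this rule, roughly 5% of the generated points are flagged regardless of how sharp the boundaries actually are, so the cutoff is relative rather than an absolute threshold on advantage.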