Using the given formula and data, the method builds an RBF network and extracts its properties, thereby preparing a data generator which can be used with the newdata.RBFgenerator method to generate semi-artificial data.
rbfDataGen(formula, data, eps=1e-4, minSupport=1,
           nominal=c("encodeBinary","asInteger"))
formula | A formula specifying the response and variables to be modeled.
data | A data frame with training data.
eps | The minimal probability that the data generator treats as larger than 0.
minSupport | The minimal number of instances that must define a Gaussian kernel for the kernel to be copied to the data generator.
nominal | The way how to treat nominal features. The option
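The description of nominal is cut off above, so the following base R sketch only illustrates what the two option names suggest. It is an assumption based on the names "asInteger" and "encodeBinary", not a statement of what the package does internally; the factor colour is hypothetical.

```r
# Hypothetical nominal feature; not taken from the package.
colour <- factor(c("red", "green", "blue", "green"))

# Assumed reading of "asInteger": replace each level by its integer code.
asInt <- as.integer(colour)

# Assumed reading of "encodeBinary": one 0/1 indicator column per level.
asBinary <- model.matrix(~ colour - 1)

print(asInt)
print(asBinary)
```

With integer codes the feature occupies a single numeric column, while the binary encoding uses one column per level and avoids implying an order among the levels.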
The parameter formula is used as a mechanism to select features (attributes) and the prediction variable (response, class). Only simple terms can be used; interaction terms are not supported. The simplest way is to specify just the response variable, e.g., class ~ . (the dot selects all remaining columns as features). See the examples below.
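To make the selection mechanism concrete, the sketch below uses base R's model.frame to show which columns a formula picks from a data frame. The helper selected is hypothetical and only illustrates how R formulas behave in general; it is not necessarily how rbfDataGen processes its formula.

```r
# Hypothetical helper: report the response and features a formula selects.
selected <- function(formula, data) {
  mf <- model.frame(formula, data = data)
  # model.frame puts the response first, followed by the selected features
  list(response = names(mf)[1], features = names(mf)[-1])
}

# "Species ~ ." selects every other column as a feature
all_features <- selected(Species ~ ., iris)

# simple terms select a subset of features
two_features <- selected(Species ~ Sepal.Length + Petal.Width, iris)

print(all_features)
print(two_features)
```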
An RBF network is built using rbfDDA from the RSNNS package. The learned Gaussian kernels are extracted and used in data generation with the newdata.RBFgenerator method.
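The idea of generating data from extracted Gaussian kernels can be sketched in base R as follows. This is a simplified illustration under stated assumptions, not the algorithm used by newdata.RBFgenerator: the kernel values are made up, the function sampleFromKernels is hypothetical, and attributes are assumed to be drawn independently from normal distributions around the kernel center.

```r
set.seed(1)

# Hypothetical kernels for two attributes (values invented for illustration).
centers   <- rbind(c(0.2, 0.3), c(0.7, 0.8))     # one row per kernel
spread    <- rbind(c(0.01, 0.02), c(0.03, 0.01)) # per-attribute variances
probs     <- c(0.4, 0.6)                         # kernel probabilities
unitClass <- c("a", "b")                         # class of each kernel

sampleFromKernels <- function(size) {
  # 1. choose a kernel for every new instance according to probs
  k <- sample(seq_along(probs), size = size, replace = TRUE, prob = probs)
  # 2. draw each attribute independently around the chosen kernel's center
  mu <- as.vector(centers[k, , drop = FALSE])
  s  <- as.vector(sqrt(spread[k, , drop = FALSE]))
  x  <- matrix(rnorm(size * ncol(centers), mean = mu, sd = s), nrow = size)
  # 3. label each instance with the class of its kernel
  data.frame(x, class = factor(unitClass[k], levels = unitClass))
}

generated <- sampleFromKernels(500)
summary(generated)
```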
The created model is returned as a structure of class RBFgenerator, containing the following items:
noGaussians | The number of extracted Gaussian kernels.
centers | A matrix of Gaussian kernels' centers, with one row for each Gaussian kernel.
probs | A vector of kernel probabilities. Probabilities are defined as relative frequencies of training set instances with maximal activation in the given kernel.
unitClass | A vector of class values, one for each kernel.
bias | A vector of kernels' biases, one for each kernel. The bias is multiplied by the kernel activation to produce the output value of the given RBF network unit.
spread | A matrix of estimated variances for the kernels, one row for each kernel. The j-th value in the i-th row is the variance of the j-th attribute over the training instances with maximal activation in the i-th Gaussian.
gNoActivated | A vector containing the number of training instances with maximal activation in each kernel.
noAttr | The number of attributes in the training data.
datNames | A vector of attributes' names.
originalNames | A vector of original attribute names.
attrClasses | A vector of attributes' classes (i.e., data types like
attrLevels | A list of levels for discrete attributes (with class
attrOrdered | A vector of type logical indicating whether the attribute is
normParameters | A list of parameters for normalization of attributes to [0,1].
noCol | The number of columns in the internally generated data set.
isDiscrete | A vector of type logical, each value indicating whether the respective attribute is discrete.
noAttrGen | The number of attributes to generate.
nominal | The value of parameter
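The definitions of gNoActivated, probs, and spread above can be illustrated with a small base R sketch. The training matrix X and the activation matrix are hypothetical, and the code is only one way to compute statistics matching those definitions; it is not taken from the package. The final filter shows one plausible reading of the minSupport argument.

```r
# Hypothetical inputs: 6 training instances, 2 attributes, 2 kernels.
X <- rbind(c(0.1, 0.2), c(0.2, 0.1), c(0.3, 0.3),
           c(0.8, 0.9), c(0.9, 0.7), c(0.7, 0.8))
activation <- cbind(c(0.9, 0.8, 0.7, 0.1, 0.2, 0.1),  # kernel 1
                    c(0.1, 0.2, 0.3, 0.9, 0.8, 0.9))  # kernel 2

# kernel with maximal activation for every training instance
winner <- max.col(activation, ties.method = "first")

noGaussians  <- ncol(activation)
gNoActivated <- tabulate(winner, nbins = noGaussians)
probs        <- gNoActivated / sum(gNoActivated)  # relative frequencies

# spread[i, j]: variance of attribute j over instances won by kernel i
spread <- t(sapply(seq_len(noGaussians), function(i)
  apply(X[winner == i, , drop = FALSE], 2, var)))

# assumed reading of minSupport: keep kernels with enough supporting instances
minSupport <- 1
keep <- gNoActivated >= minSupport

print(gNoActivated)
print(probs)
print(spread)
```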
Marko Robnik-Sikonja
Marko Robnik-Sikonja: Not enough data? Generate it!. Technical Report, University of Ljubljana, Faculty of Computer and Information Science, 2014
Other references are available from http://lkm.fri.uni-lj.si/rmarko/papers/
# rbfDataGen and newdata are provided by the semiArtificial package
library(semiArtificial)

# use iris data set, split into training and testing, inspect the data
set.seed(12345)
train <- sample(1:nrow(iris), size = nrow(iris) * 0.5)
irisTrain <- iris[train, ]
irisTest <- iris[-train, ]

# inspect properties of the original data
plot(irisTrain, col = irisTrain$Species)
summary(irisTrain)

# create rbf generator
irisGenerator <- rbfDataGen(Species ~ ., irisTrain)

# use the generator to create new data
irisNew <- newdata(irisGenerator, size = 200)

# inspect properties of the new data
plot(irisNew, col = irisNew$Species) # plot generated data
summary(irisNew)
Sepal.Length Sepal.Width Petal.Length Petal.Width Species
Min. :4.400 Min. :2.200 Min. :1.000 Min. :0.10 setosa :27
1st Qu.:5.100 1st Qu.:2.800 1st Qu.:1.600 1st Qu.:0.20 versicolor:23
Median :5.700 Median :3.100 Median :4.200 Median :1.30 virginica :25
Mean :5.863 Mean :3.099 Mean :3.721 Mean :1.18
3rd Qu.:6.550 3rd Qu.:3.400 3rd Qu.:5.300 3rd Qu.:1.80
Max. :7.900 Max. :4.200 Max. :6.400 Max. :2.50
Sepal.Length Sepal.Width Petal.Length Petal.Width
Min. :4.402 Min. :2.220 Min. :1.020 Min. :0.1038
1st Qu.:5.047 1st Qu.:2.755 1st Qu.:1.545 1st Qu.:0.2897
Median :5.823 Median :3.024 Median :4.214 Median :1.1556
Mean :5.856 Mean :3.032 Mean :3.689 Mean :1.1286
3rd Qu.:6.586 3rd Qu.:3.252 3rd Qu.:5.349 3rd Qu.:1.8957
Max. :7.716 Max. :4.192 Max. :6.396 Max. :2.4999
Species
setosa :72
versicolor:62
virginica :66
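As a quick check on the two summaries, the class proportions of the training and generated data can be compared directly. The short sketch below uses only the Species counts printed above; the variable names are chosen for illustration.

```r
# Species counts copied from the two summary() outputs above
trainCounts <- c(setosa = 27, versicolor = 23, virginica = 25)
newCounts   <- c(setosa = 72, versicolor = 62, virginica = 66)

# relative class frequencies in each data set
trainProp <- trainCounts / sum(trainCounts)
newProp   <- newCounts / sum(newCounts)

round(rbind(train = trainProp, generated = newProp), 3)
```

The proportions differ by less than one percentage point per class, which suggests the generator roughly preserves the class distribution of the training data in this run.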