classifier.fit: Fit the supervised classifier under partition exchangeability
In PEkit: Partition Exchangeability Toolkit


View source: R/Classifier.R

Fits the model to training data x, which is assumed to follow the Poisson-Dirichlet distribution, and discrete class labels y.

classifier.fit(x, y)

`x`	data vector, or matrix with rows as data points and columns as features.
`y`	vector of training data labels, one per data point: its length equals the length of `x`, or the number of rows in `x` if `x` is a matrix.

This function learns the model parameters from the training data and gathers them into an object used by the classification algorithms tMarLab() and tSimLab(). The parameters are the maximum likelihood estimates of ψ for each feature within each class of the training data. The function also records the frequencies of the observed values for each feature within each class. Both are used to calculate the predictive probability that each test data point belongs to each of the classes.
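To make the two learned quantities concrete, here is a minimal base-R sketch for the one-dimensional case. It is not PEkit's implementation: `mle_psi_sketch()` and `fit_sketch()` are hypothetical names, and the sketch assumes ψ is estimated under the Ewens sampling formula, where the MLE solves sum over i = 1..n of ψ / (ψ + i - 1) = k, with n the sample size and k the number of distinct values.

```r
# Illustrative sketch only; function names are hypothetical, not PEkit API.

# MLE of psi under the Ewens sampling formula: the root of
#   sum_{i=1}^{n} psi / (psi + i - 1) - k = 0,
# where n is the sample size and k the number of distinct values.
mle_psi_sketch <- function(x) {
  n <- length(x)
  k <- length(unique(x))
  # The MLE is not finite when all values are identical or all distinct.
  if (k <= 1 || k >= n) stop("MLE of psi is not finite for this sample")
  score <- function(psi) sum(psi / (psi + seq_len(n) - 1)) - k
  uniroot(score, lower = 1e-8, upper = 1e8, tol = 1e-10)$root
}

# Split the data by class label, then record value frequencies and psi per class.
fit_sketch <- function(x, y) {
  stopifnot(length(x) == length(y))
  lapply(split(x, y), function(xc) {
    list(frequencies = table(xc), psi = mle_psi_sketch(xc))
  })
}

# Tiny hand-made data set: class "b" has more distinct values than class "a".
x <- c(1, 1, 1, 2, 2, 3, 1, 2, 3, 4, 5, 5)
y <- rep(c("a", "b"), each = 6)
fit <- fit_sketch(x, y)
fit$a$frequencies
fit$b$psi > fit$a$psi
```

A class with more distinct values for the same sample size gets a larger ψ estimate, which is what separates the classes in the examples further down. What classifier.fit() stores internally may differ in detail from this sketch.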

Returns an object that serves as the training data object for the classification algorithms tMarLab() and tSimLab().

If x is multidimensional, the list described below is returned once for each dimension (feature).

Returns a list of classwise lists, each with components:

frequencies: the frequencies of values in the class.

psi: the maximum likelihood estimate of ψ for the class.
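As a rough illustration of how these two components can feed a predictive probability, the sketch below applies the standard predictive rule of the Ewens (Poisson-Dirichlet) model: a value already seen n_j times in a class of size n has probability n_j / (n + ψ), and a previously unseen value has probability ψ / (n + ψ). The helper `predictive_prob_sketch()` and the hand-made object `fit_like` are hypothetical. Whether tMarLab() and tSimLab() compute exactly this is an assumption.

```r
# Illustrative sketch only; names below are hypothetical, not PEkit API.

# Predictive probability of one test value given a class's stored
# frequencies (named vector) and its psi estimate.
predictive_prob_sketch <- function(value, frequencies, psi) {
  n <- sum(frequencies)
  key <- as.character(value)
  if (key %in% names(frequencies)) {
    frequencies[[key]] / (n + psi)  # value already seen in this class
  } else {
    psi / (n + psi)                 # value not seen in this class before
  }
}

# Hand-made classwise list mirroring the documented components.
fit_like <- list(
  "1" = list(frequencies = c("1" = 5, "2" = 3, "3" = 2), psi = 1),
  "2" = list(frequencies = c("1" = 2, "7" = 2, "9" = 1), psi = 5)
)

# Probability of observing the test value 2 under each class.
probs <- sapply(fit_like, function(cl) {
  predictive_prob_sketch(2, cl$frequencies, cl$psi)
})
probs
names(which.max(probs))
```

Here class "1" has seen the value 2 three times out of ten (3/11), while class "2" has never seen it but has a large ψ relative to its size (5/10), so the unseen-value term wins.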

## Create training data x and its class labels y from Poisson-Dirichlet distributions
## with different psis:
library(PEkit)
set.seed(111)
x1 <- rPD(5000, 10)
x2 <- rPD(5000, 100)
x <- c(x1, x2)
y1 <- rep("1", 5000)
y2 <- rep("2", 5000)
y <- c(y1, y2)
fit <- classifier.fit(x, y)

## With multidimensional x (rows are data points, columns are features):
set.seed(111)
x1 <- cbind(rPD(5000, 10), rPD(5000, 50))
x2 <- cbind(rPD(5000, 100), rPD(5000, 500))
x <- rbind(x1, x2)
y1 <- rep("1", 5000)
y2 <- rep("2", 5000)
y <- c(y1, y2)
fit <- classifier.fit(x, y)