sslGmmEM: Gaussian Mixture Model with an EM Algorithm


View source: R/SSL.R

Description

sslGmmEM fits a Gaussian mixture model with an EM algorithm and weights the unlabeled data using the lambda-EM technique.

Usage

sslGmmEM(xl, yl, xu, seed = 0, improvement = 1e-04, p = 0.3)

Arguments

xl

an n * p matrix or data.frame of labeled data.

yl

an n * 1 integer vector of labels.

xu

an m * p matrix or data.frame of unlabeled data.

seed

an integer specifying the random number generation state for splitting the labeled data into a training set and a cross-validation set.

improvement

numeric. Minimal allowed improvement of parameters.

p

the proportion of the labeled data that is split off into the cross-validation set.
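
To make the roles of seed and p concrete, here is a minimal sketch of a seeded hold-out split. The helper splitLabeled is hypothetical and not part of the SSL package; the package may draw and round its split differently.

```r
# Hypothetical helper (not from the SSL package): split n labeled rows into
# a training set and a cross-validation set of roughly p * n rows.
splitLabeled <- function(n, p = 0.3, seed = 0) {
  set.seed(seed)                                   # makes the split reproducible
  cvIdx <- sort(sample.int(n, size = round(p * n)))
  list(train = setdiff(seq_len(n), cvIdx), cv = cvIdx)
}

# With 60 labeled rows and p = 0.3, 18 rows are held out and 42 remain.
s <- splitLabeled(60, p = 0.3, seed = 0)
```

Calling the helper again with the same seed returns the same indices, which is the reason for exposing a seed argument at all.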

Details

sslGmmEM brings unlabeled data into the parameter estimation process. The weight lambda given to the unlabeled data is chosen by cross-validation. The Gaussian mixture model is estimated by maximizing the log-likelihood with an EM algorithm: the E-step computes the probability of each class for every observation, and the M-step re-estimates the parameters from the probabilities obtained in the E-step.
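
The E-step and M-step described above can be sketched as a single iteration in R. This is an illustration under stated assumptions, not the package's implementation: it assumes one Gaussian per class with diagonal covariance (consistent with the per-variable means and standard deviations returned in para), keeps labeled observations fixed to their known class, and down-weights unlabeled observations by lambda. The function lambdaEmStep and all of its argument names are hypothetical.

```r
# Hypothetical sketch (not the SSL package's code): one lambda-weighted EM
# iteration for a diagonal-covariance Gaussian mixture with k classes.
# mu and sigma are k x p matrices; prior is a length-k vector.
lambdaEmStep <- function(xl, yl, xu, mu, sigma, prior, lambda) {
  xl <- as.matrix(xl)
  xu <- as.matrix(xu)
  k <- length(prior)
  p <- ncol(xu)

  # E-step: log joint density of every unlabeled row under every class.
  logJoint <- matrix(sapply(seq_len(k), function(j) {
    z <- (t(xu) - mu[j, ]) / sigma[j, ]          # p x m standardised values
    log(prior[j]) + colSums(-0.5 * z^2 - log(sigma[j, ])) -
      0.5 * p * log(2 * pi)
  }), nrow = nrow(xu))
  # Subtract the row maximum before exponentiating for numerical stability.
  logJoint <- logJoint - apply(logJoint, 1, max)
  respU <- exp(logJoint) / rowSums(exp(logJoint))

  # Labeled rows keep a fixed one-hot class membership.
  respL <- diag(k)[yl, , drop = FALSE]

  # M-step: weighted means, standard deviations, and class priors,
  # with unlabeled rows contributing lambda times their class probability.
  w <- rbind(respL, lambda * respU)
  x <- rbind(xl, xu)
  nk <- colSums(w)
  muNew <- (t(w) %*% x) / nk
  varNew <- (t(w) %*% x^2) / nk - muNew^2
  list(mu = muNew, sigma = sqrt(pmax(varNew, 1e-8)),
       prior = nk / sum(nk), classProb = respU)
}

# Usage on iris: initialise from the labeled rows, then take one step.
data(iris)
known <- c(1:20, 51:70, 101:120)
xl <- iris[known, -5]
yl <- rep(1:3, each = 20)
xu <- iris[-known, -5]
mu0 <- as.matrix(aggregate(xl, list(yl), mean)[, -1])
sigma0 <- as.matrix(aggregate(xl, list(yl), sd)[, -1])
fit <- lambdaEmStep(xl, yl, xu, mu0, sigma0, prior = rep(1 / 3, 3), lambda = 0.5)
```

Setting lambda to 0 reduces the M-step to estimates from the labeled data alone, while lambda = 1 treats each unlabeled observation as a full observation. A complete fit would repeat the step until the parameters change by less than some tolerance.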

Value

a list of values is returned:

Fields

para

a numeric matrix of estimated parameters in which the columns represent variables and the rows represent the estimated mean and standard deviation of each class. For example, the first and second rows hold the mean and standard deviation of the first class, the third and fourth rows hold the mean and standard deviation of the second class, and so on.

classProb

the estimated class probabilities

yu

the predicted label of unlabeled data

optLambda

the optimal lambda chosen by cross-validation
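
Because the para field interleaves mean and standard-deviation rows, it can be convenient to split it into two matrices. The sketch below uses a small stand-in matrix with made-up numbers laid out as described for para; the names classMeans and classSds are illustrative only.

```r
# Toy stand-in for the `para` field: two classes, two variables.
# Rows 1-2 are the mean and sd of class 1, rows 3-4 those of class 2.
# The numbers are invented for illustration.
para <- rbind(c(5.0, 3.4), c(0.35, 0.38),
              c(5.9, 2.8), c(0.52, 0.31))
colnames(para) <- c("Sepal.Length", "Sepal.Width")

meanRows <- seq(1, nrow(para), by = 2)             # odd rows hold the means
classMeans <- para[meanRows, , drop = FALSE]
classSds <- para[meanRows + 1, , drop = FALSE]     # even rows hold the sds
rownames(classMeans) <- rownames(classSds) <-
  paste0("class", seq_along(meanRows))
```

The same indexing should apply to the para element of a real sslGmmEM result, assuming it follows the layout documented above.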

Author(s)

Junxiang Wang

References

Kamal Nigam, Andrew McCallum, Sebastian Thrun, Tom Mitchell (1999). Text Classification from Labeled and Unlabeled Documents using EM.

Examples

library(SSL)
data(iris)
xl <- iris[, -5]
# Suppose we know the first twenty observations of each class
# and want to predict the rest with a Gaussian mixture model.
# 1 setosa, 2 versicolor, 3 virginica
yl <- rep(1:3, each = 20)
known.label <- c(1:20, 51:70, 101:120)
xu <- xl[-known.label, ]
xl <- xl[known.label, ]
l <- sslGmmEM(xl, yl, xu)
# Compare the predicted labels with the held-back species labels
table(predicted = l$yu, actual = as.integer(iris$Species)[-known.label])


SSL documentation built on May 29, 2017, 7:14 p.m.
