upclassifymodel: Updated Classification Method using Labeled and Unlabeled...
In upclass: Updated Classification Methods using Unlabeled Data

Description Usage Arguments Details Value Author(s) References See Also Examples

This function implements the EM algorithm by iterating over the E-step and M-step. Initial values are obtained from the labeled data; both steps are then iterated over the complete data, that is, the labeled and unlabeled data combined.

upclassifymodel(Xtrain, cltrain, Xtest, cltest = NULL,
modelName = "EEE", tol = 10^-5, iterlim = 1000, 
Aitken = TRUE, ...)

`Xtrain`	A numeric matrix of observations where rows correspond to observations and columns correspond to variables. The group membership of each observation is known (labeled data).
`cltrain`	A numeric vector with distinct entries representing a classification of the corresponding observations in `Xtrain`.
`Xtest`	A numeric matrix of observations where rows correspond to observations and columns correspond to variables. The group membership of each observation may not be known (unlabeled data).
`cltest`	A numeric vector with distinct entries representing a classification of the corresponding observations in `Xtest`. By default, these are not supplied and the function sets out to obtain them.
`modelName`	A character string indicating the model, with default "EEE". The models available for selection are described in `modelvec`.
`tol`	A positive number, with default `10^-5`, which controls how strictly convergence is defined.
`iterlim`	A positive integer, with default 1000, which is the desired limit on the maximum number of iterations.
`Aitken`	A logical value, with default `TRUE`, indicating whether convergence is tested using Aitken acceleration. If set to `FALSE`, convergence is tested by comparing `tol` to the change in log-likelihood between two consecutive iterations. For further information on Aitken acceleration, see `Aitken`.
`...`	Arguments passed to or from other methods.
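
The two convergence tests selected by the `Aitken` argument can be sketched as follows. This is a minimal illustration of the standard Aitken estimate of the limiting log-likelihood, not the package's own `Aitken` function; the helper names `aitken_estimate` and `has_converged` are hypothetical, and the package may use a slightly different stopping rule.

```r
# Hypothetical helpers for illustration only (not part of upclass).
# `ll` holds the log-likelihoods from three consecutive iterations.
aitken_estimate <- function(ll) {
  stopifnot(length(ll) == 3)
  denom <- ll[2] - ll[1]
  # Ratio of successive increments; set to 0 if the previous step did not move.
  a <- if (denom == 0) 0 else (ll[3] - ll[2]) / denom
  # Aitken's estimate of the log-likelihood at convergence.
  linf <- ll[2] + (ll[3] - ll[2]) / (1 - a)
  list(a = a, linf = linf)
}

has_converged <- function(ll, tol = 10^-5, aitken = TRUE) {
  if (aitken) {
    # Compare the estimated limit with the current log-likelihood.
    abs(aitken_estimate(ll)$linf - ll[3]) < tol
  } else {
    # Compare the change between two consecutive iterations with tol.
    abs(ll[3] - ll[2]) < tol
  }
}

# A geometrically converging sequence whose limit is 0.
ll <- c(-1, -0.5, -0.25)
est <- aitken_estimate(ll)
est$linf            # estimated limit of the sequence
has_converged(ll)   # FALSE: the current value is still far from the limit
```

The Aitken-based test stops only when the current log-likelihood is close to its estimated limit, which guards against stopping early when the log-likelihood is increasing slowly but steadily.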

This is an updated approach to typical classification methods. Initially, the M-step is performed on the labeled (training) data to obtain parameter estimates for the model. These estimates are used in an E-step to obtain group membership probabilities for the unlabeled (test) data. The training data labels and the new probability estimates for the test data labels are combined to form the complete data. From here, the M-step and E-step are iterated over the complete data, updating the estimates at each iteration until convergence is reached. This has been shown to result in lower misclassification rates, particularly when only a small proportion of the total data is labeled (Dean et al., 2006).
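
The procedure above can be sketched in base R. This is a simplified illustration under stated assumptions, not the package's implementation: it assumes integer labels 1, ..., G, a single covariance matrix shared by all groups (in the spirit of the "EEE" model), and a plain log-likelihood difference as the stopping rule. All function names (`log_dmvnorm`, `m_step`, `log_joint`, `semi_supervised_em`) are hypothetical.

```r
# Log-density of a multivariate normal for each row of X.
log_dmvnorm <- function(X, mu, Sigma) {
  R <- chol(Sigma)                                  # Sigma = t(R) %*% R
  Z <- backsolve(R, t(X) - mu, transpose = TRUE)    # solves t(R) z = x - mu
  -0.5 * colSums(Z^2) - sum(log(diag(R))) - 0.5 * ncol(X) * log(2 * pi)
}

# M-step: weighted estimates of proportions, means and the shared covariance.
m_step <- function(X, z) {
  nk <- colSums(z)
  mu <- sweep(crossprod(X, z), 2, nk, "/")          # d x G matrix of means
  Sigma <- matrix(0, ncol(X), ncol(X))
  for (k in seq_along(nk)) {
    Xc <- sweep(X, 2, mu[, k])
    Sigma <- Sigma + crossprod(Xc * z[, k], Xc)
  }
  list(pro = nk / nrow(X), mean = mu, Sigma = Sigma / nrow(X))
}

# n x G matrix of log(pro_k) + log f_k(x_i).
log_joint <- function(X, par) {
  sapply(seq_along(par$pro), function(k)
    log(par$pro[k]) + log_dmvnorm(X, par$mean[, k], par$Sigma))
}

semi_supervised_em <- function(Xtrain, cltrain, Xtest,
                               tol = 10^-5, iterlim = 1000) {
  G <- max(cltrain)
  ztrain <- diag(G)[cltrain, , drop = FALSE]        # known labels as 0/1 rows
  par <- m_step(Xtrain, ztrain)                     # initial M-step, labeled data only
  X <- rbind(Xtrain, Xtest)
  ll_old <- -Inf
  converged <- FALSE
  for (iter in seq_len(iterlim)) {
    # E-step: membership probabilities for the unlabeled data.
    lj <- log_joint(Xtest, par)
    m <- apply(lj, 1, max)
    lse <- m + log(rowSums(exp(lj - m)))            # stable log-sum-exp per row
    ztest <- exp(lj - lse)
    # Observed-data log-likelihood: labeled part plus unlabeled part.
    ljtrain <- log_joint(Xtrain, par)
    ll <- sum(ljtrain[cbind(seq_along(cltrain), cltrain)]) + sum(lse)
    converged <- abs(ll - ll_old) < tol
    if (converged) break
    ll_old <- ll
    # M-step on the complete data: fixed labels plus current probabilities.
    par <- m_step(X, rbind(ztrain, ztest))
  }
  list(parameters = par, ztest = ztest, cl = max.col(ztest),
       ll = ll, iter = iter, converged = converged)
}

# Simulated example: two well-separated groups, only 10 of 100 points labeled.
set.seed(1)
X <- rbind(matrix(rnorm(100, mean = 0), ncol = 2),
           matrix(rnorm(100, mean = 4), ncol = 2))
cl <- rep(1:2, each = 50)
indtrain <- c(1:5, 51:55)
fit <- semi_supervised_em(X[indtrain, ], cl[indtrain], X[-indtrain, ])
mean(fit$cl != cl[-indtrain])   # proportion of misclassified test observations
```

Note that the rows for the labeled data keep their 0/1 memberships throughout, so only the unlabeled rows are re-estimated in each E-step.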

The return value is a list with the following components:

`call`	The function call from `upclassifymodel`.
`Ntrain`	The number of observations in the training data.
`Ntest`	The number of observations in the test data.
`d`	The dimension of the data.
`G`	The number of groups in the data.
`iter`	The number of iterations required to reach convergence. If convergence was not obtained, this is equal to `iterlim`.
`converged`	A logical value where `TRUE` indicates convergence was reached and `FALSE` means `iter` reached `iterlim` without obtaining convergence.
`modelName`	A character string identifying the model (same as the input argument).
`parameters`	A list with the following components:
	`pro`	A vector whose kth component is the mixing proportion for the kth component of the mixture model. If the model includes a Poisson term for noise, there should be one more mixing proportion than the number of Gaussian components.
	`mean`	The mean for each component. If there is more than one component, this is a matrix whose kth column is the mean of the kth component of the mixture model.
	`variance`	A list of variance parameters for the model. The components of this list depend on the model specification.
`train`/`test`	Results for the training and test data, with the following components:
	`z`	A matrix whose `[i,k]`th entry is the conditional probability of the ith observation belonging to the kth component of the mixture.
	`cl`	A numeric vector with distinct entries representing a classification of the corresponding observations in `Xtrain`/`Xtest`.
	`rate`	The number of misclassified observations.
	`Brierscore`	The Brier score measuring the accuracy of the probabilities (`z`s) obtained.
	`tab`	A table of actual and predicted group classifications.
`ll`	The log-likelihood for the data in the mixture model.
`bic`	The Bayesian Information Criterion for the model.
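
As an illustration of how `cl`, `rate`, `Brierscore` and `tab` relate to the probability matrix `z`, the sketch below computes them by hand for a small made-up example. The Brier score here is the mean squared distance between each row of `z` and the 0/1 indicator of the true group, which is one common definition; the package may scale it differently. The values in `z` and `truth` are invented for illustration.

```r
# Made-up membership probabilities for 4 observations and 3 groups.
z <- matrix(c(0.9, 0.1, 0.0,
              0.2, 0.7, 0.1,
              0.1, 0.3, 0.6,
              0.6, 0.4, 0.0), ncol = 3, byrow = TRUE)
truth <- c(1, 2, 3, 2)                 # true group of each observation

# Predicted classification: the group with the highest probability.
pred <- max.col(z, ties.method = "first")

# Number of misclassified observations.
rate <- sum(pred != truth)

# Brier score (one common definition, assumed here).
onehot <- diag(ncol(z))[truth, ]
brier <- mean(rowSums((z - onehot)^2))

# Cross-tabulation of actual and predicted groups.
tab <- table(actual = truth, predicted = pred)

rate
brier
tab
```

Here the fourth observation is assigned to group 1 although it belongs to group 2, so it contributes both to `rate` and, through its poorly calibrated probabilities, to the Brier score.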

Niamh Russell

Fraley, C. and Raftery, A.E. (2002). Model-based clustering, discriminant analysis, and density estimation. Journal of the American Statistical Association 97, 611-631.

Fraley, C. and Raftery, A.E. (2006). MCLUST Version 3 for R: Normal Mixture Modeling and Model-Based Clustering. Technical Report no. 504, Department of Statistics, University of Washington.

Dean, N., Murphy, T.B. and Downey, G. (2006). Using unlabelled data to update classification rules with applications in food authenticity studies. Journal of the Royal Statistical Society: Series C 55 (1), 1-14.

upclassify, Aitken, modelvec

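# This function is not designed to be used on its own,
# but to be called by upclassify().
library(upclass)
data(wine, package = "gclus")
X <- as.matrix(wine[, -1])    # measurements (all columns except the class)
cl <- unclass(wine[, 1])      # known class labels
# Randomly split the 178 observations into labeled and unlabeled sets.
indtrain <- sort(sample(1:178, 120))
indtest <- setdiff(1:178, indtrain)

# The test labels are supplied here so that performance measures can be computed.
fitup <- upclassifymodel(X[indtrain, ], cl[indtrain], X[indtest, ], cl[indtest])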

upclass documentation built on May 29, 2017, 5:12 p.m.
