e_step | R Documentation |
Calculates the expectation of class memberships, and imputes if missing values for a given dataset.
e_step(data, model_obj, start=0, nu = 1.0)
data |
A matrix or data frame such that rows correspond to observations and columns correspond to variables. Note that this function currently only works with multivariate data p > 1. |
start |
Start values in this context are only used for imputation. Non-missing values have their expectation of class memberships calculated directly.
If |
model_obj |
A gpcm_best, vgpcm_best, stpcm_best, ghpcm_best, and salpcm_best object class. |
nu |
deterministic annealing for the class membership E-step. |
This will only work on a dataset with the same dimension as estimated in the model. e_step
will also work for missing values, provided that there is at least one non-missing entry.
Returns a list with the following components:
X |
A matrix of the original dataset plus imputed values if applicable. |
origX |
A matrix of the original dataset including missing values. |
map |
A vector of integers indicating the maximum a posteriori classifications for the best model. |
z |
A matrix giving the raw values upon which |
row_tags |
If there were NAs in the original dataset, a vector of indices referencing the row of the imputed vectors is given. |
Nik Pocuca, Ryan P. Browne and Paul D. McNicholas.
Maintainer: Paul D. McNicholas <mcnicholas@math.mcmaster.ca>
Browne, R.P. and McNicholas, P.D. (2014). Estimating common principal components in high dimensions. Advances in Data Analysis and Classification 8(2), 217-226.
Zhou, H. and Lange, K. (2010). On the bumpy road to the dominant mode. Scandinavian Journal of Statistics 37, 612-631.
Celeux, G., Govaert, G. (1995). Gaussian parsimonious clustering models. Pattern Recognition 28(5), 781-793.
## Not run:
# load dataset and perform model search.
data(x2)
data_in <- matrix(x2,ncol = 2)
mm <- mixture::gpcm(data = data_in,G = 1:7,
start = 0,
veo = FALSE,pprogress=FALSE)
# get best model
best = get_best_model(mm)
best
# lets try imputing some missing data.
x2NA <- x2
x2NA[5,1] <- NA
x2NA[140,2] <- NA
x2NA[99,1] <- NA
# calculate expectation
expect <- e_step(data=x2NA,start = 0,nu = 1.0,model_obj = best)
# plot imputed entries and compare with original
plot(x2,col = "grey")
points(expect$X[expect$row_tags+1,],col = "blue", pch = 20,cex = 2) # blue are imputed values.
points(x2[expect$row_tags+1,], col = "red" , pch = 20,cex = 2) # red are original values.
legend(-2,2,legend = c("imputed","original"),col = c("blue","red"),pch = 20)
## End(Not run)
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.