Package Overview

Implements the Expectation Maximisation (EM) algorithm for clustering multivariate and univariate datasets. Two versions are implemented: EM* (which converges faster by avoiding revisiting the data) and EM. For more details on EM*, see the 'References' section below.
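As a rough illustration of what EM clustering does, the sketch below runs plain EM on a simulated univariate two-component Gaussian mixture using base R only. It is not the package's implementation and does not show the EM* speed-up; the simulated data, the variable names, and the stopping rule (means moving less than 1e-4, at most 50 iterations) are assumptions chosen for the example.

```r
# Minimal univariate EM for a two-component Gaussian mixture (base R only).
# Illustrative sketch; not the DCEM implementation.
set.seed(1)
x <- c(rnorm(100, mean = 0, sd = 1), rnorm(100, mean = 6, sd = 1))

k <- 2
mu <- c(min(x), max(x))      # crude initial means
sigma <- rep(sd(x), k)       # initial standard deviations
prior <- rep(1 / k, k)       # initial mixing weights

for (iter in seq_len(50)) {
  # E-step: posterior probability of each cluster for each point (n x k matrix)
  dens <- sapply(seq_len(k), function(j) prior[j] * dnorm(x, mu[j], sigma[j]))
  post <- dens / rowSums(dens)

  # M-step: re-estimate means, standard deviations and priors
  nk <- colSums(post)
  new_mu <- colSums(post * x) / nk
  sigma <- sqrt(sapply(seq_len(k), function(j) {
    sum(post[, j] * (x - new_mu[j])^2) / nk[j]
  }))
  prior <- nk / length(x)

  # Stop once the means move less than the threshold
  converged <- max(abs(new_mu - mu)) < 1e-4
  mu <- new_mu
  if (converged) break
}

# Hard assignment: the most probable cluster for each point
membership <- max.col(post)
```

The quantities computed here (posterior probabilities, means, standard deviations, priors, membership) correspond in spirit to the outputs the package reports.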

The package has been tested with both real and simulated datasets. It comes bundled with a dataset for demonstration (ionosphere_data.csv). For more help, type ?DCEM in the R console after installing the package.

Currently, data imputation is not supported; the user has to handle missing data before using the package.
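Two common ways of handling missing values in base R before clustering are sketched below. The toy data frame and its column names are hypothetical, and which strategy is appropriate depends on the dataset.

```r
# Toy data frame with missing values (hypothetical example data).
raw <- data.frame(a = c(1.2, NA, 3.1, 4.8), b = c(0.5, 0.7, NA, 0.9))

# Option 1: drop every row that contains at least one NA.
complete <- raw[complete.cases(raw), ]

# Option 2: replace each NA with the mean of its column (simple mean imputation).
imputed <- as.data.frame(lapply(raw, function(col) {
  col[is.na(col)] <- mean(col, na.rm = TRUE)
  col
}))
```

Either result contains no missing values and can then be passed on for clustering.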


For any Bug Fixes/Feature Update(s)

Parichit Sharma

For Reporting Issues


GitHub Repository Link

Github Repository

Installation Instructions

Dependencies

First, install all the required packages as follows:

install.packages(c("matrixcalc", "mvtnorm", "MASS", "Rcpp"))

Installing from CRAN

install.packages("DCEM")

Installing from the Source Package

R CMD INSTALL DCEM_2.0.3.tar.gz

How to use the Package (Example: Working with the default bundled dataset)

# Read the bundled dataset; the path is relative to the current working directory.
ionosphere_data = read.csv2(
  file = paste(trimws(getwd()),"/data/","ionosphere_data.csv",sep = ""),
  sep = ",",
  header = FALSE,
  stringsAsFactors = FALSE
)

Paste the code below into the R session to clean the dataset.

ionosphere_data =  trim_data("35,2", ionosphere_data)
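To see the effect of this step without the package, here is a base R sketch. It assumes that trim_data drops the columns whose indices are listed in its first argument (here 35 and 2); the toy data frame is a hypothetical stand-in for the real dataset.

```r
# Hypothetical stand-in for the ionosphere data: 3 rows, 35 columns.
toy <- as.data.frame(matrix(seq_len(3 * 35), nrow = 3))

# Parse the comma-separated index string and drop those columns.
cols_to_drop <- as.integer(strsplit("35,2", ",")[[1]])
trimmed <- toy[, -cols_to_drop]
```

Under that assumption, the trimmed data frame keeps 33 of the 35 columns.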

Paste the code below into the R session to call the dcem_train() function.

dcem_out = dcem_train(data = ionosphere_data, threshold = 0.0001, iteration_count = 50, num_clusters = 2)

The returned object (dcem_out) is a list with the following elements:

          [1] Posterior probabilities: dcem_out$prob: A matrix of posterior probabilities for the
              points in the dataset.

          [2] Means (meu): dcem_out$meu

              For multivariate data: A matrix of means. Each row in the
              matrix corresponds to one mean.

              For univariate data: A vector of means. Each element of the vector
              corresponds to one mean.

          [3] Covariance matrices / standard deviations: dcem_out$sigma

              For multivariate data: A list of covariance matrices.

              For univariate data: A vector of standard deviations.

          [4] Priors: dcem_out$prior: A vector of priors.

          [5] Membership: dcem_out$membership: A vector of cluster memberships for the data.
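
The elements listed above can be inspected as in the following sketch. It uses simulated two-dimensional data rather than the bundled dataset and only calls dcem_train() if the DCEM package is installed; the simulated data frame (sim) and its column names are assumptions made for the example, and the printed values will vary between runs.

```r
# Simulate two well-separated 2-D clusters (hypothetical example data).
set.seed(42)
sim <- data.frame(
  x = c(rnorm(100, mean = 0), rnorm(100, mean = 5)),
  y = c(rnorm(100, mean = 0), rnorm(100, mean = 5))
)

# Run the clustering only when the DCEM package is available.
if (requireNamespace("DCEM", quietly = TRUE)) {
  dcem_out <- DCEM::dcem_train(data = sim, threshold = 0.0001,
                               iteration_count = 50, num_clusters = 2)

  print(dcem_out$meu)                # estimated cluster means
  print(dcem_out$sigma)              # covariance matrices (multivariate case)
  print(dcem_out$prior)              # mixing proportions
  print(dim(dcem_out$prob))          # shape of the posterior probability matrix
  print(table(dcem_out$membership))  # number of points assigned to each cluster
}
```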

How to access the help (after installing the package)

?DCEM


DCEM documentation built on Aug. 2, 2020, 9:07 a.m.