dcem_predict: dcem_predict: Part of DCEM package.

Description Usage Arguments Value References Examples

View source: R/dcem_predict.R

Description

Predict the cluster membership of test data based on the learned parameters i.e, output from dcem_train or dcem_star_train.

Usage

1
dcem_predict(param_list, data)

Arguments

param_list

(list): List of distribution parameters. The list contains the learned parameteres of the distribution.

data

(vector or dataframe): A vector of data for univariate data. A dataframe (rows represent the data and columns represent the features) for multivariate data.

Value

A list containing the cluster membership for the test data.

References

Parichit Sharma, Hasan Kurban, Mehmet Dalkilic DCEM: An R package for clustering big data via data-centric modification of Expectation Maximization, SoftwareX, 17, 100944 URL https://doi.org/10.1016/j.softx.2021.100944

Examples

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
# Simulating a mixture of univariate samples from three distributions
# with meu as 20, 70 and 100 and standard deviation as 10, 100 and 40 respectively.
sample_uv_data = as.data.frame(c(rnorm(100, 20, 5), rnorm(70, 70, 1), rnorm(50, 100, 2)))

# Select first few points from each distribution as test data
test_data = as.vector(sample_uv_data[c(1:5, 101:105, 171:175),])

# Remove the test data from the training set
sample_uv_data = as.data.frame(sample_uv_data[-c(1:5, 101:105, 171:175), ])

# Randomly shuffle the samples.
sample_uv_data = as.data.frame(sample_uv_data[sample(nrow(sample_uv_data)),])

# Calling the dcem_train() function on the simulated data with threshold of
# 0.000001, iteration count of 1000 and random seeding respectively.
sample_uv_out = dcem_train(sample_uv_data, num_clusters = 3, iteration_count = 100,
threshold = 0.001)

# Predict the membership for test data
test_data_membership <- dcem_predict(sample_uv_out, test_data)

# Access the output
print(test_data_membership)

parichit/DCEM documentation built on Jan. 22, 2022, 6:13 p.m.