moSN: Finite Mixture of Spherical Normal Distributions


moSN    R Documentation

Description

For n observations on a (p-1)-sphere in \mathbf{R}^p, fit a finite mixture model whose components are spherical normal (SN) distributions. The mixture density is

f(x; \left\lbrace w_k, \mu_k, \lambda_k \right\rbrace_{k=1}^K) = \sum_{k=1}^K w_k SN(x; \mu_k, \lambda_k)

with parameters w_k's for component weights, \mu_k's for component locations, and \lambda_k's for component concentrations.
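
As a rough illustration of the mixture density above, the following base-R sketch evaluates it without the Riemann package. It assumes the usual form of the spherical normal density, SN(x; mu, lambda) = exp(-lambda * d(x, mu)^2 / 2) / Z(lambda), where d is the geodesic (arc-length) distance and Z(lambda) is obtained by numerical integration in polar coordinates. This form is not stated on this page, and the helper names `sn_normconst`, `dsn`, and `dmix_sn` are hypothetical; this is not the package's implementation.

```r
# Normalizing constant on the (p-1)-sphere (assumed form of the SN density).
sn_normconst <- function(lambda, p) {
  area <- 2 * pi^((p - 1) / 2) / gamma((p - 1) / 2)  # surface area of S^(p-2)
  integrand <- function(r) exp(-lambda * r^2 / 2) * sin(r)^(p - 2)
  area * integrate(integrand, lower = 0, upper = pi)$value
}

# SN density at the rows of x (unit vectors) for one component.
dsn <- function(x, mu, lambda) {
  x <- matrix(x, ncol = length(mu))
  inner <- pmin(pmax(as.vector(x %*% mu), -1), 1)    # guard against rounding
  d <- acos(inner)                                   # geodesic distance
  exp(-lambda * d^2 / 2) / sn_normconst(lambda, length(mu))
}

# Mixture density: weighted sum of the component densities.
dmix_sn <- function(x, weights, centers, lambdas) {
  comp <- sapply(seq_along(weights), function(k) {
    weights[k] * dsn(x, centers[k, ], lambdas[k])
  })
  rowSums(matrix(comp, ncol = length(weights)))
}

# Toy two-component mixture on the 2-sphere in R^3.
w       <- c(0.3, 0.7)
centers <- rbind(c(0, 0, 1), c(1, 0, 0))
pts     <- rbind(c(0, 0, 1), c(0, 1, 0))
dmix_sn(pts, w, centers, lambdas = c(5, 5))
```

With lambda = 0 the assumed density reduces to the uniform density on the sphere, which is a quick sanity check on the normalizing constant.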

Usage

moSN(
  data,
  k = 2,
  same.lambda = FALSE,
  variants = c("soft", "hard", "stochastic"),
  ...
)

## S3 method for class 'moSN'
loglkd(object, newdata)

## S3 method for class 'moSN'
label(object, newdata)

## S3 method for class 'moSN'
density(object, newdata)

Arguments

data

data vectors in the form of either an (n\times p) matrix or a length-n list. See wrap.sphere for descriptions of supported input types.

k

the number of clusters (default: 2).

same.lambda

a logical; TRUE to use the same concentration parameter across all components, or FALSE otherwise.

variants

type of the class assignment method, one of "soft", "hard", and "stochastic".

...

extra parameters including

maxiter

the maximum number of iterations (default: 50).

eps

stopping criterion for the EM algorithm (default: 1e-6).

printer

a logical; TRUE to show history of the algorithm, FALSE otherwise.

object

a fitted moSN model from the moSN function.

newdata

data vectors in the form of either an (m\times p) matrix or a length-m list. See wrap.sphere for descriptions of supported input types.
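
The three values of variants can be pictured with a small base-R sketch. It assumes the common reading of these names: "soft" keeps the posterior membership probabilities, "hard" assigns each observation to its most probable component, and "stochastic" samples a label from the posterior. The page does not spell out these rules, so both this interpretation and the helper `assign_step` are assumptions for illustration only.

```r
# Hypothetical helper: turn an (n x k) posterior matrix into the
# membership used by each assignment rule.
assign_step <- function(post, variant = c("soft", "hard", "stochastic")) {
  variant <- match.arg(variant)
  k <- ncol(post)
  if (variant == "soft") return(post)               # keep probabilities as-is
  labels <- if (variant == "hard") {
    max.col(post, ties.method = "first")            # most probable component
  } else {
    apply(post, 1, function(pr) sample.int(k, size = 1, prob = pr))
  }
  out <- matrix(0, nrow(post), k)                   # one-hot membership
  out[cbind(seq_len(nrow(post)), labels)] <- 1
  out
}

post <- rbind(c(0.9, 0.1), c(0.2, 0.8))
assign_step(post, "hard")
set.seed(1)
assign_step(post, "stochastic")
```

In all three cases every row of the result sums to 1; only the "soft" rule keeps fractional memberships.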

Value

a named list of S3 class riemmix containing

cluster

a length-n vector of class labels (from 1:k).

loglkd

log likelihood of the fitted model.

criteria

a vector of information criteria.

parameters

a list containing proportion, center, and concentration. See the section "Parameters of the fitted model" for more details.

membership

an (n\times k) row-stochastic matrix of membership.
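
The page does not say which information criteria are returned. As a minimal sketch, assuming AIC and BIC with one free parameter per independent weight, (p-1) per center on the sphere, and one concentration per component (or a single shared one), they could be computed from the log-likelihood as follows. The helper `mix_criteria` and the parameter count are assumptions, not the package's definition.

```r
# Hypothetical helper: AIC and BIC for a k-component mixture on the
# (p-1)-sphere, given the maximized log-likelihood and sample size n.
mix_criteria <- function(loglkd, n, k, p, same.lambda = FALSE) {
  n_par <- (k - 1) +                      # weights sum to 1
    k * (p - 1) +                         # each center lies on the sphere
    if (same.lambda) 1 else k             # concentration parameter(s)
  c(AIC = -2 * loglkd + 2 * n_par,
    BIC = -2 * loglkd + log(n) * n_par)
}

mix_criteria(loglkd = -100, n = 60, k = 3, p = 3)
```

Smaller values indicate a better trade-off between fit and complexity under this convention.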

Parameters of the fitted model

A fitted model is characterized by three parameters. For a k-component mixture model on a (p-1)-sphere in \mathbf{R}^p, (1) proportion is a length-k vector of component weights that sum to 1, (2) center is a (k\times p) matrix whose rows are cluster centers, and (3) concentration is a length-k vector of concentration parameters, one for each component.
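
The shapes described above can be checked with a short base-R sketch. The list below is a hand-made toy with the three documented names, not output from moSN, and `check_sn_parameters` is a hypothetical helper; the unit-norm and positivity checks are assumptions that follow from the centers lying on the sphere and the concentrations acting as precision-like parameters.

```r
# Hypothetical validator for a parameters list with the documented fields.
check_sn_parameters <- function(par, tol = 1e-8) {
  k <- length(par$proportion)
  stopifnot(
    abs(sum(par$proportion) - 1) < tol,               # weights sum to 1
    all(par$proportion >= 0),
    is.matrix(par$center), nrow(par$center) == k,     # (k x p) matrix
    all(abs(sqrt(rowSums(par$center^2)) - 1) < tol),  # rows are unit vectors
    length(par$concentration) == k,
    all(par$concentration > 0)
  )
  invisible(TRUE)
}

# Toy parameters for k = 3 components on the 2-sphere in R^3.
par3 <- list(
  proportion    = c(0.5, 0.3, 0.2),
  center        = rbind(c(1, 0, 0), c(0, 1, 0), c(0, 0, 1)),
  concentration = c(10, 5, 2)
)
check_sn_parameters(par3)
```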

Note on S3 methods

There are three S3 methods: loglkd, label, and density. Given a random sample of size m as newdata, (1) loglkd returns a scalar value of the computed log-likelihood, (2) label returns a length-m vector of cluster assignments, and (3) density returns a length-m vector with the density of each observation under the fitted model.
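
How the three quantities relate can be sketched in base R from a matrix of weighted component densities. The numbers below are made up, and the assumption that label picks the component with the largest posterior probability is an illustration rather than a statement about the package's internals.

```r
# comp[i, j] = w_j * SN(x_i; mu_j, lambda_j) for m = 3 points, k = 2 components
# (toy values, not computed from real data).
comp <- rbind(c(0.20, 0.05),
              c(0.01, 0.30),
              c(0.10, 0.10))

dens <- rowSums(comp)                          # mixture density per observation
ll   <- sum(log(dens))                         # log-likelihood of the sample
post <- comp / dens                            # posterior membership, rows sum to 1
lab  <- max.col(post, ties.method = "first")   # most probable component

dens
ll
lab
```

Ties, as in the third row, are broken here in favor of the first component; a real implementation may break them differently.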

References

\insertRef{you_2022_ParameterEstimationModelbasedRiemann}{Riemann}

Examples


# ---------------------------------------------------- #
#                 FITTING THE MODEL
# ---------------------------------------------------- #
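# Load the 'cities' data and take its Cartesian coordinates (one row per city)
data(cities)
locations = cities$cartesian
# Convert each point to geographic coordinates, used only for 2D plotting
embed2    = array(0,c(60,2))
for (i in 1:60){
   embed2[i,] = sphere.xyz2geo(locations[i,])
}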

# Fit the model with different numbers of clusters
k2 = moSN(locations, k=2)
k3 = moSN(locations, k=3)
k4 = moSN(locations, k=4)

# Visualize
opar <- par(no.readonly=TRUE)
par(mfrow=c(1,3))
plot(embed2, col=k2$cluster, pch=19, main="K=2")
plot(embed2, col=k3$cluster, pch=19, main="K=3")
plot(embed2, col=k4$cluster, pch=19, main="K=4")
par(opar)

# ---------------------------------------------------- #
#                   USE S3 METHODS
# ---------------------------------------------------- #
# Use the same 'locations' data as new data 
# (1) log-likelihood
newloglkd = round(loglkd(k3, locations), 3)
print(paste0("Log-likelihood for K=3 model fit : ", newloglkd))

# (2) label
newlabel = label(k3, locations)

# (3) density
newdensity = density(k3, locations)



Riemann documentation built on March 18, 2022, 7:55 p.m.