fit2clusters: Flexible two-cluster mixture fit of a numeric vector


View source: R/fit2clusters.R

Description

fit2clusters uses an ECM algorithm to fit a two-component mixture model. It is more flexible than mclust in some ways, but it only deals with one-dimensional data.

Usage

  fit2clusters(Y, Ylabel = "correlation", Ysigsq,
    piStart = c(0.5, 0.5), VStart = c(0.1, 0.1),
    psiStart = c(0, 0.1), NinnerLoop = 1, nReps = 500,
    psi0Constraint, V0Constraint, sameV = FALSE,
    estimatesOnly = TRUE, plotMe = TRUE, testMe = FALSE,
    Ntest = 5000, simPsi = c(0, 0.4), simPi = c(2/3, 1/3),
    simV = c(0.05^2, 0.05^2), simAlpha = 5, simBeta = 400,
    seed, ...)
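As a rough illustration of the signature above, the following sketch builds a small numeric vector and calls the function with its documented defaults, apart from plotMe. The input values are made up for illustration, and the call is guarded so that the snippet still runs when the IdMappingAnalysis package is not installed.

```r
# Illustrative inputs only; these numbers are not from the package.
set.seed(1)
Y <- c(rnorm(200, mean = 0, sd = 0.15), rnorm(100, mean = 0.4, sd = 0.15))
Ysigsq <- rep(0.1^2, length(Y))   # assumed known measurement variances

# Run only when the package is installed; defaults are kept except plotMe.
if (requireNamespace("IdMappingAnalysis", quietly = TRUE)) {
  estimates <- IdMappingAnalysis::fit2clusters(Y, Ysigsq = Ysigsq,
                                               plotMe = FALSE)
  print(estimates)
}
```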

Arguments

Y

The vector of numbers to fit.

Ysigsq

The vector of variance estimates for Y.

Ylabel

Label for the Y axis in a density fit figure.

piStart

Starting values for the component proportions.

VStart

Starting values for the component variances.

psiStart

Starting values for the component means.

NinnerLoop

Number of iterations in the "C" loop of ECM.

nReps

Upper limit on the number of EM steps.

psi0Constraint

If not missing, a fixed value for the first component mean.

V0Constraint

If not missing, a fixed value for the first component variance.

sameV

If TRUE, the components have the same variance.

estimatesOnly

If TRUE, return only the estimates. Otherwise, return per-observation details, with the estimates attached as an attribute.

plotMe

If TRUE, plot the mixture density and kernel smooth estimates.

testMe

If TRUE, run a code test.

Ntest

For testing purposes, the number of replications of simulated data.

simPsi

For testing purposes, the true means.

simPi

For testing purposes, the true proportions.

simV

For testing purposes, the true variances.

simAlpha

For testing purposes, the alpha parameter passed to rgamma for the measurement error variance.

simBeta

For testing purposes, the beta parameter passed to rgamma for the measurement error variance.

seed

For testing purposes, random seed.

...

Not used; testing roxygen2.
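The sim* arguments suggest how test data might be generated. The sketch below is one plausible reading, not the package's own test code: simulateTwoClusters is a hypothetical helper, Ntest is treated as the number of simulated observations, and simAlpha and simBeta are assumed to be the shape and rate of the gamma distribution for the measurement error variances.

```r
# Hypothetical helper mirroring the documented sim* defaults.
simulateTwoClusters <- function(Ntest = 5000, simPsi = c(0, 0.4),
                                simPi = c(2/3, 1/3),
                                simV = c(0.05^2, 0.05^2),
                                simAlpha = 5, simBeta = 400, seed) {
  if (!missing(seed)) set.seed(seed)
  # Component membership (1 or 2), drawn with probabilities simPi.
  component <- sample(1:2, Ntest, replace = TRUE, prob = simPi)
  # Assumption: simAlpha is the gamma shape and simBeta the gamma rate.
  Ysigsq <- rgamma(Ntest, shape = simAlpha, rate = simBeta)
  # True value per observation, then add measurement error.
  trueValue <- rnorm(Ntest, mean = simPsi[component],
                     sd = sqrt(simV[component]))
  Y <- trueValue + rnorm(Ntest, mean = 0, sd = sqrt(Ysigsq))
  data.frame(Y = Y, Ysigsq = Ysigsq, component = component)
}

sim <- simulateTwoClusters(seed = 1)
head(sim)
```

Under the shape/rate reading, the default measurement variances average simAlpha / simBeta = 0.0125, which is a sensible scale for correlations. Under a shape/scale reading they would average 2000, so the rate interpretation seems the more likely one.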

Details

See the document "ECM_algorithm_for_two_clusters.pdf".
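For readers without access to that document, the sketch below shows the general shape of an ECM iteration for this kind of model. It is a simplified illustration, not the package's implementation: ecmSketch is a hypothetical helper, the model Y[i] | component k ~ Normal(psi[k], V[k] + Ysigsq[i]) is an assumption inferred from the arguments, and the constraint options and NinnerLoop are left out.

```r
# Hypothetical helper: a simplified ECM-style fit, NOT the package's code.
# Assumed model: Y[i] | component k ~ Normal(psi[k], V[k] + Ysigsq[i]).
ecmSketch <- function(Y, Ysigsq, prop = c(0.5, 0.5), psi = c(0, 0.1),
                      V = c(0.1, 0.1), nReps = 500, tol = 1e-8) {
  mixLogLik <- function(prop, psi, V) {
    sum(log(prop[1] * dnorm(Y, psi[1], sqrt(V[1] + Ysigsq)) +
            prop[2] * dnorm(Y, psi[2], sqrt(V[2] + Ysigsq))))
  }
  logLik <- mixLogLik(prop, psi, V)
  iter <- 0
  while (iter < nReps) {
    iter <- iter + 1
    # E step: posterior probability of each component per observation.
    d1 <- prop[1] * dnorm(Y, psi[1], sqrt(V[1] + Ysigsq))
    d2 <- prop[2] * dnorm(Y, psi[2], sqrt(V[2] + Ysigsq))
    z <- cbind(d1, d2) / (d1 + d2)
    # CM step 1: update the component proportions.
    prop <- unname(colMeans(z))
    for (k in 1:2) {
      # CM step 2: update the mean with V[k] fixed (precision-weighted mean).
      w <- z[, k] / (V[k] + Ysigsq)
      psi[k] <- sum(w * Y) / sum(w)
      # CM step 3: update the variance with psi[k] fixed (1-D search).
      negQ <- function(v) {
        sum(z[, k] * (log(v + Ysigsq) + (Y - psi[k])^2 / (v + Ysigsq)))
      }
      V[k] <- optimize(negQ, interval = c(1e-8, var(Y)))$minimum
    }
    newLogLik <- mixLogLik(prop, psi, V)
    converged <- abs(newLogLik - logLik) < tol
    logLik <- newLogLik
    if (converged) break
  }
  list(prop = prop, psi = psi, V = V, logLik = logLik, iterations = iter)
}

# Small illustrative run on made-up data.
set.seed(42)
Y <- c(rnorm(400, 0, 0.12), rnorm(200, 0.4, 0.12))
Ysigsq <- rep(0.1^2, length(Y))
fit <- ecmSketch(Y, Ysigsq)
str(fit)
```

The conditional steps are used here because, with observation-specific measurement variances, the variance update has no closed form once the mean is also free. Updating one parameter at a time keeps each step simple.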

Value

If estimatesOnly is TRUE, only the estimates are returned. Otherwise, a data frame of per-observation details is returned, with the estimates attached as an attribute. The estimates are:

pi1

The probability of the second mixture component.

psi0

The mean of the first component (psi0Constraint if provided).

psi1

The mean of the second component.

Var0

The variance of the first component (V0Constraint if provided).

Var1

The variance of the second component.

The per-observation details are:

Y

The original observations.

Ysigsq

The original measurement variances.

posteriorOdds

Posterior odds of being in component 2 of the mixture.

postProbVar

Estimated variance of the posterior probability, using the delta method.
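To make the posterior odds concrete, the sketch below computes them from a set of estimates under the assumed model Y[i] | component k ~ Normal(psi_k, Var_k + Ysigsq[i]). posteriorOddsSketch is a hypothetical helper and the parameter values are illustrative. The package may compute this quantity differently.

```r
# Hypothetical helper; assumes each component density is Normal with
# variance equal to the component variance plus the measurement variance.
posteriorOddsSketch <- function(Y, Ysigsq, pi1, psi0, psi1, Var0, Var1) {
  d0 <- (1 - pi1) * dnorm(Y, mean = psi0, sd = sqrt(Var0 + Ysigsq))
  d1 <- pi1 * dnorm(Y, mean = psi1, sd = sqrt(Var1 + Ysigsq))
  d1 / d0   # odds of the second component versus the first
}

odds <- posteriorOddsSketch(Y = c(0, 0.2, 0.4), Ysigsq = rep(0.01, 3),
                            pi1 = 1/3, psi0 = 0, psi1 = 0.4,
                            Var0 = 0.05^2, Var1 = 0.05^2)
postProb <- odds / (1 + odds)   # convert odds to posterior probabilities
print(cbind(odds, postProb))
```

With equal component variances, an observation midway between the two means has equal component densities, so its posterior odds reduce to the prior odds pi1 / (1 - pi1).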


IdMappingAnalysis documentation built on Oct. 31, 2019, 3:30 a.m.