fitMixDist  R Documentation 
This function performs the nonlinear fit of mixture
distributions, exploiting a first approach based on parameterized finite
Gaussian mixture models obtained through the function
Mclust from package mclust.
fitMixDist(
X,
args = list(norm = c(mean = NA, sd = NA), weibull = c(shape = NA, scale = NA)),
dens = TRUE,
npoints = NULL,
kmean = FALSE,
maxiter = 1024,
prior = priorControl(),
ftol = 1e-14,
ptol = 1e-14,
maxfev = 1e+05,
equalPro = FALSE,
eps = .Machine$double.eps,
tol = c(1e-05, sqrt(.Machine$double.eps)),
usepoints,
iter.max = 10,
nstart = 1,
algorithm = c("Hartigan-Wong", "Lloyd", "Forgy", "MacQueen"),
seed = 123,
verbose = TRUE,
...
)
X 
Numerical vector. It is the user's responsibility to provide 'X' values that belong to the definition domain of the functions in the mixture distribution. 
dens 
Logical. Whether to fit the 'PDF' or the 'CDF'. Default is TRUE. 
npoints 
number of points used in the fit of the density function, or
NULL. These are used as histogram break points to estimate the empirical
density values. If npoints = NULL and dens = TRUE, then a
Kernel Density Estimation function is used. 
kmean 
Logic. Whether to use 
maxiter 
positive integer. Termination occurs when the number of iterations reaches maxiter. Default value: 1024. 
prior 
Same as in 
ftol 
nonnegative numeric. Termination occurs when both the actual and predicted relative reductions in the sum of squares are at most ftol. Therefore, ftol measures the relative error desired in the sum of squares. Default value: 1e-14. 
ptol 
nonnegative numeric. Termination occurs when the relative error between two consecutive iterates is at most ptol. Therefore, ptol measures the relative error desired in the approximate solution. Default value: 1e-14. 
maxfev 
Integer; termination occurs when the number of calls to fn has reached maxfev. Note that nls.lm sets the value of maxfev to 100*(length(par) + 1) if maxfev = integer(), where par is the list or vector of parameters to be optimized. 
equalPro 
An argument to pass to 
eps, tol 
Arguments to pass to 
usepoints 
Integer. Computation by function

iter.max, nstart, algorithm 
Same as in 
seed 
Seed for random number generation. 
verbose 
if TRUE, prints the function log to stdout and a progress bar 
... 
Further arguments to pass to other functions like

arg 
A list of named vectors with the corresponding named distribution
parameters values. The names of the vector of parameters and the
parameter names must correspond to defined functions. For example, if
one of the involved distributions is the gamma density (see
Notice that the distribution given names correspond to the rootnames as
given for R functions. For example, 'gamma' is the rootname for
functions 
fit.comp 
Logical. If FALSE, then starting parameter values for each
mixture component will be estimated using function 
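The root-name convention mentioned for the parameter list can be illustrated with base R alone. The sketch below is only an assumption about how such names could be resolved, not the package's actual code: it prepends the standard prefixes "d" and "r" to a root name and passes the named parameter vector on to the resulting function. The variable names (root, pars, dfun, rfun) are hypothetical.

```r
# Hypothetical illustration of the root-name convention (base R only)
root <- "gamma"                       # distribution root name
pars <- c(shape = 2, scale = 0.1)     # named parameter vector, as in the list above

dfun <- match.fun(paste0("d", root))  # resolves to dgamma (density)
rfun <- match.fun(paste0("r", root))  # resolves to rgamma (random generation)

# The names in 'pars' must match argument names of the resolved functions
dens_at_x <- do.call(dfun, c(list(0.2), as.list(pars)))

set.seed(1)
draws <- do.call(rfun, c(list(5), as.list(pars)))
```

If a parameter name does not match an argument of the resolved function, do.call() stops with an "unused argument" error, which is why the names in each vector matter.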
The approach tries to fit the proposed mixture distributions using a
modification of the Levenberg-Marquardt algorithm implemented in function
nls.lm from the minpack.lm package, which is
used to perform the nonlinear fit. Cross-validations for the nonlinear
regressions (R.Cross.val) are performed as described in reference [1]. In
addition, Stein's formula for adjusted R squared (rho) is used as an
estimator of the average cross-validation predictive power [1]. Notice
that the parameter values must be given in a way understandable
by the set of functions mixtdistr
(see the example below).
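The general idea of fitting a mixture density to an empirical density by least squares can be sketched with base R only. This is a minimal sketch under stated assumptions, not the implementation used by fitMixDist: the general-purpose optimizer optim (Nelder-Mead) stands in for nls.lm so that no extra package is needed, the empirical density is taken from histogram mid-points, and the helper names mix_pdf, rss, and start are hypothetical.

```r
set.seed(1)

# Simulate a two-component mixture: 60% gamma, 40% Weibull
n <- 5000
is_gamma <- runif(n) < 0.6
x <- ifelse(is_gamma,
            rgamma(n, shape = 2, scale = 0.1),
            rweibull(n, shape = 3, scale = 0.5))

# Empirical density estimated at histogram mid-points
h <- hist(x, breaks = 100, plot = FALSE)
emp <- data.frame(x = h$mids, y = h$density)

# Mixture PDF on an unconstrained scale:
# theta[1] is the logit of the gamma proportion,
# theta[2:5] are the logs of the (positive) shape and scale parameters
mix_pdf <- function(theta, x) {
  phi <- plogis(theta[1])
  p <- exp(theta[2:5])
  phi * dgamma(x, shape = p[1], scale = p[2]) +
    (1 - phi) * dweibull(x, shape = p[3], scale = p[4])
}

# Residual sum of squares between empirical and model densities
rss <- function(theta) sum((emp$y - mix_pdf(theta, emp$x))^2)

# Starting values (assumed); the result depends on them
start <- c(qlogis(0.5), log(c(2.3, 0.12, 2.5, 0.4)))
fit <- optim(start, rss, control = list(maxit = 2000))

# Back-transform the estimates to the natural scale
est_phi <- plogis(fit$par[1])
est_par <- exp(fit$par[2:5])
```

The log and logit transforms keep the proportions and the shape and scale parameters inside their valid ranges without explicit bounds.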
It is the user's responsibility to provide 'X' values that belong to the
definition domain of the functions in the mixture distribution.
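For reference, Stein's formula for the adjusted R squared, as given in [1], can be written as a short function. This is only a sketch: the function name stein_rho is hypothetical, and treating k as the number of fitted parameters is an assumption made here for illustration.

```r
# Stein's formula for the adjusted R squared (rho).
# r2: coefficient of determination of the fit
# n:  number of observations used in the fit
# k:  number of predictors (assumed here: number of fitted parameters)
stein_rho <- function(r2, n, k) {
  stopifnot(n > k + 2)
  1 - ((n - 1) / (n - k - 1)) *
    ((n - 2) / (n - k - 2)) *
    ((n + 1) / n) * (1 - r2)
}

rho <- stein_rho(r2 = 0.9, n = 100, k = 4)
```

The correction shrinks R squared more strongly when n is small relative to k, so rho never exceeds the unadjusted value.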
A list with the model table with coefficients and goodness-of-fit
results, the fitted model returned by function nls.lm, and a named
list of fitted arguments.
Robersy Sanchez (https://genomaths.com).
1. Stevens JP. Applied Multivariate Statistics for the Social Sciences. Fifth Edition. Routledge Academic; 2009.
fitdistr, fitCDF, mixtdistr, and mcgoftest.
set.seed(1) # set a seed for random generation
## ========= A mixture of two distributions =========
phi <- c(6 / 10, 4 / 10) # Mixture proportions
## 
## === Named vector of the corresponding distribution function parameters
## must be provided
args <- list(
gamma = c(shape = 2, scale = 0.1),
weibull = c(shape = 3, scale = 0.5)
)
## 
## ===== Sampling from the specified mixture distribution ====
X <- rmixtdistr(n = 1e5, phi = phi, arg = args)
## 
## ===== Nonlinear fit of the specified mixture distribution ====
FIT <- fitMixDist(X, args = list(
gamma = c(shape = NA, scale = NA),
weibull = c(shape = NA, scale = NA)
))
## === The graphics for the simulated dataset and the corresponding
## theoretical mixture distribution.
par(bg = "gray98", mar = c(3, 4, 2, 1))
hist(X, 90,
freq = FALSE, las = 1, ylim = c(0, 5), xlim = c(0, 1),
panel.first = {
points(0, 0, pch = 16, cex = 1e6, col = "grey95")
grid(col = "white", lty = 1)
},
family = "serif", col = rgb(0, 0, 1, 0.),
border = "deepskyblue", main = "Histogram of Mixture Distribution"
)
x1 <- seq(-4, 10, by = 0.001)
lines(x1, dmixtdistr(x1, phi = phi, arg = args), col = "red")
lines(x1, dmixtdistr(x1, phi = FIT$phi, arg = FIT$args), col = "blue")
legend(0.5, 4,
legend = c(
"Theoretical Mixture PDF",
"Estimated Mixture PDF",
"Gamma distribution component",
"Weibull distribution component"
),
col = c("red", "blue", "magenta", "brown"), lty = 1, cex = 0.8
)
## The standard definition of dgamma function (see ?dgamma)
lines(x1, dgamma(x1,
shape = FIT$args$gamma[1],
scale = FIT$args$gamma[2]
), col = "magenta")
## The standard definition of dweibull function (see ?dweibull)
lines(x1, dweibull(x1,
shape = FIT$args$weibull[1],
scale = FIT$args$weibull[2]
), col = "brown")
## The accuracy of the fitting depends on the starting values
FIT <- fitMixDist(X, args = list(
gamma = c(shape = 2.3, scale = 0.12),
weibull = c(shape = 2.5, scale = 0.4)
))
FIT$args
FIT$phi