MixNRMI1: Normalized Random Measures Mixture of Type I

View source: R/MixNRMI1.R

MixNRMI1R Documentation

Normalized Random Measures Mixture of Type I

Description

Bayesian nonparametric estimation based on normalized measures driven mixtures for locations.

Usage

MixNRMI1(
  x,
  probs = c(0.025, 0.5, 0.975),
  Alpha = 1,
  Kappa = 0,
  Gama = 0.4,
  distr.k = "normal",
  distr.p0 = 1,
  asigma = 0.5,
  bsigma = 0.5,
  delta_S = 3,
  delta_U = 2,
  Meps = 0.01,
  Nx = 150,
  Nit = 1500,
  Pbi = 0.1,
  epsilon = NULL,
  printtime = TRUE,
  extras = TRUE,
  adaptive = FALSE
)

Arguments

x

Numeric vector. Data set to which the density is fitted.

probs

Numeric vector. Desired quantiles of the density estimates.

Alpha

Numeric constant. Total mass of the centering measure. See details.

Kappa

Numeric positive constant. See details.

Gama

Numeric constant. 0\leq \texttt{Gama} \leq 1. See details.

distr.k

The distribution name for the kernel. Allowed names are "normal", "gamma", "beta", "double exponential", "lognormal" or their common abbreviations "norm", "exp", or an integer number identifying the mixture kernel: 1 = Normal; 2 = Gamma; 3 = Beta; 4 = Double Exponential; 5 = Lognormal.

distr.p0

The distribution name for the centering measure. Allowed names are "normal", "gamma", "beta", or their common abbreviations "norm", "exp", or an integer number identifying the centering measure: 1 = Normal; 2 = Gamma; 3 = Beta.

asigma

Numeric positive constant. Shape parameter of the gamma prior on the standard deviation of the mixture kernel distr.k.

bsigma

Numeric positive constant. Rate parameter of the gamma prior on the standard deviation of the mixture kernel distr.k.

delta_S

Numeric positive constant. Metropolis-Hastings proposal variation coefficient for sampling sigma.

delta_U

Numeric positive constant. Metropolis-Hastings proposal variation coefficient for sampling the latent U.

Meps

Numeric constant. Relative error of the jump sizes in the continuous component of the process. Smaller values imply larger number of jumps.

Nx

Integer constant. Number of grid points for the evaluation of the density estimate.

Nit

Integer constant. Number of MCMC iterations.

Pbi

Numeric constant. Burn-in period proportion of Nit.

epsilon

Numeric constant. Extension to the evaluation grid range. See details.

printtime

Logical. If TRUE, prints out the execution time.

extras

Logical. If TRUE, gives additional objects: means, weights and Js.

adaptive

Logical. If TRUE, uses an adaptive MCMC strategy to sample the latent U (adaptive delta_U).

Details

This generic function fits a normalized random measure (NRMI) mixture model for density estimation (James et al. 2009). Specifically, the model assumes a normalized generalized gamma (NGG) prior for the locations (means) of the mixture kernel and a parametric prior for the common smoothing parameter sigma, leading to a semiparametric mixture model.

The details of the model are:

X_i|Y_i,\sigma \sim k(\cdot |Y_i,\sigma)

Y_i|P \sim P,\quad i=1,\dots,n

P \sim \textrm{NGG(\texttt{Alpha, Kappa, Gama; P\_0})}

\sigma \sim \textrm{Gamma(asigma, bsigma)}

where X_i's are the observed data, Y_i's are latent (location) variables, sigma is the smoothing parameter, k is a parametric kernel parameterized in terms of mean and standard deviation, (Alpha, Kappa, Gama; P_0) are the parameters of the NGG prior with P_0 being the centering measure whose parameters are assigned vague hyper prior distributions, and (asigma,bsigma) are the hyper-parameters of the gamma prior on the smoothing parameter sigma. In particular: NGG(Alpha, 1, 0; P_0) defines a Dirichlet process; NGG(1, Kappa, 1/2; P_0) defines a Normalized inverse Gaussian process; and NGG(1, 0, Gama; P_0) defines a normalized stable process.

The evaluation grid ranges from min(x) - epsilon to max(x) + epsilon. By default epsilon=sd(x)/4.

Value

The function returns a MixNRMI1 object. It is based on a list with the following components:

xx

Numeric vector. Evaluation grid.

qx

Numeric array. Matrix of dimension \texttt{Nx} \times (\texttt{length(probs)} + 1) with the posterior mean and the desired quantiles input in probs.

cpo

Numeric vector of length(x) with conditional predictive ordinates.

R

Numeric vector of length(Nit*(1-Pbi)) with the number of mixtures components (clusters).

S

Numeric vector of length(Nit*(1-Pbi)) with the values of common standard deviation sigma.

U

Numeric vector of length(Nit*(1-Pbi)) with the values of the latent variable U.

Allocs

List of length(Nit*(1-Pbi)) with the clustering allocations.

means

List of length(Nit*(1-Pbi)) with the cluster means (locations). Only if extras = TRUE.

weights

List of length(Nit*(1-Pbi)) with the mixture weights. Only if extras = TRUE.

Js

List of length(Nit*(1-Pbi)) with the unnormalized weights (jump sizes). Only if extras = TRUE.

Nm

Integer constant. Number of jumps of the continuous component of the unnormalized process.

Nx

Integer constant. Number of grid points for the evaluation of the density estimate.

Nit

Integer constant. Number of MCMC iterations.

Pbi

Numeric constant. Burn-in period proportion of Nit.

procTime

Numeric vector with execution time provided by proc.time function.

distr.k

Integer corresponding to the kernel chosen for the mixture

data

Data used for the fit

NRMI_params

A named list with the parameters of the NRMI process

Warning

The function is computing intensive. Be patient.

Author(s)

Barrios, E., Kon Kam King, G., Lijoi, A., Nieto-Barajas, L.E. and Prüenster, I.

References

1.- Barrios, E., Lijoi, A., Nieto-Barajas, L. E. and Prünster, I. (2013). Modeling with Normalized Random Measure Mixture Models. Statistical Science. Vol. 28, No. 3, 313-334.

2.- James, L.F., Lijoi, A. and Prünster, I. (2009). Posterior analysis for normalized random measure with independent increments. Scand. J. Statist 36, 76-97.

See Also

MixNRMI2, MixNRMI1cens, MixNRMI2cens, multMixNRMI1

Examples


### Example 1
## Not run: 
# Data
data(acidity)
x <- acidity
# Fitting the model under default specifications
out <- MixNRMI1(x)
# Plotting density estimate + 95% credible interval
plot(out)
### Example 2
set.seed(150520)
data(enzyme)
x <- enzyme
Enzyme1.out <- MixNRMI1(x, Alpha = 1, Kappa = 0.007, Gama = 0.5,
                         distr.k = "gamma", distr.p0 = "gamma",
                         asigma = 1, bsigma = 1, Meps=0.005,
                         Nit = 5000, Pbi = 0.2)
attach(Enzyme1.out)
# Plotting density estimate + 95% credible interval
plot(Enzyme1.out)
# Plotting number of clusters
par(mfrow = c(2, 1))
plot(R, type = "l", main = "Trace of R")
hist(R, breaks = min(R - 0.5):max(R + 0.5), probability = TRUE)
# Plotting sigma
par(mfrow = c(2, 1))
plot(S, type = "l", main = "Trace of sigma")
hist(S, nclass = 20, probability = TRUE, main = "Histogram of sigma")
# Plotting u
par(mfrow = c(2, 1))
plot(U, type = "l", main = "Trace of U")
hist(U, nclass = 20, probability = TRUE, main = "Histogram of U")
# Plotting cpo
par(mfrow = c(2, 1))
plot(cpo, main = "Scatter plot of CPO's")
boxplot(cpo, horizontal = TRUE, main = "Boxplot of CPO's")
print(paste("Average log(CPO)=", round(mean(log(cpo)), 4)))
print(paste("Median log(CPO)=", round(median(log(cpo)), 4)))
detach()

## End(Not run)

### Example 3
## Do not run
# set.seed(150520)
# data(galaxy)
# x <- galaxy
#  Galaxy1.out <- MixNRMI1(x, Alpha = 1, Kappa = 0.015, Gama = 0.5,
#                          distr.k = "normal", distr.p0 = "gamma",
#                          asigma = 1, bsigma = 1, delta = 7, Meps=0.005,
#                          Nit = 5000, Pbi = 0.2)

# The output of this run is already loaded in the package
# To show results run the following
# Data
data(galaxy)
x <- galaxy
data(Galaxy1.out)
attach(Galaxy1.out)
# Plotting density estimate + 95% credible interval
plot(Galaxy1.out)
# Plotting number of clusters
par(mfrow = c(2, 1))
plot(R, type = "l", main = "Trace of R")
hist(R, breaks = min(R - 0.5):max(R + 0.5), probability = TRUE)
# Plotting sigma
par(mfrow = c(2, 1))
plot(S, type = "l", main = "Trace of sigma")
hist(S, nclass = 20, probability = TRUE, main = "Histogram of sigma")
# Plotting u
par(mfrow = c(2, 1))
plot(U, type = "l", main = "Trace of U")
hist(U, nclass = 20, probability = TRUE, main = "Histogram of U")
# Plotting cpo
par(mfrow = c(2, 1))
plot(cpo, main = "Scatter plot of CPO's")
boxplot(cpo, horizontal = TRUE, main = "Boxplot of CPO's")
print(paste("Average log(CPO)=", round(mean(log(cpo)), 4)))
print(paste("Median log(CPO)=", round(median(log(cpo)), 4)))
detach()

BNPdensity documentation built on April 1, 2023, 12:10 a.m.