MixNRMI2: Normalized Random Measures Mixture of Type II
In BNPdensity: Ferguson-Klass Type Algorithm for Posterior Normalized Random Measures

MixNRMI2

R Documentation

Description

Bayesian nonparametric density estimation based on mixtures driven by normalized random measures, for both locations and scales.

Usage

MixNRMI2(
  x,
  probs = c(0.025, 0.5, 0.975),
  Alpha = 1,
  Kappa = 0,
  Gama = 0.4,
  distr.k = "normal",
  distr.py0 = "normal",
  distr.pz0 = "gamma",
  mu.pz0 = 3,
  sigma.pz0 = sqrt(10),
  delta_S = 4,
  kappa = 2,
  delta_U = 2,
  Meps = 0.01,
  Nx = 150,
  Nit = 1500,
  Pbi = 0.1,
  epsilon = NULL,
  printtime = TRUE,
  extras = TRUE,
  adaptive = FALSE
)

Arguments

`x`	Numeric vector. Data set to which the density is fitted.
`probs`	Numeric vector. Desired quantiles of the density estimates.
`Alpha`	Numeric constant. Total mass of the centering measure. See details.
`Kappa`	Numeric positive constant. See details.
`Gama`	Numeric constant, with 0 ≤ Gama ≤ 1. See details.
`distr.k`	The distribution name for the kernel. Allowed names are "normal", "gamma", "beta", "double exponential", "lognormal" or their common abbreviations "norm", "exp", or an integer number identifying the mixture kernel: 1 = Normal; 2 = Gamma; 3 = Beta; 4 = Double Exponential; 5 = Lognormal.
`distr.py0`	The distribution name for the centering measure for locations. Allowed names are "normal", "gamma", "beta", or their common abbreviations "norm", "exp", or an integer number identifying the centering measure for locations: 1 = Normal; 2 = Gamma; 3 = Beta.
`distr.pz0`	The distribution name for the centering measure for scales. Allowed names are "gamma", or an integer number identifying the centering measure for scales: 2 = Gamma. For more options use `MixNRMI2cens`.
`mu.pz0`	Numeric constant. Prior mean of the centering measure for scales.
`sigma.pz0`	Numeric constant. Prior standard deviation of the centering measure for scales.
`delta_S`	Numeric positive constant. Metropolis-Hastings proposal variation coefficient for sampling the scales.
`kappa`	Numeric positive constant. Metropolis-Hastings proposal variation coefficient for sampling the location parameters.
`delta_U`	Numeric positive constant. Metropolis-Hastings proposal variation coefficient for sampling the latent U. If `adaptive = TRUE`, `delta_U` is the starting value for the adaptation.
`Meps`	Numeric constant. Relative error of the jump sizes in the continuous component of the process. Smaller values imply a larger number of jumps.
`Nx`	Integer constant. Number of grid points for the evaluation of the density estimate.
`Nit`	Integer constant. Number of MCMC iterations.
`Pbi`	Numeric constant. Burn-in period proportion of `Nit`.
`epsilon`	Numeric constant. Extension to the evaluation grid range. See details.
`printtime`	Logical. If TRUE, prints out the execution time.
`extras`	Logical. If TRUE, gives additional objects: means, sigmas, weights and Js.
`adaptive`	Logical. If TRUE, uses an adaptive MCMC strategy to sample the latent U (adaptive delta_U).
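As a rough illustration of what the defaults `mu.pz0 = 3` and `sigma.pz0 = sqrt(10)` imply for a gamma centering measure on the scales, the sketch below converts a mean and a standard deviation into gamma shape and rate parameters by moment matching. The helper `gamma_from_moments` is hypothetical and not part of BNPdensity, and the mapping itself is an assumption about the parameterization rather than a documented detail of the package.

```r
# Hypothetical helper (not part of BNPdensity): convert a mean and a standard
# deviation into gamma shape/rate parameters by moment matching.
gamma_from_moments <- function(mu, sigma) {
  stopifnot(mu > 0, sigma > 0)
  shape <- mu^2 / sigma^2   # mean^2 / variance
  rate  <- mu / sigma^2     # mean / variance
  c(shape = shape, rate = rate)
}

# Defaults of MixNRMI2: mu.pz0 = 3, sigma.pz0 = sqrt(10)
pars <- gamma_from_moments(mu = 3, sigma = sqrt(10))

# Recover the moments implied by the shape/rate pair as a sanity check
implied_mean <- unname(pars["shape"] / pars["rate"])
implied_sd   <- unname(sqrt(pars["shape"]) / pars["rate"])
```

Under this assumption the default prior on the scales has a shape slightly below 1, which places considerable mass near zero while keeping a long right tail.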

Details

This generic function fits a normalized random measure (NRMI) mixture model for density estimation (James et al. 2009). Specifically, the model assumes a normalized generalized gamma (NGG) prior for both the locations (means) and the standard deviations of the mixture kernel, leading to a fully nonparametric mixture model.

The details of the model are:

X_i|Y_i,Z_i \sim k(\cdot|Y_i,Z_i)

(Y_i,Z_i)|P \sim P, i=1,\dots,n

P \sim \textrm{NGG}(\texttt{Alpha, Kappa, Gama; P\_0})

where the X_i's are the observed data, the (Y_i,Z_i)'s are bivariate latent (location and scale) vectors, k is a parametric kernel parameterized in terms of mean and standard deviation, and (Alpha, Kappa, Gama; P_0) are the parameters of the NGG prior. The centering measure P_0 is bivariate with independent components, that is, P_0(Y,Z) = P_0(Y)*P_0(Z). The parameters of P_0(Y) are assigned vague hyperprior distributions, and (mu.pz0, sigma.pz0) are the hyperparameters of P_0(Z). In particular, NGG(Alpha, 1, 0; P_0) defines a Dirichlet process; NGG(1, Kappa, 1/2; P_0) defines a normalized inverse Gaussian process; and NGG(1, 0, Gama; P_0) defines a normalized stable process.

The evaluation grid ranges from min(x) - epsilon to max(x) + epsilon. By default, epsilon = sd(x)/4.
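The evaluation grid described above can be sketched in base R as follows. The data are synthetic and only serve as a stand-in for `x`; equal spacing of the `Nx` grid points is an assumption, since the text only states the range and the number of points.

```r
# Synthetic data standing in for the observed sample x (illustration only)
set.seed(1)
x <- rnorm(50, mean = 5, sd = 2)

Nx <- 150                 # default number of grid points
epsilon <- sd(x) / 4      # default grid extension when epsilon = NULL

# Assumption: the Nx grid points are equally spaced over the extended range
xx <- seq(min(x) - epsilon, max(x) + epsilon, length.out = Nx)

range(xx)
```

Passing a larger `epsilon` widens the range on both sides, which can help when the estimated density still has visible mass at the edges of the plot.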

Value

The function returns a list with the following components:

`xx`	Numeric vector. Evaluation grid.
`qx`	Numeric array. Matrix of dimension `Nx` × (`length(probs)` + 1) with the posterior mean and the desired quantiles given in `probs`.
`cpo`	Numeric vector of `length(x)` with conditional predictive ordinates.
`R`	Numeric vector of length `Nit*(1-Pbi)` with the number of mixture components (clusters).
`U`	Numeric vector of length `Nit*(1-Pbi)` with the values of the latent variable U.
`Allocs`	List of length `Nit*(1-Pbi)` with the clustering allocations.
`means`	List of length `Nit*(1-Pbi)` with the cluster means (locations). Only if `extras = TRUE`.
`sigmas`	Numeric vector of length `Nit*(1-Pbi)` with the cluster standard deviations. Only if `extras = TRUE`.
`weights`	List of length `Nit*(1-Pbi)` with the mixture weights. Only if `extras = TRUE`.
`Js`	List of length `Nit*(1-Pbi)` with the unnormalized weights (jump sizes). Only if `extras = TRUE`.
`Nm`	Integer constant. Number of jumps of the continuous component of the unnormalized process.
`delta_Us`	List of length `Nit*(1-Pbi)` with the sequence of adapted delta_U used in the MH step for the latent variable U.
`Nx`	Integer constant. Number of grid points for the evaluation of the density estimate.
`Nit`	Integer constant. Number of MCMC iterations.
`Pbi`	Numeric constant. Burn-in period proportion of `Nit`.
`procTime`	Numeric vector with execution time provided by `proc.time` function.
`distr.k`	Integer corresponding to the kernel chosen for the mixture.
`data`	Data used for the fit.
`NRMI_params`	A named list with the parameters of the NRMI process.
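To get a feel for the sizes of the returned components, the following sketch works out the number of retained draws and the dimensions of `qx` under the default settings. Rounding `Nit * (1 - Pbi)` to an integer is an assumption made here for illustration; the variable names `n_kept` and `qx_dim` are not part of the package.

```r
# Default settings of MixNRMI2
Nit   <- 1500
Pbi   <- 0.1
Nx    <- 150
probs <- c(0.025, 0.5, 0.975)

# Assumption: retained draws = Nit * (1 - Pbi), rounded to an integer
n_kept <- as.integer(round(Nit * (1 - Pbi)))

# qx has one column for the posterior mean plus one per requested quantile
qx_dim <- c(rows = Nx, cols = length(probs) + 1)
```

With these defaults, components such as `R`, `U` and `Allocs` would hold 1350 entries, and `qx` would be a 150 by 4 matrix.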

Warning

The function is computationally intensive. Be patient.

Author(s)

Barrios, E., Kon Kam King, G., Lijoi, A., Nieto-Barajas, L.E. and Prünster, I.

References

1.- Barrios, E., Lijoi, A., Nieto-Barajas, L. E. and Prünster, I. (2013). Modeling with Normalized Random Measure Mixture Models. Statistical Science. Vol. 28, No. 3, 313-334.

2.- James, L.F., Lijoi, A. and Prünster, I. (2009). Posterior analysis for normalized random measure with independent increments. Scand. J. Statist 36, 76-97.

3.- Arbel, J., Kon Kam King, G., Lijoi, A., Nieto-Barajas, L.E. and Prünster, I. (2021). BNPdensity: a package for Bayesian Nonparametric density estimation using Normalised Random Measures with Independent Increments. Australian and New Zealand Journal of Statistics, to appear.

Examples

## Not run: 
### Example 1
# Data
data(acidity)
x <- acidity
# Fitting the model under default specifications
out <- MixNRMI2(x)
# Plotting density estimate + 95% credible interval
plot(out)

## End(Not run)

### Example 2
## Do not run
# set.seed(150520)
# data(enzyme)
# x <- enzyme
#  Enzyme2.out <- MixNRMI2(x, Alpha = 1, Kappa = 0.007, Gama = 0.5,
#                          distr.k = "gamma", distr.py0 = "gamma",
#                          distr.pz0 = "gamma", mu.pz0 = 1, sigma.pz0 = 1, Meps=0.005,
#                          Nit = 5000, Pbi = 0.2)
# The output of this run is already loaded in the package
# To show results run the following
# Data
data(enzyme)
x <- enzyme
data(Enzyme2.out)
attach(Enzyme2.out)
# Plotting density estimate + 95% credible interval
plot(Enzyme2.out)
# Plotting number of clusters
par(mfrow = c(2, 1))
plot(R, type = "l", main = "Trace of R")
hist(R, breaks = min(R - 0.5):max(R + 0.5), probability = TRUE)
# Plotting u
par(mfrow = c(2, 1))
plot(U, type = "l", main = "Trace of U")
hist(U, nclass = 20, probability = TRUE, main = "Histogram of U")
# Plotting cpo
par(mfrow = c(2, 1))
plot(cpo, main = "Scatter plot of CPO's")
boxplot(cpo, horizontal = TRUE, main = "Boxplot of CPO's")
print(paste("Average log(CPO)=", round(mean(log(cpo)), 4)))
print(paste("Median log(CPO)=", round(median(log(cpo)), 4)))
detach()

### Example 3
## Do not run
# set.seed(150520)
# data(galaxy)
# x <- galaxy
#  Galaxy2.out <- MixNRMI2(x, Alpha = 1, Kappa = 0.015, Gama = 0.5,
#                          distr.k = "normal", distr.py0 = "gamma",
#                          distr.pz0 = "gamma", mu.pz0 = 1, sigma.pz0 = 1,  Meps=0.005,
#                          Nit = 5000, Pbi = 0.2)
# The output of this run is already loaded in the package
# To show results run the following
# Data
data(galaxy)
x <- galaxy
data(Galaxy2.out)
attach(Galaxy2.out)
# Plotting density estimate + 95% credible interval
plot(Galaxy2.out)
# Plotting number of clusters
par(mfrow = c(2, 1))
plot(R, type = "l", main = "Trace of R")
hist(R, breaks = min(R - 0.5):max(R + 0.5), probability = TRUE)
# Plotting u
par(mfrow = c(2, 1))
plot(U, type = "l", main = "Trace of U")
hist(U, nclass = 20, probability = TRUE, main = "Histogram of U")
# Plotting cpo
par(mfrow = c(2, 1))
plot(cpo, main = "Scatter plot of CPO's")
boxplot(cpo, horizontal = TRUE, main = "Boxplot of CPO's")
print(paste("Average log(CPO)=", round(mean(log(cpo)), 4)))
print(paste("Median log(CPO)=", round(median(log(cpo)), 4)))
detach()
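
The examples above summarise the conditional predictive ordinates through the mean and median of log(CPO). A closely related summary is the log pseudo-marginal likelihood (LPML), the sum of the log(CPO) values, which is often used to compare competing fits to the same data. The helper `lpml` below is hypothetical and not part of BNPdensity, and the CPO values are made up purely to show the arithmetic.

```r
# Hypothetical helper (not part of BNPdensity): log pseudo-marginal likelihood
lpml <- function(cpo) {
  stopifnot(is.numeric(cpo), all(cpo > 0))
  sum(log(cpo))
}

# Made-up CPO values, for illustration only; in practice use out$cpo
cpo_example <- c(0.12, 0.30, 0.05, 0.22, 0.18)

lpml_value   <- lpml(cpo_example)
mean_log_cpo <- mean(log(cpo_example))
```

Since the LPML equals `length(cpo)` times the average log(CPO), both summaries rank models identically when the fits use the same data; larger values indicate better predictive performance.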

BNPdensity documentation built on Aug. 8, 2025, 7:20 p.m.