gold: Bayesian Density Estimation Using 'gold' (Gaussian process On...

Description Usage Arguments Details Value References Examples

Description

Estimates a density from Bayesian techniques using a Gaussian process on a log density.

Usage

1
gold(y, s1, c1, s2, c2, N = 20000, graves = FALSE, scale_l = 0.00001, scale_u = 0.00001, poi = NA)

Arguments

y

A numeric vector. gold estimates the density on this data.

s1

The standard deviation of the Gaussian prior.

c1

The correlation parameter of the Gaussian prior that controls the correlation in the covariance structure and smoothness in the density estimate.

s2

The standard deviation of the Gaussian proposal distribution.

c2

The correlation parameter that controls the covariance structure of the Gaussian proposal distribution.

N

The number of iterations to run in the algorithm. The default is 20,000.

graves

An option to have the standard deviation of the proposal distribution (s1) chosen using an automatic step size selection (See Graves (2011)) (DEFAULT is FALSE).

scale_l

A value >= 0 controlling the scaling based on the minimum data value. The default is 0.00001 (see details).

scale_u

value >= 0 controlling the scaling based on the maximum data value. The default is 0.00001 (see details).

poi

Points of interest at which to estimate the density. gold_pdf will produce estimates and credible intervals of the density at any point, but mat returned by gold_pdf and gold_cdf only return the posterior draws if the density is estimated at that point.

Details

The density being estimated takes the form below:

f(x) =

exp(g(x)) / (\int_{0}^{1} exp(g(u)) du)

where g(x) is an unknown log density.

Given that g(x) is unknown, the integral in the normalizing constant is estimated using a weighted average, and the set of unknown parameters that receive a prior is g(x) at a finite set of points. This finite set of points includes each data point, as well as a grid to ensure good coverage.

This density operates on data between 0 and 1. Thus, if the input is not between 0 and 1, it is standardized. The formula for scaling is below:

(y-(min(y)-scale_l))/(max(y)+scale_u-(min(y)-scale_l))

.

Values of 0 for the scale parameters indicates the density will only be the edges of the data in y. The default for scale_l and scale_u is 0.00001 since some densities cannot be estimated at 0 and 1.

Value

gold returns a list containing the density estimate results.

G

A matrix with the posterior draws for g(x) at a finite number of points. The number of rows is the number of iterations. The number of columns is the number of data points plus the grid points.

widths

A vector with the widths around each x used for g(x). These are the widths used in the weighted average to estimate the integral in the normalizing constant.

x

A vector with the points at which g(x) was estimated.

y

A vector containing the data introduced to gold for density estimation.

ar

The acceptance rate from the random walk proposals.

prior

A vector of all prior and proposal parameter values.

poi

The points of interest if poi is non-missing.

References

Graves, T. (2011). Automatic Step Size Selection in Random Walk Metropolis Algorithms.

Examples

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
## --------------------------------------------------------------------------------
## Beta Distribution
## --------------------------------------------------------------------------------

# First run 'gold' on data sampled from a Beta(5, 1) distribution with 200 data points.
y <- rbeta(200, 5, 1)
# Specify all prior parameters and tune s2 until a reasonable acceptance rate is achieved
gold_beta <- gold(y = y, s1 = 1, c1 = 2, s2 = 0.2, c2 = 0.5, N = 20000)

# Check trace plots
gold_mcmcplots(gold_beta)

# Examine the estimate of the PDF
gold_plot(gold_beta, data = TRUE)

# Examine the estimate of the CDF with the empirical CDF overlaied
gold_plot(gold_beta, type = "cdf", data = TRUE)

## --------------------------------------------------------------------------------
## Normal Distribution
## --------------------------------------------------------------------------------

# First run 'gold' on data sampled from a Normal(0, 1) distribution with 100 data points.
y <- rnorm(100, 0, 1)
# Specify all prior parameters except s2. Use graves instead to choose s2 to achieve a reasonable acceptance rate
# Set the upper and lower scaling parameters to the standard deviation of the data
gold_norm <- gold(y = y, s1 = 1, c1 = 1, c2 = 0.5, N = 20000, graves = TRUE, scale_l = sd(y), scale_u = sd(y))

# Check what s2 was chosen
gold_norm$prior

# Check acf plots
gold_mcmcplots(gold_norm, type = "acf")

# Examine the estimate of the PDF with credible intervals
gold_plot(gold_norm, cri = TRUE)

# Examine the estimate of the CDF using an interactive graph
gold_plot(gold_norm, type = "cdf", interact = TRUE)

## --------------------------------------------------------------------------------
## Bimodal Distribution
## --------------------------------------------------------------------------------

# Sample 400 random uniforms
u  <- runif(400)
y <- rep(NA,400)
# Sampling from the mixture
for(i in 1:400){
  if(u[i] < 0.3){
   y[i]  <-  rnorm(1, 0, 1)
  }else {
   y[i] <- rnorm(1, 4, 1)
  }
}

# First run 'gold' on data sampled from a bimodal distribution with 200 data points.
# Specify several points of interest
gold_bimodal <- gold(y = y, s1 = 1, c1 = 15, c2 = 7, graves = TRUE, poi = c(0, 6.5))

# Check the running mean plots
gold_mcmcplots(gold_bimodal, type = "rm")

# Plot the PDF with the data and credible intervals
gold_plot(gold_bimodal, data = TRUE, cri = TRUE)

# Plot the CDF
gold_plot(gold_bimodal, type = "cdf")

# Plot histograms at estimates of the CDF at the 'poi'
bimodal_cdf <- gold_cdf(x = c(0, 6.5), gold_bimodal)
bimodal_cdf$cdf
# x = 0
hist(bimodal_cdf$mat[,1])
# x = 6.5
hist(bimodal_cdf$mat[,2])

reykp/BEDr documentation built on May 28, 2019, 8:40 a.m.

Related to gold in reykp/BEDr...