duos: duos (Distribution of Uniform Order Statistics)

Description Usage Arguments Details Value Examples

Description

Runs Gibbs sampling algorithm to perform Bayesian density estimation through duos.

Usage

1
duos(y, k = ceiling(n/50) + 3, N = 20000, alpha = 1, scale_l = 0.00001, scale_u = 0.00001, start = NA)

Arguments

y

A numeric vector. duos estimates the density on this data.

k

The number of cut-points for the density estimate. There is a default based on the size of y (see details).

N

The number of iterations to run in the algorithm. The default is 20,000.

alpha

The parameter values for the Dirichlet prior. The parameter is constant and set to 1 by default.

scale_l

A value >= 0 controlling the scaling based on the minimum data value. The default is 0.00001 (see details).

scale_u

A value >= 0 controlling the scaling based on the maximum data value. The default is 0.00001 (see details).

start

A vector of starting values of length k. This is useful for creating output to use in Gelman and Rubin's convergence diagnostics (see examples). If NA, reasonable starting values are selected by the algorithm.

Details

The density being estimated takes the form below:

f(x) =

(π_1) / (γ_1) , 0 ≤ x < γ_1

(π_2) / (γ_2-γ_1) , γ_1 ≤ x < γ_2

(π_3) / (γ_3-γ_2) , γ_2 ≤ x < γ_3

... , ... ≤ x < ...

(π_k) / (γ_k-γ_{k-1}) , γ_{k-1} ≤ x < γ_k

(π_{k+1}) / (1-γ_k) , γ_k ≤ x < 1

where γ_1 < γ_2 < ... < γ_k is in (0,1) and π_1 + π_2 + ... + π_{k+1} = 1

The prior on the γ's is: draw k values from $unif(0,1)$ and order: $0 < \gamma_1 < \gamma_2 < ... < \gamma_k-1 < \gamma_k < 1$. The prior on the π's is Dirichlet(α).

This density operates on data between 0 and 1. Thus, if the input is not between 0 and 1, it is standardized. The formula for scaling is below:

(y-(min(y)-scale_l))/(max(y)+scale_u-(min(y)-scale_l))

.

Values of 0 for the scale parameters indicates the density will only be the edges of the data in y. The default for scale_l and scale_u is 0.0001.

The recommended number of cut-points starts at 3, and then for each increment of 50, adds a cut-point.

Default: k = floor(n/50)+3

Value

duos returns a list of density estimate results.

C

A matrix containing the posterior draws for the cut-point parameters. The number of columns is k, and the number of rows is N.

P

A matrix containing the posterior draws for the bin proportion parameters. The number of columns is k+1, and the number of rows is N.

y

A vector containing the data introduced to duos for density estimation.

k

The number of cut-points.

alpha

The parameter for the prior on the bin proportions.

scale_l

The scaling parameter for the lower end of the data.

scale_u

The scaling parameter for the upper end of the data.

Examples

  1
  2
  3
  4
  5
  6
  7
  8
  9
 10
 11
 12
 13
 14
 15
 16
 17
 18
 19
 20
 21
 22
 23
 24
 25
 26
 27
 28
 29
 30
 31
 32
 33
 34
 35
 36
 37
 38
 39
 40
 41
 42
 43
 44
 45
 46
 47
 48
 49
 50
 51
 52
 53
 54
 55
 56
 57
 58
 59
 60
 61
 62
 63
 64
 65
 66
 67
 68
 69
 70
 71
 72
 73
 74
 75
 76
 77
 78
 79
 80
 81
 82
 83
 84
 85
 86
 87
 88
 89
 90
 91
 92
 93
 94
 95
 96
 97
 98
 99
100
101
102
103
104
105
106
## --------------------------------------------------------------------------------
## Beta Distribution
## --------------------------------------------------------------------------------

# Run 'duos' on data sampled from a Beta(5, 1) distribution with 100 data points.
# All defaults are used.
y <- rbeta(100, 5, 1)
duos_beta <- duos(y)

#Examine estimate of PDF
duos_plot(duos_beta, type = "pdf", data = TRUE)

#Examine estimate of CDF with credible intervals
duos_plot(duos_beta, type = "cdf")

#Find probability of being less than 0.1
duos_cdf(c(.1), duos_beta)$cdf

## --------------------------------------------------------------------------------
## Claw Distribution
## --------------------------------------------------------------------------------
# Run 'duos' on data sampled from a 'Claw' distribution with 200 data points.
u <- runif(200)

# Variable to store the samples from the mixture distribution 
y <- rep(NA, 200)
# Sampling from the mixture
for(i in 1:200){
if(u[i] < 0.5){
     y[i] <- rnorm(1, 0, 1)
     }else if(u[i] < 0.6){
     y[i] <- rnorm(1, -1, 0.1)
     }else if (u[i] < 0.7){
     y[i] <- rnorm(1, -0.5, 0.1)
     }else if (u[i] < 0.8){
     y[i] <- rnorm(1, 0, 0.1)
     }else if (u[i] <- 0.9){
     y[i] <- rnorm(1, 0.5, 0.1)
     }else{
     y[i] <- rnorm(1, 1, 0.1)
     }
     }
     
# Choose 12 cutpoints
duos_claw <- duos(y = y, k = 12, N = 20000)

# Examine estimate of PDF
duos_plot(duos_claw, type = "pdf", data = TRUE)

# Examine estimate of CDF
duos_plot(duos_claw, type = "cdf")

# Find probability of being greater than 90
1-duos_cdf(c(0.90), duos_claw)$cdf

## --------------------------------------------------------------------------------
## Normal Distribution
## --------------------------------------------------------------------------------

# Run 'duos' on data sampled from a Normal(0, 1) distribution with 50 data points.
y <- rnorm(50, 0, 1)
# Run for 15,000 iterations and change the 'scale' values to 1.5*sd(y)
duos_norm <- duos(y = y, scale_l = 1.5*sd(y), scale_u = 1.5*sd(y), N = 15000)

# Examine estimate of PDF
duos_plot(duos_norm, type = "pdf", data = TRUE)

# Examine estimate of CDF
duos_plot(duos_norm, type = "cdf", cri = TRUE)

# Find probability of being greater than 2.5
1-duos_cdf(c(2.5), duos_norm)$cdf

## --------------------------------------------------------------------------------
## Uniform Distribution
## --------------------------------------------------------------------------------

# Below is an example of using the 'start' variable to calculate the Gelman and Rubin diagnostic

# First run 'duos' on data sampled from a Uniform(0, 1) distribution with 50 data points.
y <- runif(50)

# Use 4 cutpoints
# Create three runs with diverse starting values
duos_unif1 <- duos(y = y, k = 4, N = 20000, start = runif(4, 0, 1/3))
duos_unif2 <- duos(y = y, k = 4, N = 20000, start = runif(4, 1/3, 2/3))
duos_unif3 <- duos(y = y, k = 4, N = 20000, start = runif(4, 2/3, 1))

# Load package coda
library(coda)

# Turn matrices of cut-points into mcmc objects
C1 <- mcmc(duos_unif1$C)
C2 <- mcmc(duos_unif2$C)
C3 <- mcmc(duos_unif3$C)

# Turn into mcmc list
all_chains <- mcmc.list(C1, C2, C3)

#Get Gelman and Rubin diagnostic
gelman.diag(all_chains)[[1]][,1]

#Examine estimate of PDF's from each run
duos_plot(duos_unif1, type="pdf", data=TRUE)
duos_plot(duos_unif2, type="pdf", data=TRUE)
duos_plot(duos_unif3, type="pdf", data=TRUE)

reykp/biRd documentation built on May 17, 2019, 8:16 p.m.

Related to duos in reykp/biRd...