knitr::opts_chunk$set(echo = TRUE)

Overview

The primary objective of the R package is the implementation of Probit Regression for Binary Responses via data augmentation and Gibbs sampling, and its extension to modelling ordered multinomial responses, as discussed in "Bayesian Analysis of Binary and Polychotomous Response Data" by James H. Albert and Siddhartha Chib (Journal of the American Statistical Association).

Probit regression is used to model binary responses with both continuous and discrete covariates. In the data augmentation technique, given the binary responses we generate values from an underlying latent normal distribution, and prior structures are assumed on the model parameters. The posterior distributions of the model parameters are then derived, and parameter estimation is carried out by a Gibbs sampler algorithm using the full conditional distributions. The technique extends to ordered multinomial responses as well, where the cut-off points on the latent variable axis also become model parameters and are updated at each iteration of the Gibbs sampler. These two Gibbs sampler algorithms are put together in an R package, along with diagnostic plots, i.e., trace plots to study convergence and density plots of the posterior distributions. Flexibility in the choice of prior specification, number of iterations, burn-in period, etc., is maintained.

In the biomedical sciences, finance, astronomy, etc., researchers and practitioners often face the problem of classifying objects of interest into two or more ordinal classes. In wireless sensor networks it is often more energy efficient to observe ordinal categorical or semi-continuous responses than to record continuous responses, and the data augmentation technique is often used to estimate the model parameters. The R package is envisioned to serve the purpose of modelling in such scenarios.

The functionalities of the package are discussed, along with simulated examples, in the following two sections: A. Bayesian Probit Regression via Data Augmentation and Gibbs Sampler Algorithm. B. Bayesian Ordered Multinomial Regression via Data Augmentation and Gibbs Sampler Algorithm.

A. Bayesian Probit Regression via Data Augmentation and Gibbs Sampler Algorithm

Suppose $N$ independent binary random variables $Y_1, Y_2, ..., Y_N$ are observed, where $Y_i$ is distributed as Bernoulli($p_i$). The $p_i$'s are related to a set of covariates that may be discrete or continuous. Define the binary regression model as $p_i = H(x_i^T\beta)$ for $i = 1,2,...,N$, where $\beta$ is a vector of $k$ unknown parameters, $x_i$ is a vector of known covariates and $H$ is a known cdf linking the probabilities $p_i$ with the linear structure $x_i^T\beta$.

Let $H = \Phi$, the standard normal cdf, leading to the probit model. Introduce $N$ independent latent variables $Z_1, Z_2, ..., Z_N$, where $Z_i$ follows $N(x_i^T\beta, 1)$, and define $Y_i = 1$ if $Z_i > 0$ and $Y_i = 0$ otherwise. Then the $Y_i$'s are independent Bernoulli random variables with $p_i = \Phi(x_i^T\beta)$. Define the design matrix $X = (x_1, x_2, ..., x_N)^T$.
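As a quick illustration of this latent-variable representation, here is a minimal sketch with made-up parameter values (separate from the simulated data used later in this vignette):

# Illustrative sketch only: generating binary responses through the latent variable Z
set.seed(1)
beta_demo <- c(-0.5, 1)                                       # hypothetical parameter values
X_demo    <- cbind(1, rnorm(200))                             # intercept and one covariate
Z_demo    <- rnorm(200, mean = X_demo %*% beta_demo, sd = 1)  # Z_i ~ N(x_i' beta, 1)
Y_demo    <- as.integer(Z_demo > 0)                           # Y_i = 1 iff Z_i > 0, so P(Y_i = 1) = pnorm(x_i' beta)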

The joint posterior density of unobservables $\beta$ and $Z = (Z_1 , Z_2, ..., Z_N)$ given the data $y = (y_1 , y_2, ..., y_N)$ is given by

$$ \pi(\beta,Z \mid y) = C\,\pi(\beta)\prod_{i=1}^{N}\{1(Z_i > 0)\,1(y_i = 1) + 1(Z_i \le 0)\,1(y_i = 0)\}\,\phi(Z_i; x_i^T\beta, 1) $$ If the prior distribution of $\beta$ is diffuse, then $\beta \mid y, Z$ is distributed as $N_k(\hat\beta_Z, (X^TX)^{-1})$ where $\hat\beta_Z = (X^TX)^{-1}X^TZ$. The random variables $Z_1, Z_2, ..., Z_N$ are independent, with $Z_i \mid y, \beta$ distributed as $N(x_i^T\beta, 1)$ truncated at the left by 0 if $y_i = 1$ and as $N(x_i^T\beta, 1)$ truncated at the right by 0 if $y_i = 0$.

If $\beta$ is assigned the proper conjugate prior $N_k(\beta^*, B^*)$, then $\beta \mid y, Z$ is distributed as $N_k(\tilde\beta_Z, \tilde B)$ where $\tilde\beta_Z = (B^{*-1} + X^TX)^{-1}(B^{*-1}\beta^* + X^TZ)$ and $\tilde B = (B^{*-1} + X^TX)^{-1}$. The random variables $Z_1, Z_2, ..., Z_N$ are independent, with $Z_i \mid y, \beta$ distributed as $N(x_i^T\beta, 1)$ truncated at the left by 0 if $y_i = 1$ and at the right by 0 if $y_i = 0$.
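To make the sampler concrete, here is a minimal sketch (not the package's internal code) of a single Gibbs iteration under the conjugate $N_k(\beta^*, B^*)$ prior, using rtruncnorm from the truncnorm package and mvrnorm from MASS:

# Minimal sketch of one Gibbs iteration for the binary probit model (illustrative only;
# the package's BinaryGibbs_fit implements the full sampler)
library(truncnorm)  # rtruncnorm()
library(MASS)       # mvrnorm()

probit_gibbs_step <- function(beta, X, y, prior_mean, prior_prec) {
  eta <- as.vector(X %*% beta)
  # Z_i | y, beta ~ N(x_i' beta, 1), truncated at the left by 0 if y_i = 1 and at the right by 0 if y_i = 0
  Z <- ifelse(y == 1,
              rtruncnorm(length(y), a = 0,    b = Inf, mean = eta, sd = 1),
              rtruncnorm(length(y), a = -Inf, b = 0,   mean = eta, sd = 1))
  # beta | y, Z ~ N_k(beta_tilde, B_tilde)
  B_tilde    <- solve(prior_prec + crossprod(X))
  beta_tilde <- B_tilde %*% (prior_prec %*% prior_mean + crossprod(X, Z))
  mvrnorm(1, mu = as.vector(beta_tilde), Sigma = B_tilde)
}

A full run would loop this step nIter times, storing the successive beta draws and discarding the first burn_in of them when forming posterior summaries.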

A.1 Data Generation & Fitting Binary Probit Regression using Gibbs Sampler Algorithm

First we create a simulated data set of n observations, each with a binary response (0 or 1) and corresponding values on p covariates, to demonstrate the utility of the package functionalities. (Please note that the data generated in this section will be used throughout this article.) Then we use the "BinaryGibbs_fit" function to implement Probit Regression for Binary Responses via data augmentation and Gibbs sampling.

# Generating Simulated Data 
set.seed(250)
require(PolyGibbs)
require(truncnorm)
require(MASS)
N <- 500
x1 <- seq(-1, 1, length.out = N)
x2 <- rnorm(N, 0, 1)
D <- 3
X <- matrix(c(rep(1, N), x1, x2), ncol = D)
true_theta <- c(- 0.5, 3.3, 2)
p <- pnorm(X %*% true_theta)
y <- rbinom(N, 1, p)
# Splitting The Data in Train and Test in 80:20 ratio
Train_ID = sample(1:nrow(X), round(nrow(X) * 0.8), replace = FALSE) # Train Data IDs
Train_X = X[Train_ID, -1] # Train Data Covariates
Test_X = X[-Train_ID, -1] # Test Data Covariates
Train_Y = y[Train_ID] # Train Data Response
Test_Y = y[-Train_ID] # Test Data Response
nIter = 10000
burn_in = round(nIter * 0.5)
prior = 2
prior_mean = rep(1, 3)
prior_var = diag(10, 3)
# Fitting Bayesian Probit Regression
# Unnecessary print messages are suppressed to keep the vignette tidy
invisible(capture.output(Result <- BinaryGibbs_fit(Train_X, Train_Y, nIter, prior, burn_in, prior_mean, prior_var )))
# Display the tail of the outputs
list(estimates = Result$estimates,
     Train_Accuracy_tail = tail(Result$Train_Accuracy, n = 10),
     beta_matrix_tail = tail(Result$beta_matrix, n = 10))
# The actual function outputs are estimates, Train_Accuracy, beta_matrix.

The model can be fitted using other choices of prior specification on the regression parameters by changing the function inputs.

A.2 Prediction via Fitted Bayesian Probit Model on Test Set

Once we have fitted the model, we need to test its predictive power on a test set. The function "BinaryGibbs_Pred" is used for prediction on the test set using the fitted Bayesian Probit Model.

# Storing the outputs of the model fitting
estimates = Result$estimates
# Prediction using the fitted model
BinaryGibbs_Pred(estimates, Test_X)
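For intuition, prediction from a fitted probit model amounts to evaluating $\hat p = \Phi(x^T\hat\beta)$ and classifying by thresholding at 0.5. A minimal sketch of that calculation follows; it assumes estimates stores the posterior means of the coefficients with the intercept first, which is an assumption for illustration rather than a documented feature of the package.

# Illustrative sketch only (assumes `estimates` holds posterior means with intercept first)
p_hat <- pnorm(cbind(1, Test_X) %*% estimates)  # estimated success probabilities
y_hat <- as.integer(p_hat > 0.5)                # classify by thresholding at 0.5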

A.3 Calculation of Test Set Accuracy for Bayesian Probit Regression Model

Once we have fitted the Bayesian Probit Regression model and obtained its predictions on a test set, we want to calculate the test set accuracy by comparing the actual and predicted levels of the responses. This lets us comment on the efficacy of the fitted model for prediction. BinaryGibbs_Test_Accuracy calculates the accuracy of prediction on the test set for Bayesian Probit Regression.

# Storing the outputs of the model fitting and predictions
estimates = Result$estimates
Predicted_Y = BinaryGibbs_Pred(estimates, Test_X)
# Finding the accuracy of prediction on the test set
BinaryGibbs_Test_Accuracy(Predicted_Y, Test_Y)
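In essence, the reported accuracy is the proportion of test observations whose predicted label matches the observed one; assuming Predicted_Y is returned as a 0/1 vector (an assumption for illustration), the computation reduces to:

# Proportion of correctly classified test observations (illustrative sketch)
mean(Predicted_Y == Test_Y)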

A.4 Analysis of Convergence of the chain for Bayesian Probit Regression via Traceplots

To study the convergence of the Gibbs sampler algorithm we draw trace plots of the parameter estimates, which plot the successive draws of each parameter over the iterations of the chain. We should expect the chain to more or less stabilize around the estimated posterior mean as we progress through the iterations. BinaryGibbs_Traceplot generates plots for the diagnosis of parameter estimates obtained by Probit Regression for Binary Responses via data augmentation and Gibbs sampling.

# Plotting the traceplot for Beta_0
BinaryGibbs_Traceplot(beta_matrix = Result$beta_matrix, k = 0)

The plots can be generated for other regression parameters by changing the function inputs.
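For reference, a trace plot can also be drawn directly from the stored draws with base R; a minimal sketch follows, assuming the first column of beta_matrix holds the draws of Beta_0 (an assumption about the column ordering, not documented here):

# Base-R sketch of a trace plot (assumes column 1 of beta_matrix stores the Beta_0 draws)
plot(Result$beta_matrix[, 1], type = "l",
     xlab = "Iteration", ylab = expression(beta[0]))
abline(h = mean(Result$beta_matrix[-(1:burn_in), 1]), col = "red")  # post-burn-in mean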

A.5 Plotting Posterior Distributions of the Estimated Model Parameters for Bayesian Probit Regression

Studying the posterior distribution of the estimated model parameters is necessary to observe the dispersion of the drawn values around the estimated posterior mean. BinaryGibbs_PosteriorDistribution_plot plots the posterior frequency distribution of parameters estimated by Probit Regression for Binary Responses via data augmentation and Gibbs sampling.

# Plotting the posterior distribution for Beta_0
BinaryGibbs_PosteriorDistribution_plot(beta_matrix = Result$beta_matrix, k = 0, burn_in = 2500, breaks = 50)

The plots can be generated for other regression parameters by changing the function inputs.
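Beyond the plot, simple numerical summaries of the post-burn-in draws can be computed directly from beta_matrix; a short sketch, again assuming column 1 holds the Beta_0 draws:

# Posterior mean and 95% credible interval for Beta_0 (assumes column 1 stores the Beta_0 draws)
post_draws <- Result$beta_matrix[-(1:burn_in), 1]
c(mean = mean(post_draws), quantile(post_draws, probs = c(0.025, 0.975)))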

The ideas discussed above can easily be generalised to model ordered multinomial responses. Let us briefly discuss the theory, followed by the usage of the R package PolyGibbs to implement Bayesian Ordered Multinomial Regression.

B. Bayesian Ordered Multinomial Regression via Data Augmentation and Gibbs Sampler Algorithm

Suppose $Y_1, Y_2, ..., Y_N$ are observed, where each $Y_i$ takes one of $J$ ordered categories $1, 2, ..., J$. Letting $p_{ij} = P[Y_{i} = j]$, define the cumulative probabilities $\eta_{ij} = \sum_{k=1}^{j} p_{ik}$ for $j = 1,2,..., J-1$. Then one popular model for the $p_{ij}$ is given by $\eta_{ij} = \Phi(\gamma_{j} - x_i^T\beta)$ for $i = 1,2,...,N$ and $j = 1,2,...,J-1$. One can motivate this model by assuming that there exists a latent continuous random variable $Z_i$ distributed as $N(x_i^T\beta,1)$ and we observe $Y_i = j$ if $\gamma_{j-1} < Z_i \le \gamma_{j}$, with $\gamma_0 = -\infty$ and $\gamma_J = \infty$. In this model the regression parameter $\beta$ and the bin boundaries $\gamma = (\gamma_1,\gamma_2,...,\gamma_{J-1})$ are unknown parameters. The joint posterior density of $\beta$ and $\gamma$ is given by

$$ \pi(\beta,\gamma \mid y) = C\,\pi(\beta,\gamma)\prod_{i=1}^N\sum_{j=1}^J 1(y_i = j)\,[\Phi(\gamma_{j} - x_i^T\beta) - \Phi(\gamma_{j-1} - x_i^T\beta)] $$ where $\pi(\beta,\gamma)$ is a prior on $(\beta,\gamma)$. If we assign a diffuse prior to $(\beta,\gamma)$, then the joint posterior density of $(\beta,\gamma, Z)$ given the data is

$$ \pi(\beta,\gamma,Z \mid y) = C\prod_{i=1}^N (1/2\pi)^{1/2}\exp\{-(Z_i - x_i^T\beta)^2/2\}\Big[\sum_{j=1}^J 1(y_i = j)\, 1(\gamma_{j-1}<Z_i\le\gamma_j)\Big] $$ The posterior distribution of $\beta$ conditional on $(y,Z)$ is the multivariate normal distribution of the form discussed in the probit regression section. The fully conditional posterior distributions of $Z_1,Z_2,...,Z_N$ are independent, with $Z_i \mid \beta,\gamma,y_i = j$ distributed as $N(x_i^T\beta,1)$ truncated at the left (right) by $\gamma_{j-1}$ ($\gamma_j$). The fully conditional distribution of $\gamma_j$ given $(\beta,y,Z)$ and $\gamma_k$, $k\neq j$, is proportional to $$ \prod_{i=1}^N[1(y_i = j)\, 1(\gamma_{j-1}<Z_i\le\gamma_j) + 1(y_i = j+1)\, 1(\gamma_{j}<Z_i\le\gamma_{j+1})] $$ which can be seen to be uniform on the interval $(\max\{\max(Z_i: y_i = j), \gamma_{j-1}\},\ \min\{\min(Z_i: y_i = j+1), \gamma_{j+1}\})$.
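As an illustration of the cut-point update (a minimal sketch, not the package's internal code), one can draw $\gamma_j$ from this uniform full conditional as follows, where gamma is stored as the full vector $(\gamma_0, \gamma_1, ..., \gamma_J)$ with $\gamma_0 = -\infty$ and $\gamma_J = \infty$:

# Illustrative sketch of the full conditional draw of gamma_j given (beta, y, Z)
# gamma is indexed so that gamma[j + 1] corresponds to gamma_j (gamma[1] = -Inf, gamma[J + 1] = Inf)
update_gamma_j <- function(j, Z, y, gamma) {
  lower <- max(c(Z[y == j],     gamma[j]))      # max{ max(Z_i : y_i = j),   gamma_{j-1} }
  upper <- min(c(Z[y == j + 1], gamma[j + 2]))  # min{ min(Z_i : y_i = j+1), gamma_{j+1} }
  runif(1, min = lower, max = upper)
}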

B.1 Data Generation & Fitting Bayesian Ordered Multinomial Regression

First we create a simulated data set of n observations, each with a multinomial response (1, 2, ..., K) and corresponding values on p covariates, to demonstrate the utility of the package functionalities. (Please note that the data generated in this section will be used throughout the rest of this article.) Then we use the "MultinomGibbs_fit" function to implement Bayesian Ordered Multinomial Regression via data augmentation and Gibbs sampling.

# Generating Simulated Data 
# Initialization
set.seed(250)
n <- 1000 # Total no of observations.
int1 <- -1 # gamma boundary
int2 <- 3  # gamma boundary
beta <- c(-.75, 1) # Regression Parameters for data generation.
X <- cbind(sample(1:4, n, replace = TRUE), rnorm(n, 0, 2)) # Generated design matrix
# Generation of Latent Variable Observations
eta <- X %*% beta
z <- rnorm(n, eta, 1)
# Generation of Responses depending on z
y <- rep(0, n)
y[z <= int1] <- 1
y[int1 < z & z <= int2] <- 2
y[int2 < z ] <- 3
# Splitting The Data in Train and Test in 80:20 ratio
Train_ID = sample(1:nrow(X), round(nrow(X) * 0.8), replace = FALSE) # Train Data IDs
Train_X = X[Train_ID, ] # Train Data Covariates
Test_X = X[-Train_ID, ] # Test Data Covariates
Train_Y = y[Train_ID] # Train Data Response
Test_Y = y[-Train_ID] # Test Data Response
K = 3
nIter = 10000
burn_in = 5000
# Fitting Bayesian Ordered Multinomial Regression
# Unnecessary print messages are suppressed to keep the vignette tidy
invisible(capture.output(Result <- MultinomGibbs_fit(Train_X, Train_Y, nIter, burn_in, K)))
# Display the tail of the outputs
list(estimates = Result$estimates,
     Train_Accuracy_tail = tail(Result$Train_Accuracy, n = 10),
     beta_matrix_tail = tail(Result$beta_matrix, n = 10))
# The actual function outputs are estimates, Train_Accuracy, beta_matrix.

B.2 Prediction via Fitted Bayesian Ordered Multinomial Model on Test Set

Once we have fitted the model, we need to test its predictive power on a test set. The function "MultinomGibbs_pred" is used for prediction on the test set using the fitted Bayesian Ordered Multinomial Model.

# Storing the outputs of the model fitting
estimates = Result$estimates
gamma_estimates = Result$gamma_estimates
# Prediction using the fitted model
MultinomGibbs_pred(estimates, gamma_estimates, Test_X)
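For intuition, the most probable category for a new observation maximizes $\Phi(\hat\gamma_j - x^T\hat\beta) - \Phi(\hat\gamma_{j-1} - x^T\hat\beta)$ over $j$. A minimal sketch of that rule follows; it assumes estimates holds the posterior mean of $\beta$ and gamma_estimates the estimated interior cut-points, which are assumptions for illustration rather than documented output formats of the package.

# Illustrative sketch only (assumes `estimates` = posterior mean of beta and
# `gamma_estimates` = estimated interior cut-points gamma_1, ..., gamma_{K-1})
eta_hat <- as.vector(Test_X %*% estimates)  # latent means x_i' beta_hat
cuts    <- c(-Inf, gamma_estimates, Inf)    # gamma_0, ..., gamma_K
probs   <- sapply(1:K, function(j) pnorm(cuts[j + 1] - eta_hat) - pnorm(cuts[j] - eta_hat))
y_hat   <- max.col(probs)                   # most probable category per observation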

B.3 Calculation of Test Set Accuracy for Bayesian Ordered Multinomial Regression Model

Once we have fitted the Bayesian Ordered Multinomial Regression model and obtained its predictions on a test set, we want to calculate the test set accuracy by comparing the actual and predicted levels of the responses. This lets us comment on the efficacy of the fitted model for prediction. MultinomGibbs_Test_Accuracy calculates the accuracy of prediction on the test set for Bayesian Ordered Multinomial Regression.

# Storing the outputs of the model fitting and predictions
estimates = Result$estimates
Predicted_Y = MultinomGibbs_pred(estimates, gamma_estimates, Test_X)
# Finding the accuracy of prediction on the test set
MultinomGibbs_Test_Accuracy(Predicted_Y, Test_Y, K)

B.4 Analysis of Convergence of the chain for Bayesian Ordered Multinomial Regression via Traceplots for Regression Parameters

To study the convergence of the Gibbs sampler algorithm we draw trace plots of the regression parameter estimates, which plot the successive draws of each regression parameter over the iterations of the chain. We should expect the chain to more or less stabilize around the estimated posterior mean as we progress through the iterations. Multinom_traceplot_beta generates plots for the diagnosis of the regression parameter estimates obtained by Bayesian Ordered Multinomial Regression via data augmentation and Gibbs sampling.

# Storing the outputs of the model fitting
beta_matrix = Result$beta_matrix
# Plotting the traceplot for Beta_1
Multinom_traceplot_beta(beta_matrix = beta_matrix, k = 1)

The plots can be generated for other regression parameters by changing the function inputs.

B.5 Analysis of Convergence of the chain for Bayesian Ordered Multinomial Regression via Traceplots for Boundary Parameters

To study the convergence of the Gibbs sampler algorithm we draw trace plots of the boundary parameter estimates, which plot the successive draws of each boundary parameter over the iterations of the chain. We should expect the chain to more or less stabilize around the estimated posterior mean as we progress through the iterations. Multinom_traceplot_gamma generates plots for the diagnosis of the boundary parameter estimates obtained by Bayesian Ordered Multinomial Regression via data augmentation and Gibbs sampling.

# Storing the outputs of the model fitting
gamma_update = Result$gamma_update
# Plotting the traceplot for the boundary parameter with index k = 3
Multinom_traceplot_gamma(gamma_update = gamma_update , k = 3)

The plots can be generated for other boundary parameters by changing the function inputs.

B.6 Plotting Posterior Distributions of the Estimated Regression Parameters for Bayesian Ordered Multinomial Regression

Studying the posterior distribution of the estimated regression parameters is necessary to observe the dispersion of the drawn values around the estimated posterior mean. Multinom_PosteriorDistribution_plot_beta plots the posterior frequency distribution of the regression parameters estimated by Bayesian Ordered Multinomial Regression via data augmentation and Gibbs sampling.

# Storing the outputs of the model fitting
beta_matrix = Result$beta_matrix
# Plotting the posterior distribution for Beta_2
Multinom_PosteriorDistribution_plot_beta(beta_matrix = beta_matrix, k = 2, burn_in = 2500, breaks = 50)

The plots can be generated for other regression parameters by changing the function inputs.

B.7 Plotting Posterior Distributions of the Estimated Boundary Parameters for Bayesian Ordered Multinomial Regression

Studying the posterior distribution of the estimated boundary parameters is necessary to observe the dispersion of the drawn values around the estimated posterior mean. Multinom_PosteriorDistribution_plot_gamma plots the posterior frequency distribution of the boundary parameters estimated by Bayesian Ordered Multinomial Regression via data augmentation and Gibbs sampling.

# Storing the outputs of the model fitting
gamma_update = Result$gamma_update
# Plotting the posterior distribution for the boundary parameter with index k = 2
Multinom_PosteriorDistribution_plot_gamma(gamma_update = gamma_update, k = 2, burn_in = 2500, breaks = 50)

The plots can be generated for other boundary parameters by changing the function inputs.

References

[1] Albert, J. H. and Chib, S. (1993). Bayesian Analysis of Binary and Polychotomous Response Data. Journal of the American Statistical Association, 88(422), 669-679.

[2] Hernadvolgyi, I. T. (1998). Generating Random Vectors from Multivariate Normal Distributions.

[3] Botev, Z. and L'Ecuyer, P. (2017). Simulation from the Normal Distribution Truncated to an Interval in the Tail.

Contact

For further details please look at the function documentation of the package. Any suggestions for improvement of the package are welcome. For further queries please write to cabhisek@stat.tamu.edu.


