
This function calculates the mean posterior probability (or the Kullback-Leibler divergence) used to detect outliers in a Bayesian quantile regression model with the asymmetric Laplace distribution.

```
qrod_bayes(y, x, tau, M, burn, method = c("bayes.prob", "bayes.kl"))
```

`y`
dependent variable in quantile regression

`x`
matrix, the design matrix for quantile regression. For a quantile regression model with an intercept, the first column of x is 1.

`tau`
the quantile of interest

`M`
the number of MCMC iterations used in Bayesian estimation

`burn`
the number of burn-in MCMC draws to discard

`method`
the diagnostic method for outlier detection, either "bayes.prob" (mean probability) or "bayes.kl" (Kullback-Leibler divergence)
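A minimal call sketch with simulated data (the data and the planted outlier are hypothetical; the `qrod_bayes` call follows the usage above and is left commented since the host package is not named here):

```r
# Simulated example (hypothetical): outlier detection at the median (tau = 0.5).
set.seed(1)
n <- 100
x <- cbind(1, rnorm(n))            # design matrix with an intercept column
y <- x %*% c(2, 1) + rnorm(n)
y[1] <- y[1] + 10                  # plant one outlier in observation 1

# With the package loaded, one would then call, e.g.:
# prob <- qrod_bayes(y, x, tau = 0.5, M = 5000, burn = 1000,
#                    method = "bayes.prob")
# which.max(prob)                  # observation 1 should stand out
```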

If we define the variable O_i, which takes the value 1 when the ith observation is an outlier and 0 otherwise, then we propose calculating the probability of an observation being an outlier as:

*P(O_{i} = 1) = \frac{1}{n-1} \sum_{j \neq i} P(v_{i} > v_{j} \mid data) \quad (1)*

We believe that for points which are not outliers, this probability should be
small, possibly close to zero. Given the natural ordering of the residuals, some
observations are expected to present greater values for this probability than
others. Observations deemed outliers ought to be those with a higher
*P(O_{i} = 1)*, and possibly one that is particularly distant from the others.

The probability in equation (1) can be approximated from the MCMC draws as follows:

*P(O_{i} = 1) = \frac{1}{M} \sum_{l=1}^{M} I(v^{(l)}_{i} > \max_k v^{(k)}_{j})*

where *M* is the size of the chain of *v_{i}* after the burn-in period and
*v^{(l)}_{j}* is the *l*th draw of the chain.
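Given an M x n matrix of post-burn-in draws of the latent variables, the approximation above can be sketched in R as follows (a hand-rolled illustration of the formula, not the package's internal code; `outlier_prob` is a hypothetical helper name):

```r
# v_draws: M x n matrix; column i holds the chain of v_i after burn-in.
outlier_prob <- function(v_draws) {
  n <- ncol(v_draws)
  col_max <- apply(v_draws, 2, max)            # max_k v_j^(k) for each j
  sapply(seq_len(n), function(i) {
    # average over j != i of (1/M) * sum_l I(v_i^(l) > max_k v_j^(k))
    mean(sapply(setdiff(seq_len(n), i),
                function(j) mean(v_draws[, i] > col_max[j])))
  })
}

set.seed(2)
v_draws <- matrix(rexp(1000 * 5), nrow = 1000)  # 5 well-behaved observations
v_draws[, 5] <- v_draws[, 5] + 20               # one inflated latent variable
outlier_prob(v_draws)                           # the 5th probability is near 1
```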

As another proposal to address the differences between the posterior distributions of the distinct latent variables in the model, we suggest the Kullback-Leibler divergence, proposed by Kullback and Leibler (1951), as a more precise method of measuring the distance between those latent variables in the Bayesian quantile regression framework. Given this posterior information, the divergence is defined as

*K(f_{i}, f_{j}) = \int log(\frac{f_{i}(x)}{f_{j}(x)}) f_{i}(x) dx*

where *f_{i}* could be the posterior conditional distribution of *v_{i}*
and *f_{j}* the posterior conditional distribution of *v_{j}*. Similar to
the probability proposal above, we average this divergence for one
observation over its distance from all the others, i.e.,

*KL(f_{i}) = \frac{1}{n-1} \sum_{j \neq i} K(f_{i}, f_{j})*

We expect that when an observation presents a higher value for this divergence, it should also present a high probability of being an outlier. Based on the MCMC draws from the posterior of each latent variable, we estimate the densities using a normal kernel and compute the integral using the trapezoidal rule.
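The kernel density estimation and trapezoidal integration just described can be sketched as follows (an illustration of the idea with simulated draws, not the package's exact implementation; `kl_divergence` and the `eps` floor are assumptions made here to keep the log well-defined):

```r
# Kullback-Leibler divergence K(f_i, f_j) from two sets of MCMC draws,
# using a Gaussian kernel for the densities (density()'s default) and
# the trapezoidal rule for the integral.
kl_divergence <- function(draws_i, draws_j, n_grid = 512, eps = 1e-10) {
  rng <- range(c(draws_i, draws_j))
  grid <- seq(rng[1], rng[2], length.out = n_grid)
  # evaluate both kernel density estimates on a common grid;
  # floor at eps so the log ratio stays finite in the tails
  f_i <- pmax(density(draws_i, from = rng[1], to = rng[2], n = n_grid)$y, eps)
  f_j <- pmax(density(draws_j, from = rng[1], to = rng[2], n = n_grid)$y, eps)
  integrand <- log(f_i / f_j) * f_i
  # trapezoidal rule over the grid
  sum(diff(grid) * (head(integrand, -1) + tail(integrand, -1)) / 2)
}

set.seed(3)
a <- rnorm(2000)
b <- rnorm(2000, mean = 3)
kl_divergence(a, a)   # identical draws: exactly zero
kl_divergence(a, b)   # shifted distribution: clearly positive
```

Averaging `kl_divergence(draws[, i], draws[, j])` over all j != i then gives the *KL(f_{i})* summary above.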

Mean probability or Kullback-Leibler divergence for observations in Bayesian quantile regression model

Wenjing Wang [email protected]

Benites L E, Lachos V H, Vilca F E. (2015). "Case-Deletion
Diagnostics for Quantile Regression Using the Asymmetric Laplace
Distribution", *arXiv preprint arXiv:1509.05099*.

Hawkins D M, Bradu D, Kass G V. (1984). "Location of several outliers in
multiple-regression data using elemental sets", *Technometrics*,
26(3), 197-208.

Koenker R, Bassett Jr G. (1978). "Regression quantiles",
*Econometrica*, 46(1), 33-50.

Santos B, Bolfarine H. (2016). "On Bayesian quantile regression and
outliers", *arXiv preprint arXiv:1601.07344*.

Kozumi H, Kobayashi G. (2011). "Gibbs sampling methods for Bayesian
quantile regression", *Journal of Statistical Computation and
Simulation*, 81(11), 1565-1578.
