Description Usage Arguments Details References See Also Examples
Mean posterior probability for each observation in Baysian quantile regression model
1 | bayesProb(y, x, tau, M, burn)
|
y |
vector, dependent variable in quantile regression |
x |
matrix, design matrix in quantile regression |
tau |
quantile |
M |
MCMC draws |
burn |
burned MCMC draws |
If we define the variable O_i, which takes value equal to 1 when ith observation is an outlier, and 0 otherwise, then we propose to calculate the probability of an observation being an outlier as:
P(O_{i} = 1) = \frac{1}{n-1}β{P(v_{i}>v_{j}|data)} \quad (1)
We believe that for points, which are not outliers, this probability should be small, possibly close to zero. Given the natrual ordering of the residuals, it is expected that some observations present greater values for this probability in comparison to others. What we think that should be deemed as an outlier, ought to be those observations with a higher P(O_{i} = 1), and possibly one that is particularly distant from the others.
The probability in the equation can be approximated given the MCMC draws, as follows
P(O_{i}=1)=\frac{1}{M}β{I(v^{(l)}_{i}>max v^{k}_{j})}
where M is the size of the chain of v_{i} after the burn-in period and v^{(l)}_{j} is the lth draw of chain.
More details please refer to the paper in references
Santos B, Bolfarine H.(2016)βOn Baysian quantile regression and outliers,arXiv:1601.07344
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 | ## Not run:
ais_female <- subset(ais, Sex == 1)
y <- ais_female$BMI
x <- cbind(1, ais_female$LBM)
tau <- 0.5
M <- 5000
burn <- 1000
prob <- bayesProb(y, x, tau, M, burn)
case <- 1:100
dat <- data.frame(case, prob)
ggplot(dat, aes(case, prob))+
geom_point() +
geom_text(data = subset(dat, prob > mean(prob) + 2*sd(prob)),
## End(Not run)
|
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.