ordmonoreg: Bayesian monotonic regression
In monoreg: Bayesian Monotonic Regression Using a Marked Point Process Construction

View source: R/monoreg.R

ordmonoreg

R Documentation

Bayesian monotonic regression

Description

This function implements the non-parametric Bayesian monotonic regression procedure for ordinal outcomes described in Saarela, Rohrbeck & Arjas (2023).

Usage


ordmonoreg(niter=15000, burnin=5000, adapt=5000, refresh=10, thin=20, 
		   birthdeath=1, logit=FALSE, gam=FALSE, seed=1, rhoa=0.1, rhob=0.1, 
		   deltai=0.5, dlower=0, dupper=1, invprob=1.0, dc=0.0, 
		   predict, include, outcome, axes, covariates=NULL, 
		   cluster=NULL, ncluster=NULL, settozero)

Arguments

`niter`	Total number of MCMC iterations.
`burnin`	Number of iterations used for burn-in period.
`adapt`	Number of iterations used for adapting Metropolis-Hastings proposals.
`refresh`	Interval for producing summary output of the state of the MCMC sampler.
`thin`	Interval for saving the state of the MCMC sampler.
`birthdeath`	Number of birth-death proposals attempted at each iteration.
`logit`	Indicator for fitting the model on logit scale.
`gam`	Indicator for fitting non-monotonic generalized additive models for covariate effects.
`seed`	Seed for the random generator.
`rhoa`	Shape parameter of a Gamma hyperprior for the Poisson process rate parameters.
`rhob`	Scale parameter of a Gamma hyperprior for the Poisson process rate parameters.
`deltai`	Range for a uniform proposal for the function level parameters.
`dlower`	Lower bound for the allowed range for the monotonic function.
`dupper`	Upper bound for the allowed range for the monotonic function.
`invprob`	Probability with which to propose keeping the original direction of monotonicity.
`dc`	First parameter of the conditional beta prior to counter spiking behaviour near the origin - 1.
`predict`	Indicator vector for the observations for which absolute risks are calculated (not included in the likelihood expression).
`include`	Indicator vector for the observations to be included in the likelihood expression.
`outcome`	Vector of outcome variable values.
`axes`	A matrix where the columns specify the covariate axes in the non-parametrically specified regression functions. Each variable here must be scaled to zero-one interval.
`covariates`	A matrix of additional covariates to be included in the linear predictor of the events of interest (optional).
`cluster`	A vector of indicators for cluster membership, numbered from 0 to ncluster-1 (optional).
`ncluster`	Number of clusters (optional).
`settozero`	A zero-one matrix specifying the point process construction. Each row represents a point process, while columns correspond to the columns of the argument `axes`, indicating whether the column is one of the dimensions specifying the domain of the point process. (See function `getcmat`.)

Value

A list with elements

`steptotal`	A sample of total number of points in the marked point process construction.
`steps`	A sample of the number of points used per each additive component.
`rho`	A sample of the Poisson process rate parameters (one per each point process specified).
`loglik`	A sample of log-likelihood values.
`tau`	A sample of random intercept variances.
`alpha`	A sample of random intercepts.
`beta`	A sample of regression coefficients for the variables specified in the argument `covariates`.
`lambda`	A sample of non-parametric regression function levels for each observation specified in the argument `predict`.
`pred`	A sample of predicted means for each observation specified in the argument `predict`.
`sigmasq`	A sample of autoregressive prior variances.

Author(s)

Olli Saarela <olli.saarela@utoronto.ca>, Christian Rohrbeck <cr777@bath.ac.uk>

References

Saarela O., Rohrbeck C., Arjas E. (2023). Bayesian non-parametric ordinal regression under a monotonicity constraint. Bayesian Analysis, 18:193–221.

Examples

library(monoreg)
expit <- function(x) {1/(1+exp(-x))}
logit <- function(p) {log(p)-log(1-p)}
set.seed(1)
# nobs <- 500
nobs <- 200
x <- sort(runif(nobs))
ngrid <- 100
xgrid <- seq(1/ngrid, (ngrid-1)/ngrid, by=1/ngrid)
ngrid <- length(xgrid)
ncat <- 4
beta <- 0.75
disc <- c(Inf, Inf, 0.75, 0.5)
gamma <- c(0,0,0.25,0.5)

surv <- matrix(NA, nobs, ncat)
cdf <- matrix(NA, nobs, ncat)
cols <- c('black','red','blue','green')
for (i in 1:ncat) {
    surv[,i] <- expit(logit((ncat-(i-1))/ncat) + beta * x + 
	                  gamma[i] * (x > disc[i]) - gamma[i] * (x < disc[i]))
    if (i==1)
        plot(x, surv[,i], type='l', col=cols[i], ylim=c(0,1), lwd=2, ylab='S')
    else
        lines(x, surv[,i], col=cols[i], lwd=2)
}
head(surv)

for (i in 1:ncat) {
    if (i<ncat)
        cdf[,i] <- 1.0 - surv[,i+1]
    else
        cdf[,i] <- 1.0
    if (0) {
        if (i==1)
            plot(x, cdf[,i], type='l', col=cols[i], ylim=c(0,1), lwd=2, ylab='F')
        else
            lines(x, cdf[,i], col=cols[i], lwd=2)
    }
}
head(cdf)

u <- runif(nobs)
y <- rep(NA, nobs)
for (i in 1:nobs)
    y[i] <- findInterval(u[i], cdf[i,]) + 1
table(y)

xwindow <- 0.1/2
mw <- matrix(NA, nobs, ncat)
grid <- sort(x)
for (i in 1:nobs) {
    idx <- (x > grid[i] - xwindow) & (x < grid[i] + xwindow)
    for (j in 1:ncat)
        mw[i,j] <- sum(y[idx] >= j)/sum(idx)
}
for (j in 1:ncat) {
    lines(grid, mw[,j], lty='dashed', col=cols[j])
}

# results <- ordmonoreg(niter=15000, burnin=5000, adapt=5000, refresh=10, thin=5, 
results <- ordmonoreg(niter=3000, burnin=1000, adapt=1000, refresh=10, thin=4, 
           birthdeath=1, logit=FALSE, gam=FALSE, seed=1, rhoa=0.1, rhob=0.1, 
           deltai=0.2, dlower=0, dupper=1, invprob=1.0, dc=0.0, 
           predict=c(rep(0,nobs),rep(1,ngrid)), 
           include=c(rep(1,nobs),rep(0,ngrid)), 
           outcome=c(y, rep(1,ngrid)), 
           axes=c(x, xgrid), covariates=NULL, cluster=NULL, ncluster=NULL, 
           settozero=getcmat(1))
		   
s <- results$lambda
dim(s)
lines(xgrid, colMeans(subset(s, s[,1]==0)[,2:(ngrid+1)]))
lines(xgrid, colMeans(subset(s, s[,1]==1)[,2:(ngrid+1)]), col='red')
lines(xgrid, colMeans(subset(s, s[,1]==2)[,2:(ngrid+1)]), col='blue')
lines(xgrid, colMeans(subset(s, s[,1]==3)[,2:(ngrid+1)]), col='green')
legend('bottomright', legend=c(expression(P(Y>=1)), expression(P(Y>=2)), 
expression(P(Y>=3)), expression(P(Y>=4))), 
lwd=2, col=cols)

monoreg documentation built on April 30, 2023, 5:12 p.m.