In apeterson91/rstapDP: Functional Dirichlet Process Spatial Temporal Aggregated Predictors in R

knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  eval = T
)

Motivation

The rstap and rsstap packages introduced Spatial Temporal Aggregated Predictor (STAP) models for estimating the effect of spatio-temporal exposure to nearby Built Environment Features (BEFs) on human health. The effects estimated in those models were homogenous; It was assumed in the model formulation that all individuals have the same effect of living near a BEF.The rstapDP package removes this assumption by allowing for heterogeneity in the STAP effect across the population.

Illustrative Example

For example, consider a sample of individuals $(i=1,...,n)$ near Fast Food Restaurants (FFRs) where half of the subjets are at "high risk"of having an average increase in their BMI as a consequence of living near the FFR. In contrast the other half of the population are "low risk" as they experience no increase in BMI. Identifying these effects without knowing these individuals' risk status, or the number of risk populations requires the estimation of some number of STAP effects $f_i(\mathcal{D}_i)$ where $f(\cdot)$ is the (say) spatial exposure function relating how the average health outcome $BMI_i$ increases as a function of where a new FFR is placed relative to the subjects.

The STAP-DP model which rstapDP incorporates these ideas in a regression setting as follows assuming covariate informatino $Z_i$ (e.g. sex and/or an intercept term) is also available:

$$ E[BMI_i] = \mathbf{Z}_i^{T} \mathbf{\delta} + f_i(\mathcal{D}_i) \quad i = 1,...,n. $$

Above, $f_i(\mathcal{D})$, the smooth function of BEF exposure is modeled through spline regression coefficients, $\mathbf{\beta}_i$ similar to the rsstap package. In the rstapDP package, we place a Dirichlet Process (DP) prior on these regression coefficients:

$$ f_i(\mathcal{D}i) = \sum{j=0}^{J}\sum_{d \in \mathcal{D}i} \beta{i,j}\phi_j(d)\ (\mathbf{\beta}i,\mathbf{\tau}_i) \sim DP(\alpha,G_0)\ G_0 \equiv MVN(0,\mathbf{I}\sigma^{-2} \mathbf{\tau}_i)\times \text{Gamma}(a{\tau},b_{\tau})\ \alpha \sim \text{Gamma}(a_{\alpha},b_{\alpha}). $$

By placing a DP prior on the $\mathbf{\beta}$ coefficients, heterogenous effects clustering is induced, allowing for the estimation of the aforementioned high/low risk spatial exposure effects.

In the following sections, this idea will be briefly illustrated using the simple simulated example. For more information, check out the rstapDP documentation.

Exploratory Data Analysis

We'll begin by loading the appropriate libraries.

library(rbenvo)
library(rstapDP)
library(dplyr)
library(ggplot2)
theme_set(theme_bw())

For this simple illustrative example, we'll use the data FFR_benvo (see here for more information on benvos) which comes with the rstapDP package. This data is simulated in the manner previously described: with one high risk group that experiences an increase in the pseudo-BMI measure as a function of nearby FFRs, and one group that does not. There is also a binary sex covariate simulated. We can get a brief description of the FFR data by calling the print and summary functions of the FFR_benvo object.

FFR_benvo

summary(FFR_benvo)

Model Fitting and Results

The following syntax shows how to fit the STAP-DP model. The rstapDPpackage fits the above model using a truncated DP Gibbs Sampler via Rcpp. We'll use the default prior recommendations for the DP base measure and prior precision, drawing 3000 samples with the first 2000 used for burn in. Note that more/fewer iterations may be required depending on the complexity or size of the data.

fit <- fdp_staplm(BMI ~ sex + sap(FFR),
                  benvo = FFR_benvo,
                  K = 20, ## for speed
                  iter_max = 3E3,
                  burn_in = 2E3,
                  thin = 1,
                  seed = 434311,
                  chains = 2)

With our model fitting complete, our first check will be to look the posterior predictive checks via the ppc function.

ppc(fit)

An additional graphical check that can be used to identify clustering patterns is the pairwise probability matrix as described in [@rodriguez p.1136]

plot_pairs(fit,sample=500L,sort=TRUE)

Finally, we'll take at look at the estimated and true exposure curves.

truth <- tibble(Distance = seq(from=0,to=1,by=0.01),
                Truth = 2*pweibull(Distance,shape = 5, scale = .5,lower.tail = F),
                Cluster = "High Risk") %>%
  rbind(tibble(Distance = seq(from =0,to=1,by=0.01),
               Truth = 0,
               Cluster = "Low Risk")) %>% mutate(Cluster = factor(Cluster))

truth <- geom_line(aes(x=Distance,y=Truth,color=Cluster),data=truth)

plot(fit) + truth

The curves themselves, including the distance at which the exposure effect becomes negligible, in addition to the proportion with which individuals are assigned to these curves are the typical primary objectives for those interested in this model. However, we empahsize the importance of using the posterior predictive checks and diagnostics prior to reading too much into any curve or parameter estimates.

References

apeterson91/rstapDP documentation built on Sept. 20, 2023, 9:34 a.m.

rdrr.io home R language documentation Run R code online

CRAN packages Bioconductor packages R-Forge packages GitHub packages

Note that we can't provide technical support on individual packages. You should contact the package authors for that.

apeterson91/rstapDP
Functional Dirichlet Process Spatial Temporal Aggregated Predictors in R

In apeterson91/rstapDP: Functional Dirichlet Process Spatial Temporal Aggregated Predictors in R

Motivation

Illustrative Example

Exploratory Data Analysis

Model Fitting and Results

References

R Package Documentation

Browse R Packages

We want your feedback!

apeterson91/rstapDP Functional Dirichlet Process Spatial Temporal Aggregated Predictors in R

In apeterson91/rstapDP: Functional Dirichlet Process Spatial Temporal Aggregated Predictors in R

Motivation

Illustrative Example

Exploratory Data Analysis

Model Fitting and Results

References

R Package Documentation

Browse R Packages

We want your feedback!

apeterson91/rstapDP
Functional Dirichlet Process Spatial Temporal Aggregated Predictors in R