As elaborated in our recent analyses (Walker et al. 2019; Scarpone et
al. 2020), nearly all previous studies in the literature use either
census unit boundaries or simple buffer zones to measure an individual’s
built environment (BE) exposures or to characterize their local
socioeconomic status (SES) (Gong et al. 2014; Fuertes et al. 2014).
Therefore, we present a distance-weighted, network-based model for
quantifying the combined effects of local greenspace and SES on diabetes
risk, from which we derive an area-based Diabetes Risk Index of
Greenspace, Land Use and Socioeconomic Environments (DRI-GLUCoSE). The
DRIGLUCoSE package makes the functions and code used in the development
of the DRI-GLUCoSE Index (Walker et al. 2022) publicly available.
You can install the latest version of DRIGLUCoSE
from GitHub with:
remotes::install_git("https://github.com/STBrinkmann/DRIGLUCoSE")
Once installed, the library can be loaded as follows:
library(DRIGLUCoSE)
One key purpose of this package is to provide functions for deriving isochrones from route networks. For that purpose, the package includes a sample sf object of two points in Erlangen, Germany.
data(Erlangen)
Erlangen
## Simple feature collection with 2 features and 2 fields
## Geometry type: POINT
## Dimension: XY
## Bounding box: xmin: 35199.46 ymin: -159433.5 xmax: 36281.59 ymax: -159243.2
## Projected CRS: ETRS89 / LCC Germany (N-E)
## # A tibble: 2 x 3
## tag Speed geom
## <dbl> <dbl> <POINT [m]>
## 1 1 78.5 (35199.46 -159433.5)
## 2 2 79.8 (36281.59 -159243.2)
In our analysis, we acquired data for the Canadian census dissemination areas and converted them to a shapefile (sf object) with one column per census variable. To demonstrate the workflow, we use the following randomly generated data:
set.seed(1234)
census <- sf::st_make_grid(
  # Use Sample Data and apply 25 minutes buffer (Speed[m/min] * 25[min])
  Erlangen %>% dplyr::mutate(geom = sf::st_buffer(geom, Speed*25)),
  cellsize = 100
) %>%
  sf::st_as_sf() %>%
  # dplyr::n() is namespaced so the call also works when dplyr is not attached
  dplyr::mutate(census_var_a = sample(1:1000, dplyr::n(), replace = TRUE),
                census_var_b = sample(1000:10000, dplyr::n(), replace = TRUE),
                census_var_c = sample(100000:150000, dplyr::n(), replace = TRUE)) %>%
  dplyr::rename(geom = x)
census
## Simple feature collection with 2142 features and 3 fields
## Geometry type: POLYGON
## Dimension: XY
## Bounding box: xmin: 33236.96 ymin: -161396 xmax: 38336.96 ymax: -157196
## Projected CRS: ETRS89 / LCC Germany (N-E)
## First 10 features:
## census_var_a census_var_b census_var_c geom
## 1 284 2009 114811 POLYGON ((33236.96 -161396,...
## 2 848 3318 139961 POLYGON ((33336.96 -161396,...
## 3 918 3954 112673 POLYGON ((33436.96 -161396,...
## 4 101 3359 137505 POLYGON ((33536.96 -161396,...
## 5 623 3107 109201 POLYGON ((33636.96 -161396,...
## 6 905 2630 131656 POLYGON ((33736.96 -161396,...
## 7 645 6512 135789 POLYGON ((33836.96 -161396,...
## 8 934 9945 115006 POLYGON ((33936.96 -161396,...
## 9 400 3583 111379 POLYGON ((34036.96 -161396,...
## 10 900 7778 137576 POLYGON ((34136.96 -161396,...
In our analysis, we acquired LANDSAT images through the United States
Geological Survey’s EarthExplorer platform
(https://earthexplorer.usgs.gov/). The Normalized Difference
Vegetation Index (NDVI) is used as a metric to model greenspace
exposure. Pre-processing of the LANDSAT images and the NDVI calculation
were conducted using the LS_L1C function:
DRIGLUCoSE::LS_L1C(l1c_path = "docs/LC08_L1TP_193026_20200423_20200508_01_T1_small/",
out_dir = "docs/LS_PreProcessed",
# Use Sample Data and apply 25 minutes buffer (Speed[m/min] * 25[min])
sf_mask = DRIGLUCoSE::Erlangen %>%
dplyr::mutate(geom = sf::st_buffer(geom, Speed*25)),
cores = 20)
## Project raster
## DN to TOA Reflectance
## class : RasterStack
## dimensions : 122, 151, 18422, 8 (nrow, ncol, ncell, nlayers)
## resolution : 30, 30 (x, y)
## extent : 33493.69, 38023.69, -161164.2, -157504.2 (xmin, xmax, ymin, ymax)
## crs : +proj=lcc +lat_0=51 +lon_0=10.5 +lat_1=48.6666666666667 +lat_2=53.6666666666667 +x_0=0 +y_0=0 +ellps=GRS80 +units=m +no_defs
## names : Blue, Green, Red, NIR, SWIR1, SWIR2, NDWI, NDVI
## min values : 0, 0, 0, 0, 0, 0, -1, -1
## max values : 0.2020575, 0.2322532, 0.3076383, 0.5424371, 0.4233773, 0.3753066, 1.0000000, 1.0000000
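The internals of LS_L1C are not shown here, so the following is only a minimal sketch of the standard NDVI definition, (NIR - Red) / (NIR + Red), applied to made-up reflectance values. The vectors `red` and `nir` are hypothetical and are not taken from the sample scene.

```r
# Toy top-of-atmosphere reflectance values for three pixels (made up for illustration)
red <- c(0.05, 0.10, 0.30)
nir <- c(0.45, 0.30, 0.30)

# Standard NDVI definition: (NIR - Red) / (NIR + Red), bounded to [-1, 1]
ndvi <- (nir - red) / (nir + red)
ndvi
```

Dense vegetation reflects strongly in the near infrared and weakly in red, so higher values indicate greener pixels.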
In order to estimate each participant’s potential exposures to greenspace and local SES, we (i) mapped age- and sex-specific walkable zones around their residential address, and (ii) applied a negative logit weighting function, such that the estimated effect of greenspace or SES decreases as distance from the home increases.
In order to compute network-based distance metrics, we acquired street
data from OpenStreetMap using the R package osmdata (Padgham et al.
2017). Road types not suitable for walking were removed (e.g.,
motorways). Network data were topologically corrected and split into
~20 metre-long segments using the R package nngeo (Dorman 2020).
erlangen.osm <- DRIGLUCoSE::osm_roads(x = Erlangen, dist = 20,
speed = "Speed", cores = 2)
These network data were used to derive walking-distance buffers for each participant, based on walking speed. Starting from each participant’s place of residence, we computed network-constrained buffers with an off-road width of 40 metres, in 2-minute increments up to 20 minutes, using the A* algorithm (Hart, Nilsson, and Raphael 1968). Each participant therefore has ten concentric isochrones, the sizes of which are a function of individual walking speed and the road network.
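As a rough illustration of how walking speed sets the size of each isochrone, the sketch below converts the ten time steps into network distances. It assumes that distance is simply speed (m/min) multiplied by time (min), and it uses the rounded Speed values printed for the sample data. The object names are hypothetical and not part of the package.

```r
# Walking speeds in m/min, as printed (rounded) for the two sample points
speed_m_per_min <- c(78.5, 79.8)
tags <- c("1", "2")

# Isochrone time steps in minutes, matching isochrones_seq = seq(2, 20, 2)
times_min <- seq(2, 20, 2)

# Network distance (m) reachable per participant (rows) and time step (columns)
iso_dist <- outer(speed_m_per_min, times_min)
dimnames(iso_dist) <- list(tag = tags, minutes = as.character(times_min))
iso_dist
```

The actual isochrone shapes additionally depend on the road network, which this arithmetic ignores.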
erlangen.isodistances <- DRIGLUCoSE::isodistances(x = Erlangen,
road_network = erlangen.osm,
tag = "tag", speed = "Speed",
isochrones_seq = seq(2, 20, 2),
cores = 2)
erlangen.isochrones <- DRIGLUCoSE::isochrones(x = erlangen.isodistances,
buffer = 40, cores = 2)
erlangen.isochrones
## Simple feature collection with 20 features and 2 fields
## Geometry type: MULTIPOLYGON
## Dimension: XY
## Bounding box: xmin: 34034 ymin: -160951 xmax: 37750 ymax: -157782
## Projected CRS: ETRS89 / LCC Germany (N-E)
## # A tibble: 20 x 3
## time tag geom
## * <dbl> <dbl> <MULTIPOLYGON [m]>
## 1 2 1 (((35292.18 -159582.1, 35292.75 -159584.1, 35293.22 -159586.1, 3~
## 2 4 1 (((35314.38 -159725.3, 35313.74 -159727.3, 35312.99 -159729.2, 3~
## 3 6 1 (((35348.37 -159830.5, 35347.9 -159831.2, 35347.44 -159832, 3534~
## 4 8 1 (((35282 -159969.8, 35281.96 -159971.9, 35281.8 -159974, 35281.5~
## 5 10 1 (((35249.78 -160125.6, 35249.57 -160125.5, 35248.86 -160125.1, 3~
## 6 12 1 (((35264.36 -160300.8, 35263.71 -160300.8, 35263.06 -160300.9, 3~
## 7 14 1 (((35379.71 -160416.5, 35380.79 -160418.3, 35381.78 -160420.1, 3~
## 8 16 1 (((35412 -160571, 35411.95 -160573.1, 35411.78 -160575.2, 35411.~
## 9 18 1 (((35430.23 -160426.8, 35430.28 -160426.3, 35430.28 -160426.3, 3~
## 10 20 1 (((35546 -160803.2, 35546.09 -160803.5, 35546.22 -160804.2, 3554~
## 11 2 2 (((36385.71 -159372.9, 36384.21 -159374.3, 36382.64 -159375.7, 3~
## 12 4 2 (((36379.78 -159545.8, 36379.83 -159546.5, 36379.9 -159547.1, 36~
## 13 6 2 (((36368.01 -159688.3, 36366.95 -159690.1, 36365.8 -159691.9, 36~
## 14 8 2 (((36655.01 -159679.5, 36655.61 -159681.6, 36656.11 -159683.6, 3~
## 15 10 2 (((36520.27 -159907.5, 36518.33 -159908.3, 36516.36 -159909, 365~
## 16 12 2 (((36053.07 -160058.5, 36051.72 -160056.9, 36050.46 -160055.3, 3~
## 17 14 2 (((36832.82 -159976.8, 36832.8 -159977, 36832.79 -159977.1, 3683~
## 18 16 2 (((36016.71 -160325, 36016.5 -160325.4, 36016.23 -160325.9, 3601~
## 19 18 2 (((36020.67 -160327.7, 36022.64 -160327, 36024.65 -160326.4, 360~
## 20 20 2 (((36689.36 -160321.8, 36690.36 -160320, 36691.45 -160318.2, 366~
Figure 1 shows isodistances of the two points of the sample data in Erlangen, Germany.
In order to account for the diminishing effect of SES and greenspace exposure as distance increases, we fitted a logit function to weight each incremental isochrone, so that features farther from the household have less influence than nearby features, as illustrated in Figure 2. A logit function was selected because it heuristically approximates a suitable distance-decay function (Bauer and Groneberg 2016; Jia, Wang, and Xierali 2019). The distance-weighting consists of two parts: first, the logit function (1), which is used for both SES and greenspace variables; second, the proportional weights function (4), which is applied only to SES variables.
Each isochrone $I_t$ is assigned a distance weight $G_t$ (1)

$$ G_t = \frac{\int_{r_{t-1}}^{r_t} g(r)\,dr}{\int_{0}^{r_{t_{max}}} g(r)\,dr} $$

calculated as the integral of the logistic distance decay function $g(r)$ (2)

$$ g(r) = \frac{1}{1 + e^{b(r - m)}} $$

with $b = 8$ and $m = 0.6$, in the interval between the mean inner radius $r_{t-1}$ and mean outer radius $r_t$ of the isochrone (e.g. 2 to 4 minutes isochrones), normalized by the integral from 0 to the outermost isochrone boundary $r_{t_{max}}$ (e.g. 20 minutes isochrone). Weighted summary statistics $f$ to describe the greenspace (e.g. mean or minimum NDVI) are thus described as (3)

$$ \sum_t G_t \, f(NDVI \cap I_t) $$

For SES variables the proportional weights of the census areas within the isochrone are further defined as (4)

$$ A_{tj} = \frac{A(C_j \cap I_t)}{A(I_t)} $$

with the proportion of the area of the intersection of the census area $C_j$ and the isochrone $I_t$, and the area of the isochrone $I_t$. The weighted value of the SES variable $x_i$ in the census area $j$ is then defined as (5)

$$ \sum_t \left( G_t \sum_j x_{ij} A_{tj} \right) $$
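The distance weights can be sketched in base R as follows. This is a minimal illustration, not the package implementation. It assumes the logistic form 1 / (1 + exp(b * (r - m))) with b = 8 and m = 0.6 (the values passed to the weighting functions in this README), and it assumes that radii are proportional to walking time and rescaled to the interval [0, 1]. The per-isochrone NDVI values are made up.

```r
# Logistic distance decay with the parameters used in this README (b = 8, m = 0.6)
decay <- function(r, b = 8, m = 0.6) 1 / (1 + exp(b * (r - m)))

# Assumption: outer radii proportional to walking time, rescaled to [0, 1]
outer_r <- seq(2, 20, 2) / 20
inner_r <- c(0, head(outer_r, -1))

# Integral of the decay function over each ring, normalised by the integral over [0, 1]
ring_integral <- mapply(function(lo, hi) integrate(decay, lo, hi)$value,
                        inner_r, outer_r)
weights <- ring_integral / integrate(decay, 0, 1)$value

# Hypothetical per-isochrone summary statistic (e.g. median NDVI)
ndvi_stat <- c(0.62, 0.60, 0.58, 0.61, 0.55, 0.57, 0.52, 0.50, 0.49, 0.47)

# Distance-weighted statistic: sum of weight times statistic over all isochrones
weighted_ndvi <- sum(weights * ndvi_stat)

round(weights, 3)
weighted_ndvi
```

Because the weights sum to one, the weighted statistic stays within the range of the per-isochrone values, with the inner isochrones contributing most.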
Figure 2 visualizes the different submodels used for distance-weighting SES and greenspace. Fig. 2a shows the unweighted values of an SES variable; Fig. 2b has been calculated using (5) and thus represents the proportional weights of all intersections between the census areas and the isochrones. Greenspace is weighted as shown in Fig. 2c using (3).
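The areal weighting of SES variables can be illustrated with a small base R sketch. All numbers are hypothetical, and the sketch assumes that the census areas fully cover each isochrone, so that the isochrone area equals the sum of its intersection areas. It is not the code used by census_weighting.

```r
# Hypothetical intersection areas (m^2) between 3 isochrones (rows)
# and 4 census areas (columns)
inter_area <- matrix(c(800, 200,   0,   0,
                       900, 700, 400,   0,
                       900, 900, 800, 400),
                     nrow = 3, byrow = TRUE)

# Proportional weights: intersection area divided by the area of the isochrone.
# Assumption: census areas fully cover each isochrone, so its area is the row sum.
area_w <- inter_area / rowSums(inter_area)

# Hypothetical values of one SES variable in the four census areas
x <- c(284, 848, 918, 101)

# Hypothetical distance weights for the three isochrones (sum to 1)
dist_w <- c(0.5, 0.3, 0.2)

# Areal-weighted value per isochrone, then distance-weighted sum over isochrones
per_isochrone <- as.vector(area_w %*% x)
ses_weighted <- sum(dist_w * per_isochrone)
ses_weighted
```

Census areas that overlap an isochrone only slightly contribute little to its value, and outer isochrones contribute less than inner ones.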
The distance-weighting for the LANDSAT-derived NDVI raster (greenspace
exposure) is handled using LS_band_weighting, and SES distance- and
areal-weighting using census_weighting.
# Calculate sd, median, 5th percentile, 95th percentile and skew of NDVI values
NDVI_weighted <-
DRIGLUCoSE::LS_band_weighting(isochrones = erlangen.isochrones, tag = "tag",
landsat_list = dir("docs/LS_PreProcessed",
pattern = "\\.grd$",
full.names = TRUE) %>%
lapply(raster::brick),
stats = list("sd", "median",
list("percentile", 0.05),
list("percentile", 0.95),
"skew"),
b = 8, m = 0.6, cores = 2)
NDVI_weighted
## # A tibble: 2 x 6
## tag sd median X5_percentile X95_percentile skew
## <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 1 0.205 0.615 0.281 0.914 -0.145
## 2 2 0.104 0.540 0.360 0.714 -0.0122
census_weighted <- DRIGLUCoSE::census_weighting(isochrones = erlangen.isochrones,
tag = "tag", census = census,
b = 8, m = 0.6, cores = 2)
census_weighted
## # A tibble: 2 x 4
## tag census_var_a census_var_b census_var_c
## <dbl> <dbl> <dbl> <dbl>
## 1 1 562. 5323. 130970.
## 2 2 547. 5419. 124610.
DRI-GLUCoSE scores for Vancouver (top) and Hamilton (bottom), ranging from low risk (purple) to high risk areas (orange).
To analyse the effect of socioeconomic status (SES) and greenspace (GS), we further built multivariable models using the semi-adjusted model with BMI as the obesity measurement and tested different combinations for the index variable.
Table A.2: Model performance and odds ratios of the logistic models, comparing combinations of socioeconomic status (SES) and greenspace (GS) as index.
Metric                     SES + GS                       SES                            GS
Probability Threshold *    0.47                           0.50                           0.48
Accuracy                   0.75                           0.72                           0.74
Sensitivity                0.76                           0.72                           0.75
Specificity                0.65                           0.68                           0.68
Youden index               0.41                           0.40                           0.43
OR (95% CI, p-value)       0.46 (0.35-0.61, p < 0.001)    0.57 (0.42-0.76, p < 0.001)    0.42 (0.31-0.57, p < 0.001)
* Probability threshold used for predicting Diabetes. Values equal or greater than this threshold are mapped as “No”.
Table A.3: Model Performance for all multivariable models.
Brinkmann, Sebastian Tobias (Package creator and author) e-mail: sebastian.brinkmann@fau.de
Große, Tim (Contributor)
Walker, Blake Byron (1*); Brinkmann, Sebastian Tobias (1); Große, Tim (1); Kremer, Dominik (1); Schuurman, Nadine (2); Hystad, Perry (3); Rangarajan, Sumathy (4); Teo, Koon (4); Yusuf, Salim (4); Lear, Scott A. (5)
1: Community Health Environments and Social Terrains (CHEST) Lab, Institut für Geographie, Friedrich-Alexander-Universität Erlangen-Nürnberg, Wetterkreuz 15, 91052 Erlangen, Germany
*corresponding author
2: Department of Geography, Simon Fraser University, Burnaby, Canada
3: Spatial Health Lab, College of Public Health and Human Sciences, Oregon State University, Corvallis, USA
4: Population Health Research Institute, McMaster University, Hamilton, Canada
5: Faculty of Health Sciences, Simon Fraser University, Burnaby, Canada
citation("DRIGLUCoSE")
##
## To cite DRIGLUCoSE in publications use:
##
## Walker, B.B., Brinkmann, S.T., Große, T. et al. Neighborhood
## Greenspace and Socioeconomic Risk are Associated with Diabetes Risk
## at the Sub-neighborhood Scale: Results from the Prospective Urban and
## Rural Epidemiology (PURE) Study. J Urban Health 99, 506–518 (2022).
## https://doi.org/10.1007/s11524-022-00630-w
##
## A BibTeX entry for LaTeX users is
##
## @Article{,
## title = {Neighborhood Greenspace and Socioeconomic Risk are Associated with Diabetes Risk at the Sub-neighborhood Scale: Results from the Prospective Urban and Rural Epidemiology (PURE) Study},
## author = {Blake Byron Walker and Sebastian T. Brinkmann and T. Große et al.},
## journal = {J Urban Health},
## year = {2022},
## volume = {99},
## pages = {506–518},
## url = {https://doi.org/10.1007/s11524-022-00630-w},
## }