# This chunk ensures that the reedtemplates package is installed and loaded
# This reedtemplates package includes the template files for the thesis and also
# two functions used for labeling and referencing
if(!require(devtools))
  install.packages("devtools", repos = "http://cran.rstudio.com")

if(!require(reedtemplates)){
  library(devtools)
  devtools::install_github("ismayc/reedtemplates")
  }
library(reedtemplates)

Unfinished Work: Incorporating Routes {#routes}

These models so far do not incorporate routes. Though our initial aim was to create models that use the routes, we were not able to transform the route to a state that was useful for modeling. I leave the models as they are, but here I explain some of my work toward the goal of incorporating routes and describe some potential modeling approaches. Throughout this work is the caveat that many of these results are hard to interpret without taking into account route. We hope future researchers will be able to accomplish this.

Data Structures for Routes

Before one can model routes, they need to be represented properly. Ride Report's approach has been to discretize the routes into road segments by map matching the GPS traces to road segments in the Open Street Map (OSM) data. Road segments are readily availible in the OSM data sets as well as in shapefiles published by cities. In particular, Civicapps.org, a public data portal for the Portland, OR area has a shapefile of the bicycle network in Portland, OR, with information about the type of roads, presence of bike lanes, and other useful information about the roads bikes have access to in Portland.

Road segments are not the only way to represent routes, however. One could also consider a route as a sequence of intersections or intersections and road segments. Though modeling intersections may be more difficult, it's likely they could be more interesting parts of the routes. In the bike accident data examined by Meyers, most of the accidents occurred at intersections^[@meyers2015], so it would make sense if these were often the most stressful and dangerous portion of riders' routes.

For now, however, the most readily available data are on road segments.

Regression Terms for Road Segments

How could we use our knowledge of riders' routes into our regression? The approach we present here will be to consider routes as sequences of discrete road segments, each of which have known properties. As mentioned before, there are data about roads that give information about bike lanes, road size, and other attributes of road segments.

Ideally we would like to estimate some parameter for each road segment that indicated its typical contribution to the probability of a negative ride rating. The most immediate hurdle is figuring out how to estimate all of those parameters, particularly when the number of rides per a particular road segment may be low.

Bayesian inference may be the best bet to get something for each road segment, but as a extremely simple example, we outline here a method that would be easy to fit, but likely not a very good model. Regardless, it gives a good idea of how the ideas of multilevel models could be adapted for use with road segments, despite the lack of a clear hierarchy. Assume we have $K$ total road segments in our road network and for each ride we have $\Omega_i \subseteq {1, \ldots, K}$, the set of road segments that are in the route of ride $i$. Let $l_k$ be the length of the $k$th segment and define the length of ride $i$ to be:

$$ L_i = \sum_{k \in \Omega_i} l_k.$$

For the $k$th road segment, define the $m$-dimensional vector $W_k = W_k^1, W_k^2, \ldots, W_k^m$ road segment-level predictors. Then we shall define the term in our regression for the route of ride $i$ as

$$ R_i = \frac{1}{L_i} \sum_{k \in \Omega_i} l_k W_k \beta^{\text{road}},$$

Where $\beta^{\text{road}}$ is a vector of coefficients for the road segment level predictors. When actually computing this value, we can factor out the $\beta^{\text{road}}$, and then the rest of $R_i$ is just a transformation of road-level variables.



wjones127/thesis documentation built on May 4, 2019, 7:34 a.m.