knitr::opts_chunk$set(
  echo = FALSE,
  cache = TRUE,
  warning = FALSE,
  comment = '', 
  out.width = "1\\linewidth")
if (!require("AccessPack", character.only = TRUE)) {
      remotes::install_github("soukhova/AccessPack",
                        build_vignettes = TRUE)
  }
library(AccessPack)
library(dplyr)
library(fitdistrplus)
library(ggplot2)
library(kableExtra)
library(patchwork)
library(sf)
library(scales)
library(stats)
library(ggspatial)
# library(ggpmisc)
# library(ggrepel)
# library(cowplot)
# library(spdep)
# library(RColorBrewer)
# library(extrafont)
# font_import()
# loadfonts(device = "win")

options(scipen = 999)

Introduction

This manuscript presents the open data product {AccessPack}. Open data products are the result of turning source data (open and/or not) into accessible information in a way that adds value to the original inputs [see @Arribas2021open]. The product presented in this paper is an R data package which consists of six objects and one function. The focus of this manuscript are data objects prepared from the 2016 Transportation Tomorrow Survey (TTS), namely person-to-jobs origin-destinations, as well as traffic analysis zone (TAZ) boundaries for the Greater Golden Horse area (GGH) located in southern Ontario, Canada [@data_management_group_tts_2018]. Data from the TTS are in principle available to the public but are not fully open, since permission to access the data retrieval system is required. In addition, the raw data are technically demanding and cumbersome to work with. By pre-processing these data, {AccessPack} offers a slice of the TTS data useful to understand patterns of commuting to work in the region. In addition, the package includes centroid-to-centroid travel times by car computed using package {r5r} [@Pereira2021r5r]. The aim of this paper is to walk readers through the empirical home-based work commute data set and illustrate the calculation of an impedance function.

Home-to-work commute data

{AccessPack} includes counts of fully-employed population by place of residence and counts of full-time jobs by place of work aggregated by TAZ (n=r round(length(AccessPack::ggh_taz$GTA06), 3) %>% prettyNum(big.mark = ",") within the survey boundaries). Unique identifiers use the GTA06 Zoning System of TTS. The number of jobs (r round(sum(AccessPack::ggh_taz$jobs), 3) %>% prettyNum(big.mark = ",")) and workers (r round(sum(AccessPack::ggh_taz$workers), 3) %>% prettyNum(big.mark = ",")) are organized in the form of an origin-destination table, indicative of patterns of home-to-work commute (there are r round(sum(AccessPack::od_ft_tt$trips), 3) %>% prettyNum(big.mark = ",") potential interactions). These data wwere retrieved from the Transportation Tomorrow Survey Data Retrieval System on October 28, 2021 and reflect the potential interaction of full-time employed people and jobs within the GGH survey boundaries shown in Figure \ref{fig:TTS-16-survey-area} as defined by the 2016 TTS methodology [@data_management_group_tts_2018].

Also included in {AccessPack} are travel times and cost of travel from origin to destination by car; travel times are calculated using the R package {r5r}. These travel times are useful to estimate the cost of travel and to calculate impedance functions, among other possible uses. It is important to note that for simplicity, all interactions within AccessPack are assumed to be taken by car, and the travel time is calculated from an origin TAZ centroid to a destination TAZ centroid. The centroid is snapped to the nearest street line by r5r and the travel time is calculated for all trips assuming a car travel mode and a department of 7:00am on Wednesday October 20 2021, a mid-week date selected by the authors. Additionally, only travel times less than or equal to 180 mins (3 hrs) are calculated; this threshold represents 99% of trip's travel times which are summarized in the descriptive statistics in Table \ref{tab:TTS-16-desc-stats}.

```rThe TTS 2016 study area within the Greater Golden Horseshoe in Ontario, Canada.", out.width="80%", fig.align='center'} knitr::include_graphics("images/Greater-Golden-Horseshoe-Map.png")

```r
#forming a complete descriptive statistic table

Statistics <- data.frame("Statistics" = c("Min.", "1st Qu.", "Median", "Mean", "3rd Qu.", "Max.", "NA's"))

Trips <- data.frame("OD_Trips" = c(summary(od_ft_tt$trips)[[1]] %>% round(), 
                                   summary(od_ft_tt$trips)[[2]] %>% round(),  
                                   summary(od_ft_tt$trips)[[3]] %>% round(), 
                                   summary(od_ft_tt$trips)[[4]] %>% round(), 
                                   summary(od_ft_tt$trips)[[5]] %>% round(),
                                   summary(od_ft_tt$trips)[[6]]%>% round(),
                                   NA))

Travel_time <- data.frame("OD_Travel_time" = c(summary(od_ft_tt$travel_time)[[1]] %>% round(), 
                                               summary(od_ft_tt$travel_time)[[2]] %>% round(),  
                                               summary(od_ft_tt$travel_time)[[3]] %>% round(), 
                                               summary(od_ft_tt$travel_time)[[4]] %>% round(), 
                                               summary(od_ft_tt$travel_time)[[5]] %>% round(), 
                                               summary(od_ft_tt$travel_time)[[6]] %>% round(),  
                                               3507)) 

TAZ_Area <- data.frame("TAZ_Area" = c(summary(ggh_taz$AREA)[[1]] %>% round(), 
                                      summary(ggh_taz$AREA)[[2]] %>% round(), 
                                      summary(ggh_taz$AREA)[[3]] %>% round(), 
                                      summary(ggh_taz$AREA)[[4]] %>% round(), 
                                      summary(ggh_taz$AREA)[[5]] %>% round(), 
                                      summary(ggh_taz$AREA)[[6]] %>% round(), 
                                      NA))

Workers <- data.frame("Workers" = c(summary(ggh_taz$workers)[[1]] %>% round(), 
                                    summary(ggh_taz$workers)[[2]] %>% round(), 
                                    summary(ggh_taz$workers)[[3]] %>% round(), 
                                    summary(ggh_taz$workers)[[4]] %>% round(), 
                                    summary(ggh_taz$workers)[[5]] %>% round(), 
                                    summary(ggh_taz$workers)[[6]] %>% round(), 
                                    NA))

Jobs <- data.frame("Jobs" = c(summary(ggh_taz$jobs)[[1]] %>% round(), 
                              summary(ggh_taz$jobs)[[2]] %>% round(), 
                              summary(ggh_taz$jobs)[[3]] %>% round(), 
                              summary(ggh_taz$jobs)[[4]] %>% round(), 
                              summary(ggh_taz$jobs)[[5]] %>% round(), 
                              summary(ggh_taz$jobs)[[6]] %>% round(), 
                              NA)) 

desc_stats <- cbind(Statistics, Trips, Travel_time, TAZ_Area, Workers, Jobs)

#kable tabling 
desc_stats %>%
  kable(format = "latex",
        align="lrrrrrr",
        booktabs = T,
        col.names = c(" ", "(#)", "(min)", "(km^2)", "(#)", "(#)"),
        caption = "\\label{tab:TTS-16-desc-stats}Descriptive statistics of the trips, workers, and jobs for the traffic analysis zones (TAZ) from the TTS 2016 dataset along with estimated car origin-destination travel times.") %>%
  add_header_above(c(" ", "Trips", "Car Travel Time", "Area", "Workers", "Jobs"), align = "r")%>%
  kable_styling(full_width = "T", 
                latex_options = c("scale_down"),
                position = "center")

Employed individuals and jobs

The origin-destination information consists of a cross-tabulation of people who are employed full-time by place of residence (origin) and places of employment in the GGH (destination) using the GTA06 zoning system. The number of workers and jobs is not equal; the boundaries of the survey are permeable, so workers who reside within the boundaries but travel outside of the boundaries are counted as workers within an origin TAZ, while jobs in TAZ that are filled by workers who reside outside the GGH boundaries are unknown since they were not surveyed. This mismatch results in the total number of workers being r round(sum(ggh_taz$workers)/sum(ggh_taz$jobs),2) times larger than the number of jobs (i.e., r sum(ggh_taz$workers)%>% prettyNum(big.mark = ",") workers to r sum(ggh_taz$jobs)%>% prettyNum(big.mark = ",") jobs). While the 2016 TTS survey boundaries are drawn to minimize the difference between opportunities at the destination and supply at the origins they are still not equal . That said, these data offer a perspective of the potential of home-based trips to places of employment and as such, the number of trips taken are equal to the number of workers in the GGH.

```rNumber of workers (top) and jobs (bottom) in each TAZ in the GGH area as specified by the 2016 TTS data set.", fig.width=8, fig.height=11} tts_workers <- ggplot() + geom_sf(data = ggh_taz, aes(fill= workers), color = NA) + scale_fill_distiller(palette = "Spectral", name = "Full-time employed people", limits = c(0, max(ggh_taz$workers)), na.value = "grey90")+ theme(legend.position = "right", axis.text = element_blank(), panel.grid = element_blank(), panel.background = element_rect(size = 1, color = "black", fill = NA)) + annotation_north_arrow(location = "tl", # north arrow for both the main plot and inset height = unit(0.8, "cm"), width = unit(0.8, "cm"), style = north_arrow_orienteering(line_width = 0.25, line_col = "dimgrey", fill = c("white","dimgrey"))) + annotation_scale(bar_cols = c("dimgrey", "white"), # scale bar for both the main plot and inset height = unit(0.15, "cm"))

tts_jobs <- ggplot() + geom_sf(data = ggh_taz, aes(fill= jobs), color = NA) + scale_fill_distiller(palette = "Spectral", trans="sqrt", name = "Full-time jobs \n(square root scale)", limits = c(0, max(ggh_taz$jobs)), na.value = "grey90")+ theme(legend.position = "right", axis.text = element_blank(), panel.grid = element_blank(), panel.background = element_rect(size = 1, color = "black", fill = NA))+ annotation_north_arrow(location = "tl", # north arrow for both the main plot and inset height = unit(0.8, "cm"), width = unit(0.8, "cm"), style = north_arrow_orienteering(line_width = 0.25, line_col = "dimgrey", fill = c("white","dimgrey"))) + annotation_scale(bar_cols = c("dimgrey", "white"), # scale bar for both the main plot and inset height = unit(0.15, "cm"))

tts_workers / tts_jobs

```rThe cumulative distribution of the number of jobs and workers per TAZ from the 2016 TTS data set. Light blue shaded ranges correspond to all cumulative proabilities where the number of workers per TAZ are larger than jobs per TAZ. "}

Number <- rbind(ggh_taz$jobs %>% data.frame(),
                 ggh_taz$workers %>% data.frame())
Group <- rbind(rep("Jobs", each=length(ggh_taz$jobs)) %>% data.frame(),
                rep("Workers", each=length(ggh_taz$workers))%>% data.frame())

ecdf_data <- cbind(Number, Group)
colnames(ecdf_data) <- c("Number", "Group")

rect1 <- data.frame(xmin=115, xmax=3055, ymin=-Inf, ymax=Inf)

ggplot(ecdf_data , aes(x=Number, col=Group)) + 
  # geom_segment(aes(x = 115, y = 0, xend = 115, yend = 0.34), col = "blue", linetype=2) +
  # geom_segment(aes(x = 3055, y = 0, xend = 3055, yend = 0.94), col = "blue", linetype=2) +
  geom_rect(data = rect1, aes(xmin=xmin, xmax=xmax, ymin=ymin, ymax=ymax), fill="lightblue", alpha=0.2, inherit.aes = FALSE) +
  stat_ecdf() + 
  scale_x_continuous(trans='sqrt', breaks=c(115, 3055, 10000, 20000, 30000, 40000)) + 
  scale_y_continuous(breaks=c(0, 0.25,0.34, .50, .75,0.94, 1.00)) +
  labs(x = "Opportunities per TAZ",
       y = "Cumulative probability") + 
      scale_color_manual("Opportunities", 
                         values = c("Jobs" = "Black",
                                    "Workers" = "Red"))+
  scale_linetype_manual("Opportunities",values=c(0,1)) +
  geom_point(aes(x = 115, y = 0.34), col = "blue")+
  geom_point(aes(x=3055, y=0.94), col = "blue") +
  theme_minimal() +
  theme(plot.title = element_text(hjust=0.5),
        legend.position = c(0.90,0.75),
        axis.line = element_line(colour = "black"),
        panel.border = element_blank()) 

Figure \ref{fig:tts-workers-jobs-plot} presents the number of workers and jobs per TAZ. It can be observed that the spatial distribution of jobs and workers is unequal, which is indicative of jobs-housing imbalance, which can impact accessibility in a region [@Levine1998rethinking]. Workers are concentrated in TAZ within the center of the GGH and along the south-east and northern boarder of the GGH. The center of the GGH corresponds to the Greater Toronto Area (GTA) which is the most densely populated area in southern Ontario [@statistics_canada_daily_2022]. The south east border of the GGH neighbours Lake Ontario and is delineated by the urban built boundary of the Ontario Growth Plan being home to the highest density of working population in the GGH [@ontario_built_2019;@auditor_general_of_ontario_value_2021]. The northern GGH border corresponds to the Simcoe, Dufferin, Kawartha Lake, and Peterborough regions which are home to lower density of worker population density population [@auditor_general_of_ontario_value_2021]. Conversely, the spread of jobs in the GGH is lower than the number of workers indicating population is more spatially dispersed than jobs.

It can also be seen that from the bottom plot in Figure \ref{fig:tts-workers-jobs-plot} that high to medium-low concentrations of jobs are often present in the same areas as workers but only when the scale is transformed. In other words, though there is a higher number of TAZ with no workers than zones with no jobs (i.e., r ggh_taz %>% st_drop_geometry() %>% count(workers) %>% filter(workers == 0) %>% pull('n') TAZ with no workers : r ggh_taz %>% st_drop_geometry() %>% count(jobs) %>% filter(jobs == 0) %>% pull('n') TAZ with no jobs) and the mean of workers per TAZ is higher than the mean of jobs (i.e., r round(mean(ggh_taz$workers, na.rm=T),0) workers : r round(mean(ggh_taz$jobs, na.rm=T),0) jobs) the number TAZ with an extreme number of jobs at the highest and lowest percentiles is significantly higher than the number of workers; see the following cumulative probability distribution in Figure \ref{fig:ECD-plot} in which the 94th to 100th percentile and the 0th to 34th percentile of jobs in TAZ is higher than the number of workers in TAZ. This means that between these ranges, TAZ have a higher number of workers than they do jobs, echoing the more even spatial distribution of workers observed in Figure \ref{fig:tts-workers-jobs-plot}.

\newpage

Calculated travel time

{AccessPack} also includes travel time data for each home-to-work trip as displayed in Figure \ref{fig:plot-tt-ttpertrip}. This travel time corresponds to a car commute calculated using the R package {r5r} and is interpreted as the travel time for a work commute for full-time employed people in the GGH. The travel times were calculated assuming the following input parameters: a 7:00 am departure on Wednesday October 20th 2021, a maximum travel time less than or equal to 180 mins (3 hrs), and a street network retrieved from OpenStreetMaps for the GGH area. The 7:00 am departure corresponds to a typical AM home-to-work departure and the 3 hr threshold was selected as it captures 99% of the trip taken (see the travel times summarized in Table \ref{tab:TTS-16-desc-stats}).

It is important to note that all travel times within this data set are calculated assuming car travel and one departure time for all origins. These assumptions are not completely realistic since we know a small proportion of trips are taken by non-car modes and travel time departures varies, however, it is not possible from the data retrieval system to obtain higher order tabulations that would allow us to split the data further. Thus we do not know which trips are made with non-car modes nor exact departure times in these tables. Though modal split and travel times can be estimated through other methods [e.g., @allen_suburbanization_2021; @higgins2021changes], for simplicity, we assume that all trips are taken by one-time departure car trip.

# remove all NA trips from dataset and set all 0min travel times to 0.1 min
od_ft_tt  <- AccessPack::od_ft_tt %>% 
  filter( !is.na(travel_time)) %>% 
  mutate(travel_time = ifelse(travel_time == 0, 0.1, travel_time))

all_tt <- od_ft_tt  %>% 
  dplyr::select(trips, travel_time)

all_tt <- all_tt[rep(seq_len(dim(all_tt)[1]), all_tt$trips), 2]
#fitting impedance function
gamma_ <- fitdistrplus::fitdist(data=all_tt, "gamma", method="mle", optim.method="Nelder-Mead") 
# transfer calibrated impedance function values to OD matrix
od_ft_tt <- od_ft_tt %>%
  mutate(f = dgamma(travel_time, gamma_$estimate["shape"], gamma_$estimate["rate"]))

#add the number of jobs and workers to the od_ft_tt matrix
od_ft <- od_ft_tt %>% merge(ggh_taz %>% dplyr::select(GTA06, workers) %>% st_drop_geometry(),
                   by.x = "Origin", by.y="GTA06", all.x = TRUE)

od_ft <- od_ft %>% merge(ggh_taz %>% dplyr::select(GTA06, jobs) %>% st_drop_geometry(),
                   by.x = "Destination", by.y="GTA06", all.x = TRUE)


#calculate accessibility for workers from any origin to jobs in Toronto 
GGH_c_accessibility <- od_ft %>% 
  mutate(GGH_A_ij = f * jobs) %>%
  group_by(Origin) %>%
  summarise(GGH_A_i = sum(GGH_A_ij, na.rm = T))

#Merge TO accessibly calculation to the ggh_taz:
GGH_taz_acc <- ggh_taz %>% merge(GGH_c_accessibility, by.x=c("GTA06"), by.y=c("Origin"), all.x=T) 
#calculate spatial availability
GGH_od_ft <- od_ft %>%
  mutate(catch = 1) %>%
  mutate(GGH_V_ij = sp_avail(., 
                         o_id = Origin,
                         d_id = Destination,
                         pop = workers,
                         opp = jobs,
                         r = catch,
                         f = f))

#verify that the sum of all jobs is consistent with the number of jobs
sum(GGH_od_ft$GGH_V_ij, na.rm=T)
sum_jobs <- GGH_od_ft %>% group_by(Destination) %>% summarise(jobs = mean(jobs))
sum(sum_jobs$jobs, na.rm = T)

#aggregating spatial availability  
GGH_availability <- GGH_od_ft %>%
  group_by(Origin) %>%
  summarize(GGH_V_i = sum(GGH_V_ij),
            GGH_sum_tt_i = sum(travel_time),
            GGH_tt_trips_i = mean(travel_time),
            GGH_sum_f_i = sum(f),
            GGH_f_trips_i = mean(f))

#Merge TO availability calculation to the TAZ sf object created for accessibility above:
GGH_taz_acc <- GGH_taz_acc %>% merge(GGH_availability, by.x=c("GTA06"), by.y=c("Origin"), all.x=T) 

```rEstimated total travel time (top) and average travel time per worker (bottom) for TAZ in the GGH.", fig.width=8, fig.height=11, message=FALSE} tts_total_tt <- ggplot() + geom_sf(data = GGH_taz_acc, aes(fill= GGH_sum_tt_i), color = NA) + #data scale_fill_distiller(palette = "Spectral", #legend scale bar name = "Total travel time \n(min)", na.value = "grey90") + theme(legend.position = "right", axis.text = element_blank(), panel.grid = element_blank(), panel.background = element_rect(size = 1, color = "black", fill = NA))+ annotation_north_arrow(location = "tl", # north arrow for both the main plot and inset height = unit(0.8, "cm"), width = unit(0.8, "cm"), style = north_arrow_orienteering(line_width = 0.25, line_col = "dimgrey", fill = c("white","dimgrey"))) + annotation_scale(bar_cols = c("dimgrey", "white"), # scale bar for both the main plot and inset height = unit(0.15, "cm"))

tts_tt_per_trip <- ggplot() + geom_sf(data = GGH_taz_acc, aes(fill= GGH_tt_trips_i), color = NA) + #data scale_fill_distiller(palette = "Spectral", #legend scale bar name = "Average travel time \n(min)", na.value = "grey90") + theme(legend.position = "right", axis.text = element_blank(), panel.grid = element_blank(), panel.background = element_rect(size = 1, color = "black", fill = NA))+ annotation_north_arrow(location = "tl", # north arrow for both the main plot and inset height = unit(0.8, "cm"), width = unit(0.8, "cm"), style = north_arrow_orienteering(line_width = 0.25, line_col = "dimgrey", fill = c("white","dimgrey"))) + annotation_scale(bar_cols = c("dimgrey", "white"), # scale bar for both the main plot and inset height = unit(0.15, "cm"))

tts_total_tt / tts_tt_per_trip

\newpage

As can be observed in Figure \ref{fig:plot-tt-ttpertrip}, the total travel time (min) resembles the spatial trend distribution in the number of employed people in the previous plot (Figure {fig:tts-workers-jobs-plot}). However, when the average travel time per trip in each TAZ is presented, the spatial distribution is distinct from all other plots presented so far. We can see that in areas around the south-eastern border that make up the Greater Toronto and Hamilton Area (GTHA) (e.g., Hamilton, Halton, Peel, Toronto, York, Durham) and Niagara and Waterloo, the average travel times are moderately low. Further from these areas, travel times are higher. Interestingly, even in eastern areas (e.g., Peterborough) with high employment and high job concentration, average travel time is higher than within the GTHA.

## Calibrating an impedance function

Impedance functions are useful to understand mobility behavior and are part, for instance, of gravity models of spatial interaction [] <!--- References ---> and accessibility analysis [] <!--- References ---> . An origin destination matrix and a cost matrix (i.e., with travel times) can be used in combination to estimate impedance functions. An impedance function $f(\cdot)$ depends on the cost of travel between locations $i$ and $j$ $c_{ij}$; it usually is a monotonically decreasing function, although sometimes the function can increase to reflect patterns of separation between activities; for instance, separation of land uses means that very short commuting trips are relatively rare. There can be sometimes also fluctuations due to hierarchical patterns, where travelers bypass opportunities in favor of more distant destinations that offer economies of agglomeration.

A useful technique to callibrate an impedance function is to use the trip length distribution (TLD) as measured from origin-destination data [@horbachov_theoretical_2018; @batista_estimation_2019]. The TLD is the representation of the likelihood that a proportion of trips are taken at a specific travel cost. In our data set, where we assume cost is travel time, the impedance function maps low travel times to higher proportions of trips, and high travel times are mapped to low proportion of trips.

In the GGH data presented, the empirical TLD (i.e., proportion of trips taken vs. travel time in minutes) is fitted to a density distribution using maximum likelihood techniques and the Nelder-Mead method for direct optimization available within the `fitdistrplus` package [@fitdistrplus_2015]. Based on goodness-of-fit criteria and diagnostics seen in Figure \ref{fig:TLD-Gamma-plot}, the gamma distribution is selected for the presented data (also see Figure \ref{fig:plot-cullen-frey} in the Appendix). 

```r
# For some reason plot(gamma_) does not play well with knitr, so instead we save the figure and then include it as a graphic in the following chunk
png("images/impedance_function.png")
plot(gamma_)
dev.off()

```rEmpirical TTS 2016 home-based car trip length distribution (black) and calibrated gamma distribution impedance function (red) with associated Q-Q and P-P plots"} knitr::include_graphics("images/impedance_function.png")

The resulting calibrated impedance function is given in the following general form where the estimated 'shape' is $\alpha$ = `r round(gamma_$estimate[1], 3)`, the estimated 'rate' is $\beta$ = `r round(gamma_$estimate[2], 3)` , and $\Gamma(\alpha)$ is defined in Equation (\ref{gamma-dist}).

\begin{equation}
\label{gamma-dist}
\begin{array}{l}\ 
f(x, \alpha, \beta) = \frac {x^{\alpha-1}e^{-\frac{x}{\beta}}}{ \beta^{\alpha}\Gamma(\alpha)} \quad \text{for } 0 \leq x \leq \infty\\
\Gamma(\alpha) =  \int_{0}^{\infty} x^{\alpha-1}e^{-x} \,dx\\
\end{array}
\end{equation}

```r
# remove all NA trips from dataset and set all 0min travel times to 0.1min
od_ft_tt  <- od_ft_tt %>% filter( !is.na(travel_time)) %>% mutate(travel_time = ifelse(travel_time == 0, 0.1, travel_time))
all_tt <- od_ft_tt  %>% dplyr::select(trips, travel_time)
all_tt <- all_tt[rep(seq_len(dim(all_tt)[1]), all_tt$trips), 2]

```rCullen and frey graphy for the 2016 TTS calculated travel times."} fitdistrplus::descdist(data=all_tt)

# Accessibility to employment

As noted above, impedance functions are an essential component of accessibility analysis. Equation (\ref{eq:conventional-accessibility}) shows that accessibility $A_i$ is the weighted sum of opportunities that can be reached from location $i$, given the cost of travel $c_{ij}$. Summing the opportunities in the neighborhood of $i$ as defined by the impedance function $f(\cdot)$, provides estimates of the number of opportunities that can be reached from $i$ at a certain cost. 

\begin{equation}
\label{eq:conventional-accessibility}
A_i = \sum_{j=1}^JO_jf(c_{ij})
\end{equation}

\noindent where:

-   $A$ is accessibility. 
-   $i$ is a set of origin locations.
-   $j$ is a set of destination locations.
-   $O_j$ is the number of opportunities at location $j$. These are opportunities for activity and add some sort of *supply* to the area;
-   $c_{ij}$ is a measure of the cost of moving between $i$ and $j$
-   $f(\cdot)$ is an impedance function of $c_{ij}$.

Accessibility is a property of the origin (i.e., origin TAZ in our data), the landscape of opportunities (i.e., the TAZ destinations in our data), and transportation infrastructure (i.e., the travel time table). Figure \ref{fig:plot-access-SA-GGH-TTS} presents the accessibility estimates that result from the impedance function calibrated in the preceding section. The plot follows a distinct radial trend where the majority of TAZ in Toronto have high accessibility values and values gradually decrease in TAZ which are further from Toronto's boundary (show in grey outline).

```rCalculated accessibility (top) and spatial availability (bottom) of employment in the GGH area", fig.width=8, fig.height=11, message=FALSE}
## accessibility

#creating the main plot
mplot_access_TTS_GGH <- ggplot() +
  geom_sf(data = GGH_taz_acc, aes(fill= GGH_A_i), color = NA) + #data
    scale_fill_distiller(palette = "Spectral", #legend scale bar
                         name = "Accessibility \n(A_i)",
                         na.value = "grey90") +
  geom_sf(data = toronto_muni_bound, # border for Toronto
          colour=alpha("dimgrey",1), 
          size = 0.5, fill=NA, 
          show.legend = "polygon") + 
  theme(legend.position = "right", axis.text = element_blank(),
        panel.grid = element_blank(),
        panel.background = element_rect(size = 1, 
                                        color = "black", 
                                        fill = NA))+
  annotation_north_arrow(location = "tl", # north arrow for both the main plot and inset
                         height = unit(0.8, "cm"), 
                         width = unit(0.8, "cm"),
                         style = north_arrow_orienteering(line_width = 0.25,
                                                          line_col = "dimgrey", 
                                                          fill = c("white","dimgrey"))) +
  annotation_scale(bar_cols = c("dimgrey", "white"), # scale bar for both the main plot and inset
                   height = unit(0.15, "cm"))

## spatial availability 

mplot_SA_TTS_GGH <- ggplot() +
  geom_sf(data = GGH_taz_acc, aes(fill= GGH_V_i), color = NA) + #data
    scale_fill_distiller(palette = "Spectral", #legend scale bar
                         name = "Spatially Availability \n(V_i)",
                         na.value = "grey90") + 
  geom_sf(data = toronto_muni_bound, # border for Toronto
          colour=alpha("dimgrey",1), 
          size = 0.5, fill=NA, 
          show.legend = "polygon") + 
  theme(legend.position = "right", axis.text = element_blank(),
        panel.grid = element_blank(),
        panel.background = element_rect(size = 1, 
                                        color = "black", 
                                        fill = NA))+
  annotation_north_arrow(location = "tl", # north arrow for both the main plot and inset
                         height = unit(0.8, "cm"), 
                         width = unit(0.8, "cm"),
                         style = north_arrow_orienteering(line_width = 0.25,
                                                          line_col = "dimgrey", 
                                                          fill = c("white","dimgrey"))) +
  annotation_scale(bar_cols = c("dimgrey", "white"), # scale bar for both the main plot and inset
                   height = unit(0.15, "cm"))

mplot_access_TTS_GGH / mplot_SA_TTS_GGH

\newpage

benchmark_GGH_V_i_workers <- GGH_taz_acc %>% st_drop_geometry() %>% summarise(avg_VO = sum(GGH_V_i, na.rm = TRUE)/sum(workers, na.rm = TRUE)) %>% as.numeric()

Alternatively, Figure \ref{fig:plot-avail-GGH-TTS-per-worker} presents spatial availability normalized by worker in which pinkish-red is above average job access, blueish is below average job access, and white is average job access (i.e., r round(benchmark_GGH_V_i_workers,2) jobs per worker). The trends seen in the spatial availability plot in the previous plot (Figure \ref{fig:plot-access-SA-GGH-TTS}) are identical, however, normalizing the access value by simply dividing the spatial availability value for each TAZ by the worker population in that TAZ results in a meaningful job access landscape. It should be noted, that since opportunities are proportionally allocated to each origin, the total sum of job access is equal to the total sum of jobs. So, the average spatial availability of r round(benchmark_GGH_V_i_workers,2) jobs per worker means each full-time worker in a TAZ with average spatial availability have access to r round(benchmark_GGH_V_i_workers,2) jobs.

```rCalculated spatial availability of employment, per worker, from origins to destinations in the GGH.", fig.width=7, message = FALSE}

mplot_SApW_TTS_GGH <- ggplot() + geom_sf(data = GGH_taz_acc, aes(fill= GGH_V_i/workers), color = NA) + #data scale_fill_gradient2(low = "deepskyblue4", mid = "ghostwhite", high = "red", #legend scale bar name = "Spatially Availability \n per Worker (v_i)", limits = c(0, max(GGH_taz_acc$GGH_V_i/GGH_taz_acc$workers)), midpoint= benchmark_GGH_V_i_workers, #average V_i per capita na.value = "grey90") + geom_sf(data = toronto_muni_bound, # border for Toronto colour=alpha("dimgrey",1), size = 0.5, fill=NA, show.legend = "polygon") + theme(legend.position = "right", axis.text = element_blank(), panel.grid = element_blank(), panel.background = element_rect(size = 1, color = "black", fill = NA))+ annotation_north_arrow(location = "tl", # north arrow for both the main plot and inset height = unit(0.8, "cm"), width = unit(0.8, "cm"), style = north_arrow_orienteering(line_width = 0.25, line_col = "dimgrey", fill = c("white","dimgrey"))) + annotation_scale(bar_cols = c("dimgrey", "white"), # scale bar for both the main plot and inset height = unit(0.15, "cm"))

mplot_SApW_TTS_GGH ```

\newpage

Concluding remarks

The open data product introduced in this paper shares tables for population-to-employment data from the 2016 TTS aggregated by TAZ. In addition, inter-centroid travel time tables complement these data. One possible use of these data is to calibrate impedance functions which in turn can be used for accessibility analysis

New digital formats are increasingly complex and the explanation of the methods often do not concisely and intelligibly fit within the confines of a traditional article. With this motivation, we invite all who are interested to use AccessPack to explore population-employment patterns in the largest metropolitan region of Canada. In the spirit of novel and original research, we hope readers value the efforts made to detail the data in greater depth in order to improve transparency in our work and encourage others to replicate and, hopefully, inspire research of their own.

\newpage

References {#references .unnumbered}



soukhova/AccessPack documentation built on March 20, 2022, 6:23 p.m.