knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)

The simR package allows spatial interaction models (SIMs) to be fitted easily. In particular:

Getting started

Firstly - load the package. Because of the 'tidyverse' focus of the package, load this as well:

library(simR)
library(tidyverse)

This package essentially works with geographical data. Although one could use SIMs without drawing any maps, the author feels this is essentially lacking. For that reason, it is also worth loading the sf and tmap packages^[Install these packages if you haven't done already!].

library(sf)
library(tmap)

Sample data for the package relates the Local Authority areas in the East Midlands of the UK. Here is the simplefeatures table showing the areas:

data(eastmid_sf)
tm_shape(eastmid_sf) + tm_borders()

There is also travel-to-work data - basically these are counts of the number of people travelling from one Local Authority area to another (or to itself). This is organised so that each row corresponds to a single origin-destination pair:

data(eastmid_ttw)
eastmid_ttw

The columns are:

Now fit a model:

eastmid_ttw %>% doubly_constrained_sim(From,To,Dist,Count)

What's Going On?

At this stage, this might make little sense unless you are familiar withs SIMs. However, here are a few words of explanation. The doubly_constrained_sim function fits a doubly constrained SIM model - this predicts the interaction between origin and destination - here this is the number of people travelling to work from the origin to the destation, ensuring that the total number of people leaving the origin, and the total number arriving at the destination agree with the observed figures, taking into account a 'cost deterrence' effect, where more costly trips are less likely to happen, given these constraints.

The function works like dplyr type functions, so that the arguments can be specified as column names in a data frame, without having to quote them. It returns an object of class 'sim' (which inherits from the class 'glm').

The model is of the form

$$ \hat{T_{ij}} = k A_iB_j\exp\left(-\frac{ C_{ij}}{\gamma} \right) $$ Where $A_i$ and $B_j$ are 'balancing terms' to make the constraints apply and $\gamma$ is a cost deterrence parameter. 'cost' here is used broadly, and can generally represent any measure of effort needed to travel from zone $i$ to zone $j$. $\hat{T_{ij}}$ is the interaction between zones $i$ and $j$. The 'hat' on the $\hat{T_{ij}}$ denotes that this is the modelled interaction - the actual interaction is $T_{ij}$ (no hat). This is the Count column in the example data set eastmid_ttw. $\gamma$ here corresponds to the 'Cost Denominator Parameter' in the printout. Since units here are in Km this is just over 8 Km. When the deterrence function is exponential, as in the model here, a 'half life' cost can be calculated. Assuming all other characteristics of an origin and destimation are fixed, for every whole number multiple of this cost the amount of interaction is halved. Here this is a distance - and has a value of a little under 6km.

To understand the idea more it is worth reading these two articles by Adam Dennett - https://rpubs.com/adam_dennett/259068 and https://rpubs.com/adam_dennett/257231 - in many ways these are what prompted me to create the simR package.

Note that this is a 'toy' data set - ideally one would use more geographical detail, and a more realistic cost distance - such as travel time. Also, the cost when the origin and the destination are the same is generally not zero! However, the computing procedures are the same. In the fullness of time I'll replace these distances with something more convincing...

Using the SIM Model Object

In the first section, a doubly constrained SIM model was fitted, but nothing really done with it. To do things with it, the object should be stored:

eastmid_ttw %>% doubly_constrained_sim(From,To,Dist,Count) -> sim_model

Then it is possible to add the fitted $\hat{T_{ij}}$ values to the eastmids_ttw data frame.

eastmid_ttw %>% add_sim_fitted(sim_model) -> eastmid_ttw_fitted
eastmid_ttw_fitted

We can now check that the calibration really is constrained to make the total sums going in to the destinations in the fitted values agree with those in the observed counts:

eastmid_ttw_fitted %>% group_by(To) %>% 
  summarise(Count=sum(Count),Fitted=sum(Fitted))

It seems to! The same could be done for the origins.

Finally, make a map. Here we'll filter out the fitted trips to Derby and map them, and compare them to the observed trips.

eastmid_ttw_fitted %>% filter(To=='Derby') -> to_derby
to_derby

To map this, we need to join the table to_derby to the eastmid_sf simple feature table.

eastmid_sf %>% left_join(to_derby,by=c("name"="From")) -> to_derby_sf

So now mapping is possible...

legend_style <- tm_legend(legend.position=c('right','bottom'),
                          legend.text.size=0.4)
tm_shape(to_derby_sf) + tm_polygons(col='Fitted',style='fisher') + legend_style
tm_shape(to_derby_sf) + tm_polygons(col='Count',style='fisher') + legend_style


chrisbrunsdon/simR documentation built on May 5, 2019, 2:41 a.m.