knitr::opts_chunk$set( collapse = TRUE, comment = "#>", fig.width = 7, fig.height = 6, out.width = "100%", #dpi = 72, fig.retina = 1, eval = TRUE )
The flownet package provides efficient tools for transportation modeling, specifically network processing/preparation, route enumeration, and traffic assignment tasks. The package implements the path-sized logit (PSL) model for traffic assignment, which accounts for route overlap when assigning traffic flows to network edges. It can handle directed or undirected graphs, and it's network processing functions/algorithms support multimodal networks. A key design decision was to represent network graphs using simple data frames with from and to node columns. This facilitates applied work, e.g., further manual network edits such as adding multimodal connector links, and the development of generalized cost functions using network features.
This vignette, while aiming to comprehensively cover the package, does not really do justice to it as CRAN noes not permit long-running computations (often required with the PSL) or complex visual outputs (network plots >= 1Mb in size) inside package vignettes. Therefore, I have also written a comprehensive release blog post covering PSL modelling on the African and US road networks. This post should also be consulted and treated as complementary to this vignette.
Let's start by loading the required packages and exploring the example datasets included with flownet:
library(fastverse) fastverse_extend(flownet, sf, tmap) tmap_mode("plot")
The package includes four example datasets for Africa:
africa_network: A road transport network with 2,825 LINESTRING features representing existing roads (2,344 edges) and proposed new links (481 edges). Each edge includes attributes such as distance, travel duration, border crossing costs, terrain ruggedness, and road upgrade costs.
africa_cities_ports: 453 African cities with population > 100,000 and international ports. Includes population data, capital status, and port cargo outflows.
africa_segments: 14,358 raw network segments representing intersected road routes. Useful for demonstrating network consolidation and simplification functions.
africa_trade: Bilateral trade flows between 47 African countries aggregated by HS section (21 product categories). Values represent annual averages over 2012-2022.
The africa_network, africa_cities_ports, and africa_segments datasets are from Krantz, S. (2024). Optimal Investments in Africa's Road Network. Policy Research Working Paper 10893. World Bank. Replication materials are available at github.com/SebKrantz/OptimalAfricanRoads.
Let's examine the data:
# View network structure (existing links only) africa_net <- fsubset(africa_network, !add, -add) str(africa_net, max.level = 1) # View cities/ports head(fselect(africa_cities_ports, city, country, population)) # View trade data structure head(africa_trade)
This section demonstrates a complete workflow from network preparation to traffic assignment and visualization. We'll use a gravity-based OD matrix derived from city populations.
First, let's visualize the network to understand its structure:
# Plot network colored by travel speed tm_basemap("CartoDB.Positron", zoom = 4) + tm_shape(africa_net) + tm_lines(col = "speed_kmh", col.scale = tm_scale_continuous(values = "turbo", values.range = c(0.1, 0.9)), col.legend = tm_legend("Speed (km/h)", position = c("left", "bottom"), frame = FALSE, text.size = 0.8, title.size = 1, item.height = 2.5), lwd = 1.5) + tm_layout(frame = FALSE)
The linestrings_to_graph() function converts LINESTRING geometries to a graph data frame format required for traffic assignment. In this case, the relevant columns are already provided, thus we only need to strip the geometry column and turn it into a normal data frame.
# Convert to graph graph <- st_drop_geometry(africa_net) head(graph)
The resulting graph data frame contains:
edge: Edge identifier (optional)from, to: Node IDsFX, FY, TX, TY: Node coordinates (optional)Next, we need to map the city/port locations to the nearest network nodes:
# Extract nodes with spatial coordinates nodes <- nodes_from_graph(graph, sf = TRUE) # Map cities/ports to nearest nodes nearest_nodes <- nodes$node[st_nearest_feature(africa_cities_ports, nodes)]
For this example, we'll create a simple gravity-based OD matrix from city populations. The melt_od_matrix() function converts the matrix to long format required by run_assignment():
# Create gravity-based OD matrix (population product scaled down) od_mat <- outer(africa_cities_ports$population, africa_cities_ports$population) / 1e12 dimnames(od_mat) <- list(nearest_nodes, nearest_nodes) # Convert to long format od_matrix_long <- melt_od_matrix(od_mat) head(od_matrix_long)
The resulting data frame contains columns from, to, and flow, with only positive, finite flow values.
Now we can run the traffic assignment. The run_assignment() function:
For large networks like this one, we'll use the All-or-Nothing (AoN) method which is faster:
# Run Traffic Assignment (All-or-Nothing method for speed) result <- run_assignment(graph, od_matrix_long, cost.column = "duration", method = "AoN", return.extra = "all") print(result)
Key parameters:
cost.column: The cost column to use for route computation (here "duration" in minutes)method: "PSL" (path-sized logit) or "AoN" (all-or-nothing)return.extra: Additional results to return ("all" returns everything)Finally, we can visualize the assigned flows on the network:
# Add flows to network for visualization africa_net$final_flows_log10 <- log10(result$final_flows + 1) tm_basemap("CartoDB.Positron", zoom = 4) + tm_shape(africa_net) + tm_lines(col = "final_flows_log10", col.scale = tm_scale_continuous(values = "brewer.yl_or_rd"), col.legend = tm_legend("Log10 Flows", position = c("left", "bottom"), frame = FALSE, text.size = 0.8, title.size = 1), lwd = 1.5) + tm_shape(africa_cities_ports) + tm_dots(size = 0.15, fill = "grey30") + tm_layout(frame = FALSE)
For more realistic modeling, we can disaggregate country-level trade flows to city-level using population shares. This approach distributes trade between countries across their respective cities proportionally to population.
# Compute each city's share of its country's population city_pop <- st_drop_geometry(africa_cities_ports) |> fcompute(node = nearest_nodes, city = qF(city_country), pop_share = fsum(population, iso3, TRA = "/"), keep = "iso3") head(city_pop)
# Aggregate trade to country-country level (sum across HS sections) trade_agg <- africa_trade |> collap(quantity ~ iso3_o + iso3_d, fsum) # Join with city population shares for origin and destination # add_stub adds suffix to all columns, so iso3 -> iso3_o matches trade_agg$iso3_o od_matrix_trade <- trade_agg |> join(city_pop |> add_stub("_o", FALSE), multiple = TRUE) |> join(city_pop |> add_stub("_d", FALSE), multiple = TRUE) |> fmutate(flow = quantity * pop_share_o * pop_share_d) |> frename(from = node_o, to = node_d) |> fsubset(flow > 0 & from != to) head(od_matrix_trade)
# Run Traffic Assignment with trade-based OD matrix result_trade <- run_assignment(graph, od_matrix_trade, cost.column = "duration", method = "AoN", return.extra = "all") print(result_trade)
# Add flows to network for visualization africa_net$final_flows_log10 <- log10(result_trade$final_flows + 1) tm_basemap("CartoDB.Positron", zoom = 4) + tm_shape(africa_net) + tm_lines(col = "final_flows_log10", col.scale = tm_scale_continuous(values = "brewer.yl_or_rd"), col.legend = tm_legend("Log10 Trade Flows", position = c("left", "bottom"), frame = FALSE, text.size = 0.8, title.size = 1), lwd = 1.5) + tm_shape(africa_cities_ports) + tm_dots(size = 0.15, fill = "grey30") + tm_layout(frame = FALSE)
For large or complex networks, it's often beneficial to consolidate the graph by removing intermediate nodes and merging edges. This simplifies the network topology while preserving connectivity, which can significantly improve computational performance.
The africa_segments dataset provides raw network segments that can be processed through the full consolidation workflow. It was obtained by computing fastest car routes between all of the 453 African (port-)cities that are less than 2000km apart using osrm, feeding them through stplanr::overline() and rmapshaper::ms_simplify(), and then just taking the start and end coordinate of each segment. The final network africa_network used above was created using these segments in a similar way to how I will demonstrate below (consolidation followed by clustering simplification and a final call to osrm to get simplified edge geometries for better visual appearance).
# Convert segments to sf and then to graph graph_seg <- africa_segments |> linestrings_from_graph() |> linestrings_to_graph() |> create_undirected_graph() # Get nodes and map cities nodes_seg <- nodes_from_graph(graph_seg, sf = TRUE) nearest_nodes_seg <- nodes_seg$node[st_nearest_feature(africa_cities_ports, nodes_seg)] cat("Original segments:", nrow(graph_seg), "\n")
The consolidate_graph() function recursively removes intermediate nodes (nodes that occur exactly twice) and merges connecting edges. At each step, network attributes are aggregated using collap(), which, by default, uses the mean for numeric and the statistical mode for categorical attributes (columns), and supports weights.
# Consolidate graph, preserving city nodes graph_cons <- consolidate_graph(graph_seg, keep = nearest_nodes_seg, w = ~ .length) cat("After consolidation:", nrow(graph_cons), "\n")
The use of keep here indicates node IDs to preserve (city/port closest nodes).
tm_basemap("CartoDB.Positron", zoom = 4) + tm_shape(linestrings_from_graph(graph_seg)) + tm_lines(col = "passes", col.scale = tm_scale_continuous(values = "turbo", values.range = c(0.1, 0.9)), col.legend = tm_legend("N. Passes", position = c("left", "bottom"), frame = FALSE, text.size = 0.8, title.size = 1), lwd = 1.5) + tm_layout(frame = FALSE) + tm_title(paste("Original:", nrow(graph_seg), "edges"))
The number of passes through a segment along different routes is generated by stplanr::overline(). If all routes were traversed only once, it would indicate traffic flow.
tm_basemap("CartoDB.Positron", zoom = 4) + tm_shape(linestrings_from_graph(graph_cons)) + tm_lines(col = "passes", col.scale = tm_scale_continuous(values = "turbo", values.range = c(0.1, 0.9)), col.legend = tm_legend("N. Passes", position = c("left", "bottom"), frame = FALSE, text.size = 0.8, title.size = 1), lwd = 1.5) + tm_layout(frame = FALSE) + tm_title(paste("Consolidated:", nrow(graph_cons), "edges"))
simplify_network()While consolidate_graph() simplifies the network topology preserving full connectivity, the simplify_network() function provides two methods for further reducing network complexity while preserving connectivity between important nodes:
leaderCluster algorithm and contracts the graphThe two functions are complimentary, after calling simplify_network(), consolidate_graph() could be called again to simplify the topology of the reduced network.
This method computes shortest paths between all specified nodes and retains only the traversed edges:
# Simplify network using shortest paths graph_simple <- simplify_network(graph_cons, nearest_nodes_seg, method = "shortest-paths", cost.column = ".length") cat("Consolidated edges:", nrow(graph_cons), "\n") cat("Simplified edges:", nrow(graph_simple), "\n") # Visualize tm_basemap("CartoDB.Positron", zoom = 4) + tm_shape(linestrings_from_graph(graph_simple)) + tm_lines(col = "passes", col.scale = tm_scale_continuous(values = "turbo", values.range = c(0.1, 0.9)), col.legend = tm_legend("N. Passes", position = c("left", "bottom"), frame = FALSE, text.size = 0.8, title.size = 1), lwd = 1.5) + tm_layout(frame = FALSE) + tm_title(paste("Simplified (SP):", nrow(graph_simple), "edges"))
The cluster method uses the leaderCluster algorithm to spatially group nearby nodes and contract the graph. This is useful for creating coarser network representations:
# Compute node weights for clustering (sum of gravity at each node) node_weights <- rowbind( fselect(graph_cons, node = from, gravity_rd), fselect(graph_cons, to, gravity_rd),use.names = FALSE) |> collap(~ node, fsum) # Cluster-based simplification graph_cluster <- simplify_network(graph_cons, nearest_nodes_seg, method = "cluster", cost.column = node_weights$gravity_rd, radius_km = list(nodes = 30, cluster = 27), w = ~ .length) cat("Clustered edges:", nrow(graph_cluster), "\n") tm_basemap("CartoDB.Positron", zoom = 4) + tm_shape(linestrings_from_graph(graph_cluster)) + tm_lines(col = "passes", col.scale = tm_scale_continuous(values = "turbo", values.range = c(0.1, 0.9)), col.legend = tm_legend("N. Passes", position = c("left", "bottom"), frame = FALSE, text.size = 0.8, title.size = 1), lwd = 1.5) + tm_layout(frame = FALSE) + tm_title(paste("Simplified (CL):", nrow(graph_cluster), "edges"))
Key cluster parameters:
radius_km$nodes: Radius (km) around preserved nodes to assign nearby nodesradius_km$cluster: Clustering radius (km) for remaining nodesLet's see if we can assign to this network:
dimnames(od_mat) <- list(nearest_nodes_seg, nearest_nodes_seg) od_matrix_long <- melt_od_matrix(od_mat) # Run Traffic Assignment with gravity-based OD matrix result_cl <- run_assignment(graph_cluster, od_matrix_long, cost.column = ".length", method = "AoN", return.extra = "all") print(result_cl) # Add flows to network for visualization graph_cluster$final_flows_log10 <- log10(result_cl$final_flows + 1) tm_basemap("CartoDB.Positron", zoom = 4) + tm_shape(linestrings_from_graph(graph_cluster)) + tm_lines(col = "final_flows_log10", col.scale = tm_scale_continuous(values = "brewer.yl_or_rd"), col.legend = tm_legend("Log10 Flows", position = c("left", "bottom"), frame = FALSE, text.size = 0.8, title.size = 1), lwd = 1.5) + tm_shape(africa_cities_ports) + tm_dots(size = 0.15, fill = "grey30") + tm_layout(frame = FALSE)
The above overview showcased essential package functions. The Path-Sized Logit (PSL) assignment model was not used as it takes a few minutes to process around 200k OD-pairs with the PSL, beholding that a selection of routes must be generated for each OD pair. PSL flow assignments are smoother/more distributed, and more realistic than the all-or-nothing method which assigns the entire flow to the shortest path without regard for alternative routes.
The remainder of this vignette provides a theoretical overview of the PSL model, followed by a summary of the key functions of this package.
The Path-Sized Logit (PSL) model (Ben-Akiva and Bierlaire, 1999)^[See further theoretical and practical references at the end.] is a discrete choice model used for traffic assignment that overcomes the independence of irrelevant alternatives (IIA) property of the standard Multinomial Logit (MNL) model. In transport networks, alternative routes often overlap significantly, sharing common edges. The MNL model treats these overlapping routes as independent, leading to an overestimation of flow on shared segments.
The PSL model introduces a correction term (path-size factor) to the utility function of each route, penalizing paths that share links with other alternatives.
The probability $P_k$ of choosing route $k$ from the set of alternative routes $K$ is given by:
$$P_k = \frac{e^{V_k}}{\sum_{j \in K} e^{V_j}}$$
where $V_k$ is the deterministic utility of route $k$, defined in flownet as:
$$V_k = -C_k + \beta_{PSL} \ln(PS_k)$$
Here:
- $C_k$ is the generalized cost of route $k$.
- $\beta_{PSL}$ is the path-sized logit parameter (controlled by the beta argument).
- $PS_k$ is the Path-Size factor for route $k$.
The Path-Size factor $PS_k$ quantifies the uniqueness of route $k$. It is defined as the ratio of the length (or cost) of the links in path $k$ to the number of alternatives sharing those links. The formula implemented in flownet is:
$$PS_k = \frac{1}{C_k} \sum_{a \in \Gamma_k} \frac{c_a}{\delta_a}$$
Where: - $\Gamma_k$ is the set of edges in path $k$. - $c_a$ is the cost of edge $a$. - $C_k = \sum_{a \in \Gamma_k} c_a$ is the total cost of path $k$. - $\delta_a$ is the number of alternative routes in the choice set $K$ that use edge $a$.
Interpretation: - If a path is completely unique ($\delta_a = 1$ for all edges), then $PS_k = 1$ and $\ln(PS_k) = 0$. The utility reduces to the standard MNL utility ($V_k = -C_k$). - If a path shares edges with many other routes ($\delta_a > 1$), then $PS_k < 1$ and $\ln(PS_k) < 0$. - The term $\beta_{PSL} \ln(PS_k)$ acts as a correction. Since $\ln(PS_k)$ is negative for overlapping routes, a positive $\beta_{PSL}$ (default 1) is required to ensure that overlapping paths have lower utility (penalization).
This formulation ensures that the model assigns lower probabilities to routes that are largely duplicates of others, distributing flow more realistically across truly distinct alternatives. In practice, a value of $\beta_{PSL} = 1$ is a sensible default; higher values strengthen the penalization of overlapping routes and lower values make the model behave more like a standard MNL.
linestrings_to_graph(): Converts LINESTRING geometries to a graph data frame format. This is typically the first step when working with spatial network data.
linestrings_from_graph(): Converts a graph data frame back to LINESTRING geometries. Useful for visualization and spatial operations.
create_undirected_graph(): Converts a directed graph to an undirected graph by normalizing edge directions and aggregating duplicate connections. Useful when transport links can be traversed in both directions.
consolidate_graph(): Simplifies network topology by removing intermediate nodes and merging connecting edges. This can significantly reduce computational complexity while preserving connectivity.
simplify_network(): Simplifies the network using one of two methods:
leaderCluster algorithm and contracts the graphnodes_from_graph(): Extracts unique nodes with coordinates from a graph. Essential for mapping zone centroids to network nodes.
normalize_graph(): Normalizes node IDs to consecutive integers starting from 1. This can improve performance with some graph algorithms.
distances_from_graph(): Computes a distance matrix for all node pairs using the graph structure.
melt_od_matrix(): Converts an origin-destination matrix to long format with columns from, to, and flow. This format is required by run_assignment().
run_assignment(): The core function for traffic assignment. Key parameters:
method: "PSL" (path-sized logit) or "AoN" (all-or-nothing). AoN is faster but doesn't account for route overlap.beta (default: 1): Path-sized logit parameter. A positive value penalizes overlapping routes.detour.max (default: 1.5): Maximum detour factor for alternative routes. Higher values consider more routes but increase computation time substantially.angle.max (default: 90): Maximum detour angle in degrees. Routes outside this angle from the straight line between origin and destination are not considered.cost.column: The cost column to use for route computation. Can be a column name or numeric vector.return.extra: Additional results to return, such as paths, edge counts, or distance matrices.detour.max: Start with the default (1.5) and increase if you need to consider more alternative routes. Be aware that higher values can dramatically increase computation time.angle.max: Use this to restrict route enumeration to more direct paths. Smaller values (e.g., 45°) can significantly speed up computation for large networks.beta: The path-sized logit parameter. Typically positive (default 1) to penalize route overlap; values larger than 1 increase the penalty on overlapping routes, and values between 0 and 1 make the model closer to a standard MNL.This vignette has demonstrated the basic and advanced workflows for using the flownet package:
The package provides efficient tools for transport modeling with the path-sized logit model, accounting for route overlap in traffic assignment. The network processing utilities allow you to prepare and simplify networks for efficient computation.
?flownet-packagebeta, detour.max, angle.max)africa_trade to explore trade patterns by different HS sectionssimplify_network() cluster method for coarser network representationsFor more information, see the package documentation and examples in the help files for individual functions.
Ben-Akiva, M., & Bierlaire, M. (1999). Discrete choice methods and their applications to short term travel decisions. In R. W. Hall (Ed.), Handbook of Transportation Science (pp. 5–33). Springer US. https://doi.org/10.1007/978-1-4615-5203-1_2
Cascetta, E. (2001). Transportation systems engineering: Theory and methods. Springer.
Ben-Akiva, M., & Lerman, S. R. (1985). Discrete choice analysis: Theory and application to travel demand. The MIT Press.
Ramming, M. S. (2002). Network knowledge and route choice (Doctoral dissertation). Massachusetts Institute of Technology.
Prato, C. G. (2009). Route choice modeling: Past, present and future research directions. Journal of Choice Modelling, 2(1), 65–100. https://doi.org/10.1016/S1755-5345(13)70005-8
AequilibiaE Python Documentation: https://www.aequilibrae.com/develop/python/route_choice/path_size_logit.html
Any scripts or data that you put into this service are public.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.