In this article we will look at the Warehouse Location Problem. Given a set of customers and set of locations to build warehoses the task is to decide where to build warehouses and from what warehouses goods should be shipped to which customer.

Thus there are two decisions that need to made at once: where and if to build warehouses and the assignment of customers to warehouses. This simple setting also implies that at least one warehouse must be built and that any warehouse is big enough to serve all customers.

As a practical example: you run the logistics for an NGO and want to regularly distribute goods to people in need. You identified a set of possible locations to set up your distribution hubs, but you are not sure where to build them. Then such a model might help. In practice however you might need to incorporate additional constraints into the model.

We start with a set of customers $C = {1 \ldots n}$ and a set of possible warehouses $W = {1 \ldots m}$ that could be built. In addition we have a cost function giving us the transportation cost from a warehouse to a customer. Furthermore there is a fixed cost associated with each warehouse if it will be built. This basic version of the warehouse location problem is adapted from the German Wikipedia page about the problem.

$$
\begin{equation*}
\begin{array}{[email protected]{}ll}
\text{min} & \displaystyle\sum\limits_{i=1}^{n}\sum\limits_{j=1}^{m}\operatorname{transportcost} {i,j} \cdot x{i, j} + \sum\limits_{j=1}^{m}\operatorname{fixedcost}{j} \cdot y{j}& &\
\text{subject to} & \displaystyle\sum\limits_{j=1}^{m} x_{i, j} = 1 & i=1 ,\ldots, n&\
& \displaystyle x_{i, j} \leq y_j, & i=1 ,\ldots, n & j=1 ,\ldots, m&\
& x_{i,j} \in {0,1} &i=1 ,\ldots, n, & j=1 ,\ldots, m \
& y_{j} \in {0,1} &j=1 ,\ldots, m&
\end{array}
\end{equation*}
$$

The first thing we need is the data. In this article we will simply generate artificial data.

We assume the **customers** are located in a grid with euclidian distances:

set.seed(1234) grid_size <- 1000 n <- 100 customer_locations <- data.frame( id = 1:n, x = round(runif(n) * grid_size), y = round(runif(n) * grid_size) )

The **warehouses** are also randomly placed on the grid. The fixed cost for the warehouses are randomly generated as well with mean cost of 10,000.

m <- 20 warehouse_locations <- data.frame( id = 1:m, x = round(runif(m) * grid_size), y = round(runif(m) * grid_size) ) fixedcost <- round(rnorm(m, mean = grid_size * 10, sd = grid_size * 5))

The fixed costs to set up a warehouse are the following:

fixedcost

Next step is to build a functions that takes a customer and a warehouse and returns the transport cost.

transportcost <- function(i, j) { customer <- customer_locations[i, ] warehouse <- warehouse_locations[j, ] round(sqrt((customer$x - warehouse$x)^2 + (customer$y - warehouse$y)^2)) } transportcost(1, 3)

Now let's plot everything. Black dots are customers and red dots are possible warehouse locations.

library(ggplot2) p <- ggplot(customer_locations, aes(x, y)) + geom_point() + geom_point(data = warehouse_locations, color = "red", alpha = 0.5, shape = 17) + scale_x_continuous(limits = c(0, grid_size)) + scale_y_continuous(limits = c(0, grid_size)) + theme(axis.title = element_blank(), axis.ticks = element_blank(), axis.text = element_blank(), panel.grid = element_blank()) p + ggtitle("Warehouse location problem", "Black dots are customers. Light red triangles show potential warehouse locations.")

The model with `rmpk`

then looks like this:

library(rmpk) library(magrittr) library(ROI.plugin.glpk) model <- MIPModel(ROI_solver("glpk")) # 1 iff i gets assigned to warehouse j x <- model$add_variable(i = 1:n, j = 1:m, type = "binary") # 1 iff warehouse j is built y <- model$add_variable(j = 1:m, type = "binary") # maximize the preferences model$set_objective(sum_expr(transportcost(i, j) * x[i, j], i = 1:n, j = 1:m) + sum_expr(fixedcost[j] * y[j], j = 1:m), "min") # every customer needs to be assigned to a warehouse model$add_constraint(sum_expr(x[i, j], j = 1:m) == 1, i = 1:n) # if a customer is assigned to a warehouse, then this warehouse must be built model$add_constraint(x[i,j] <= y[j], i = 1:n, j = 1:m) model

We will use `glpk`

to solve the above model.

model$optimize()

We solved the problem with an objective value of `r sprintf("%d", model$objective_value())`

.

suppressPackageStartupMessages(library(dplyr)) matching <- model$get_variable_value(x[i, j]) %>% filter(value > .9) %>% select(i, j)

The last step is to add the assignments to the previous plot we generated.

plot_assignment <- matching %>% inner_join(customer_locations, by = c("i" = "id")) %>% inner_join(warehouse_locations, by = c("j" = "id")) customer_count <- matching %>% group_by(j) %>% summarise(n = n()) %>% rename(id = j) plot_warehouses <- warehouse_locations %>% mutate(costs = fixedcost) %>% inner_join(customer_count, by = "id") %>% filter(id %in% unique(matching$j)) p + geom_segment(data = plot_assignment, aes(x = x.y, y = y.y, xend = x.x, yend = y.x)) + geom_point(data = plot_warehouses, color = "red", size = 3, shape = 17) + ggrepel::geom_label_repel(data = plot_warehouses, aes(label = paste0("fixed costs:", costs, "; customers: ", n)), size = 2, nudge_y = 20) + ggtitle(paste0("Cost optimal warehouse locations and customer assignment"), "Big red triangles show warehouses that will be built, light red are unused warehouse locations. Dots represent customers served by the respective warehouses.")

The fixed costs for setting up the `r n_distinct(matching$j)`

warehouses is:

sum(fixedcost[unique(matching$j)])

Do you have any questions, ideas, comments? Or did you find a mistake? Let's discuss on Github.

Embedding an R snippet on your website

Add the following code to your website.

For more information on customizing the embed code, read Embedding Snippets.