knitr::opts_chunk$set( collapse = TRUE, comment = "#>" #eval = FALSE ) options(expressions = 50000)
The package "knapsack" is based on laboratory 6 of the course "Advanced R programming" at LiU. It implements 3 different algorithms to solve or at least approximate the optimal solution for the famous knapsack optimization problem:
| Algorithm | Description | Complexity |
|:---------------|:--------------|---------------:|
| Brute Force | Tries out all $2^n$ possible item-combinations | $O(2^n)$ |
| Dynamic Programming | Uses dynamic programming for solving the problem | $O(Wn)$ |
| Greedy Heuristic | Heuristic approach to find a good approximation of the optimal solution | $O(n\ log(n))$ |
In the following paragraphs, we demonstrate the proper usage of the packages functions and also analyze the time complexity of the different algorithms.
The following code shows two examples of the brute force algorithm.
library(knapsack) # we create a dataframe with 2000 possible knapsack items knapsack_objects <- get_knapsack_objects(2000) # Example 1: # now the brute force algorithm is used to solve for the first 8 items with a # maximum weight W of 3500 brute_force_knapsack(x = knapsack_objects[1:8, ], W = 3500) # Example 2: # now the brute force algorithm is used to solve for the first 12 items with a # maximum weight W of 2000 brute_force_knapsack(x = knapsack_objects[1:12,], W = 2000)
# number of items = 16, W = 3500 time <- system.time(brute_force_knapsack(x = knapsack_objects[1:16, ], W = 3500)) time
cat(sprintf(" Computation time for Brute Force (16 items): %#.4f seconds\n\n", time[3]), sprintf("Computation time for Brute Force (18 items): %#.4f seconds\n\n", system.time(brute_force_knapsack(x = knapsack_objects[1:18, ], W = 3500))[3]), sprintf("Computation time for Brute Force (20 items): %#.4f seconds\n\n", system.time(brute_force_knapsack(x = knapsack_objects[1:20, ], W = 3500))[3]))
The following code shows the same two examples as before but uses the dynamic programming algorithm.
# Example 1: # now the dynamic programming algorithm is used to solve for the first 8 items with a # maximum weight W of 3500 knapsack_dynamic(x = knapsack_objects[1:8, ], W = 3500) # Example 2: # now the dynamic programming algorithm is used to solve for the first 12 items with a # maximum weight W of 2000 knapsack_dynamic(x = knapsack_objects[1:12,], W = 2000)
# number of items = 125, W = 3500 time <- system.time(knapsack_dynamic(x = knapsack_objects[1:125, ], W = 3500)) time
cat(sprintf(" Computation time for Dynamic Programming (125 items): %#.4f seconds\n\n", time[3]), sprintf("Computation time for Dynamic Programming (250 items): %#.4f seconds\n\n", system.time(knapsack_dynamic(x = knapsack_objects[1:250, ], W = 3500))[3]), sprintf("Computation time for Dynamic Programming (500 items): %#.4f seconds\n\n", system.time(knapsack_dynamic(x = knapsack_objects[1:500, ], W = 3500))[3]))
The following code shows two examples for the usage of the greedy heuristic, with a higher amount of items.
# Example 1: # now the greedy heuristic is used to solve for the first 800 items with a # maximum weight W of 3500 greedy_knapsack(x = knapsack_objects[1:800,], W = 3500) # Example 2: # now the greedy heuristic is used to solve for the first 1200 items with a # maximum weight W of 2000 greedy_knapsack(x = knapsack_objects[1:1200,], W = 2000)
# we create a dataframe with 1,000,000 possible knapsack items knapsack_objects <- get_knapsack_objects(1000000) # number of items = 10000, W = 3500 time <- system.time(greedy_knapsack(x = knapsack_objects[1:10000, ], W = 3500)) print(time)
cat(sprintf(" Computation time for Greedy Heuristic (10,000 items): %#.4f seconds\n\n", time[3]), sprintf("Computation time for Greedy Heuristic (100,000 items): %#.4f seconds\n\n", system.time(greedy_knapsack(x = knapsack_objects[1:100000, ], W = 3500))[3]), sprintf("Computation time for Greedy Heuristic (1,000,000 items): %#.4f seconds\n\n", system.time(greedy_knapsack(x = knapsack_objects, W = 3500))[3]))
After profiling the two algorithms, Brute Force and Dynamic Programming, an optimized version of both of them was created, to reduce bottlenecks and thus computational time.
Profiling showed, that transposig the "combi-vector" takes a lot of time. But it is just a necessary operation in math but not in programming, just multiplying both vectors yields the same result. So we simply deleted this operation, which lead to reduced compuitational time:
# number of items = 20, W = 3500 time_1 <- system.time(brute_force_knapsack(x = knapsack_objects[1:20, ], W = 3500)) print(time_1) time_2 <- system.time(optimized_brute_force(x = knapsack_objects[1:20, ], W = 3500)) print(time_2)
cat(sprintf(" Computation time for Normal Brute Force Algorithm (20 items): %#.4f seconds\n\n", time_1[3]), sprintf("Computation time for Optimized Brute Force Algorithm (20 items): %#.4f seconds\n\n", time_2[3]), sprintf("Improvement in computational time: %#.4f seconds\n\n", time_1[3]-time_2[3]))
Profiling showed, that the operation "$" takes a lot of time. In the for loop we use it twice to refer to the same value in the dataframe, that is why we can simply store it in a variable, to reduce the number of times the operation is used:
# number of items = 1500, W = 3500 time_1 <- system.time(knapsack_dynamic(x = knapsack_objects[1:1500, ], W = 3500)) print(time_1) time_2 <- system.time(optimized_dynamic_programming(x = knapsack_objects[1:1500, ], W = 3500)) print(time_2)
cat(sprintf(" Computation time for Normal Dynamic Programming Algorithm (1500 items): %#.4f seconds\n\n", time_1[3]), sprintf("Computation time for Optimized Dynamic Programming Algorithm (1500 items): %#.4f seconds\n\n", time_2[3]), sprintf("Improvement in computational time: %#.4f seconds\n\n", time_1[3]-time_2[3]))
To analyze the impact of parallelization on computation time, we integrated an optional parallelized solving approach to the Brute Force Algorithm that uses the mcapply function. By default, brute_force_knapsack still uses non-parallelized processes, but by setting setting the argument parallel to r TRUE
, the parallelized approach is used.
The following code contains two examples, which show the time efficiency gain due to parallelization:
# NORMAL, number of items = 18, W = 3500 system.time(brute_force_knapsack(x = knapsack_objects[1:18, ], W = 3500, parallel = FALSE)) # PARALLEL, number of items = 18, W = 3500 system.time(brute_force_knapsack(x = knapsack_objects[1:18, ], W = 3500, parallel = TRUE))
# NORMAL, number of items = 23 W = 30000 system.time(brute_force_knapsack(x = knapsack_objects[1:23, ], W = 30000, parallel = FALSE)) # PARALLEL, number of items = 23, W = 30000 system.time(brute_force_knapsack(x = knapsack_objects[1:23, ], W = 30000, parallel = TRUE))
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.