knitr::opts_chunk$set(collapse = TRUE, comment = "#>")
library(Lab6RT)
library(profvis)
For this vignette, the data set used in the examples is generated from the code below.
set.seed(42)
n <- 1000000
knapsack_objects <- data.frame(
  w = sample(1:4000, size = n, replace = TRUE),
  v = runif(n = n, 0, 10000)
)
The knapsack problem is a discrete optimization problem where we have a knapsack that can carry a limited weight W and we want to fill it with a selection of items i = 1, ..., n, each with a weight w_i and a value v_i. The goal is to pack the subset of items with the largest total value whose combined weight does not exceed W.
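Formally, with decision variables x_i indicating whether item i is packed, the problem can be written as

$$\max_{x} \sum_{i=1}^{n} v_i x_i \quad \text{subject to} \quad \sum_{i=1}^{n} w_i x_i \le W, \qquad x_i \in \{0, 1\}.$$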
The first algorithm implemented is the brute force algorithm, which is guaranteed to find the optimal solution but has a time complexity of O(2^n) and therefore scales poorly.
brute_force_knapsack(x = knapsack_objects[1:8,], W = 3500)
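Conceptually, a brute force search enumerates all 2^n subsets of the items and keeps the best one that fits within the weight limit. The sketch below only illustrates that idea and is not the package's internal implementation; the helper name brute_force_sketch and the use of intToBits() to generate subsets are assumptions made for the example.

# Illustration only: enumerate every non-empty subset via the bit pattern of i
brute_force_sketch <- function(x, W) {
  n <- nrow(x)
  best <- list(value = 0, elements = integer(0))
  for (i in 1:(2^n - 1)) {
    include <- as.logical(intToBits(i)[1:n])  # which items are in subset i
    if (sum(x$w[include]) <= W) {             # keep only feasible subsets
      val <- sum(x$v[include])
      if (val > best$value) {
        best <- list(value = val, elements = which(include))
      }
    }
  }
  best
}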
A more efficient solution is the dynamic programming algorithm, which has a time complexity of O(Wn) and can be used when all weights are discrete (non-negative integer) values.
knapsack_dynamic(x = knapsack_objects[1:8,], W = 3500)
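The idea behind the dynamic programming approach is to fill a table where entry (i, j) holds the best value achievable using the first i items with capacity j; the final cell then contains the optimal value. The sketch below illustrates this table-filling step only; the name dp_sketch is made up for the example and this is not necessarily how knapsack_dynamic() is implemented internally.

# Illustration only: classic 0/1 knapsack table of size (n + 1) x (W + 1)
dp_sketch <- function(x, W) {
  n <- nrow(x)
  m <- matrix(0, nrow = n + 1, ncol = W + 1)
  for (i in 1:n) {
    for (j in 0:W) {
      if (x$w[i] > j) {
        m[i + 1, j + 1] <- m[i, j + 1]                               # item i does not fit
      } else {
        m[i + 1, j + 1] <- max(m[i, j + 1],                          # skip item i
                               m[i, j + 1 - x$w[i]] + x$v[i])        # take item i
      }
    }
  }
  m[n + 1, W + 1]  # maximum achievable value
}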
To obtain a faster algorithm, exactness is traded away: the greedy heuristic only provides an approximate solution. However, it is much faster than the previous algorithms, since it runs in O(n log n).
greedy_knapsack(x = knapsack_objects[1:8,], W = 3500)
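The greedy heuristic sorts the items by value-to-weight ratio and adds them in that order as long as they still fit, which is where the O(n log n) sorting cost comes from. The sketch below is a simplified illustration of this idea; greedy_sketch is a hypothetical name and not the package's internal code.

# Illustration only: pack items in order of decreasing value per unit weight
greedy_sketch <- function(x, W) {
  ord <- order(x$v / x$w, decreasing = TRUE)
  total_w <- 0
  chosen <- integer(0)
  for (i in ord) {
    if (total_w + x$w[i] <= W) {
      total_w <- total_w + x$w[i]
      chosen <- c(chosen, i)
    }
  }
  list(value = sum(x$v[chosen]), elements = chosen)
}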
The package profvis is used to profile the code and examine the time it takes to complete various sections of the functions.
We have tried to optimize the code as much as possible; however, we were unable to find any significant improvements without rewriting the algorithms and making the code itself more complex. This does not mean that there are no possible performance improvements, only that the small speed gains cannot justify the increased complexity of the code. If one wants to improve the run time of the algorithms, implementations via Rcpp and C++ might prove fruitful. The function intToBits() in R might also provide some reduction in time, but we were unable to get it to work with our solution.
profvis({
  brute_force_knapsack(x = knapsack_objects[1:8,], W = 3500)
  brute_force_knapsack(x = knapsack_objects[1:16,], W = 3500)
  brute_force_knapsack(x = knapsack_objects[1:16,], W = 3500, fast = TRUE)
})
The brute force solution has a time complexity of O(2^n), so increasing the number of items from 8 to 16 leads to a sharp increase in running time. According to profvis, the algorithm takes about 10 ms for 8 items and about 1000 ms for 16 items, which shows that it does not scale well. With the argument fast = TRUE, the brute force algorithm is run as C++ code instead, which reduces the run time drastically (by roughly a factor of 100), since for loops in R are much slower than their C++ counterparts.
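To quantify the speedup on a specific input, the plain R and C++ versions can also be timed directly with system.time(); the exact numbers will of course depend on the machine.

system.time(brute_force_knapsack(x = knapsack_objects[1:16,], W = 3500))
system.time(brute_force_knapsack(x = knapsack_objects[1:16,], W = 3500, fast = TRUE))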
profvis({
  knapsack_dynamic(x = knapsack_objects[1:8,], W = 3500)
  knapsack_dynamic(x = knapsack_objects[1:16,], W = 3500)
  knapsack_dynamic(x = knapsack_objects[1:500,], W = 3500)
})
The dynamic programming solution has a time complexity of O(Wn), so the capacity W and the number of items n affect the running time equally. According to profvis, it takes about 90 ms to run the algorithm for 8 items, about 150 ms for 16 items, and about 4500 ms for 500 items.
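Since the running time scales as O(Wn), doubling W should roughly double the run time as well. This can be checked, for example, with:

system.time(knapsack_dynamic(x = knapsack_objects[1:500,], W = 3500))
system.time(knapsack_dynamic(x = knapsack_objects[1:500,], W = 7000))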
profvis({
  greedy_knapsack(x = knapsack_objects[1:8,], W = 3500)
  greedy_knapsack(x = knapsack_objects[1:800,], W = 3500)
  greedy_knapsack(x = knapsack_objects[1:1000000,], W = 3500)
})
The greedy heuristic solution has a time complexity of O(n log n), due to the sorting step. According to profvis, the algorithm is almost instantaneous for 8 and 800 items, while it takes about 1100 ms for 1,000,000 items.
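Because the greedy result is only an approximation, its output for a small instance can be compared against the exact brute force answer to see how close the heuristic gets:

greedy_knapsack(x = knapsack_objects[1:8,], W = 3500)
brute_force_knapsack(x = knapsack_objects[1:8,], W = 3500)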