The lab6 package provides three different way to approach the knapsack problem, a discrete optimization problem where the goal is to find the knapsack with the largest value of elements added to it from a list of given items. Each item has a value ($v$) and a weight ($w$), the knapsack has a limited given weight $W$.

The three approaches used to solve problem in the package are:

\itemize{ \item Brute force search: brute_force_knapsack(x,W) \item Dynamic programming: knapsack_dynamic(x,W) \item Greedy heuristic: greedy_knapsack(x,W) }

All three functions needs two arguments:

The brute_force_knapsack is the only function with an optional parameter called parallel. It's FALSE by default but if it is set to TRUE then the function should parallelize over the detected cores.

Brute force search: brute_force_knapsack(x,W)

The brute_force_knapsackfunction goes through all the possible alternatives and return the maximum value found. It gives a correct answer to the problem in all the situation but its complexity is $O(2^n)$ because all possible $2^n$ combinations need to be evaluated.

To run the algorithm for $n=16$ objects it takes:

library(devtools)
install_github("alede379/lab6",force=TRUE)
library(lab6)
set.seed(42)
n <- 2000
knapsack_objects <-data.frame(w=sample(1:4000, size = n, replace = TRUE),v=runif(n = n, 0, 10000))
system.time(brute_force_knapsack(x = knapsack_objects[1:16,], W = 3500))

If parallel=TRUE:

system.time(brute_force_knapsack(x = knapsack_objects[1:16,], W = 3500,parallel = TRUE))

the user time it's better compared to the non-parallelized one but the elapsed time increases.

Dynamic programming: knapsack_dynamic(x,W)

The knapsack_dynamic is based on the fact that all weights are nonnegative integers and they are all less than $W$. In this approach it's defined a matrix where in each position m[i,w] is stored the maximum value that can be attained with weight less than or equal to w using items up to i. The maximum value is found calculating m[n,W]. It gives a correct answer to the problem in all the situation but its complexity is $O(Wn)$ less than the brute_force_knapsackfunction.

To run the algorithm for $n=500$ objects it takes:

system.time(knapsack_dynamic(x = knapsack_objects[1:500,], W = 3500))

Greedy Heuristic: greedy_knapsack(x,W)

The greedy_knapsackfunction doesn't give the exact result for the problem but it reduces the computational complexity for the problem. This approach sorts the items in decreasing order of value per unit of weight $v_i/w_i$ and it proceeds to insert them into the sack until there is not more space. The complexity of this problem is $O(nlogn)$.

To run the algorithm for $n=1000000$ objects it takes:

system.time(greedy_knapsack(x = knapsack_objects[1:1000000,], W = 3500))

Improving the code with function as lapply() or parallelizing helps to get faster results and a better time of computation. The time slows down when big structure are called too many times, as big dataframes, to speed up the code it's always better to evitate saving too many unused data.

Examples

Different functions don't give always the same results:

greedy_knapsack(x = knapsack_objects[1:8,], W = 3500)
brute_force_knapsack(x = knapsack_objects[1:8,], W = 3500)
knapsack_dynamic(x = knapsack_objects[1:8,], W = 3500)

Comparing the times of the three functions:

system.time(greedy_knapsack(x = knapsack_objects[1:16,], W = 3500))
system.time(brute_force_knapsack(x = knapsack_objects[1:16,], W = 3500))
system.time(knapsack_dynamic(x = knapsack_objects[1:16,], W = 3500))


alede379/lab6 documentation built on May 21, 2019, 2:31 a.m.