knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
library(knapsack)

This report presents different implemented solutions to the knapsack problem. The knapsack problem is an optimization problem where one must fill a knapsack with items of different value v and weight w by choosing the proper combinations of items with the highest value while not exceeding the maximum allowed weight of the knapsack W.
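Formally, with decision variables $x_i \in \{0, 1\}$ indicating whether item $i$ is packed, the problem can be stated as:

$$\max \sum_{i=1}^{n} v_i x_i \quad \text{subject to} \quad \sum_{i=1}^{n} w_i x_i \le W, \qquad x_i \in \{0, 1\}$$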

In this exercise, we use the profvis profiling tool to identify bottlenecks in the code and measure the performance of the implemented algorithms. The indicated times are in milliseconds. This profiling tool has a resolution of 10ms.

We first generate our data.

set.seed(42)
n <- 2000
knapsack_objects <- data.frame(
  w=sample(1:4000, size = n, replace = TRUE),
  v=runif(n = n, 0, 10000)
)

Brute force algorithm

We evaluate all possible combinations of items and return the combination with the highest total value whose total weight does not exceed the knapsack capacity W.

Brute Force

The following code is run and the profiling tool reports a runtime of 790ms. We compute the knapsack problem for the first 16 generated items.

knapsack::brute_force_knapsack(x = knapsack_objects[1:16, ], W = 3500)

The bottleneck identified is the combn() function operating on the characters that make up the names of each row. It could be that combining character values requires more memory to handle. We have not been able to change this in order to shorten the runtime.
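To make the exhaustive search concrete, here is a minimal sketch of the idea (illustrative only, not the package's exact implementation): every subset of the n items is enumerated with a bitmask, and the feasible subset with the highest total value is kept. The function name brute_force_sketch is hypothetical.

```r
# Illustrative sketch of exhaustive search: enumerate every non-empty subset
# of the items with a bitmask and keep the best one that fits within W.
brute_force_sketch <- function(x, W) {
  n <- nrow(x)
  best_value <- 0
  best_elements <- integer(0)
  masks <- bitwShiftL(1L, 0:(n - 1))        # one bit per item
  for (i in 1:(2^n - 1)) {
    idx <- which(bitwAnd(i, masks) > 0)     # items selected by this bitmask
    if (sum(x$w[idx]) <= W) {
      value <- sum(x$v[idx])
      if (value > best_value) {
        best_value <- value
        best_elements <- idx
      }
    }
  }
  list(value = best_value, elements = best_elements)
}
```

Since the loop visits all $2^n - 1$ non-empty subsets, the runtime grows exponentially in n, which is why the report only runs the brute force variants on 16 items.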

Brute Force parallelization

We implement parallelization in order to improve the performance of our brute force search. We run the following code and use the profiler to identify the bottlenecks.

knapsack::brute_force_knapsack(x = knapsack_objects[1:16, ], W = 3500, parallel = TRUE)

The total runtime is 1440ms. The longest step is the creation of the cluster at 750ms. The character combination calculation is shortened to 220ms, while the other combination calculations worsen, going from below the 10ms resolution to 220-250ms.

The performance gain from parallelization can only be observed when the number of elements to combine increases significantly. Otherwise, cluster creation is the major bottleneck compared to the non-parallelized version.

Dynamic algorithm

We implement the dynamic programming approach to the problem that is presented in the referenced article.

knapsack::knapsack_dynamic(x = knapsack_objects[1:500,], W = 3500)

We measure a runtime of 820ms. The identified bottlenecks are the two matrix update steps, at 120ms and 360ms respectively. These are the most time consuming because we access the matrix in memory each time we check the condition and again when assigning the new value.
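The matrix accesses mentioned above correspond to the two branches of the standard dynamic programming recurrence. A minimal sketch (illustrative, assuming integer weights as in the generated data; the function name is hypothetical): m[i + 1, j + 1] holds the best achievable value using only the first i items with remaining capacity j.

```r
# Dynamic programming sketch: fill an (n+1) x (W+1) value table.
knapsack_dynamic_sketch <- function(x, W) {
  n <- nrow(x)
  m <- matrix(0, nrow = n + 1, ncol = W + 1)
  for (i in 1:n) {
    for (j in 0:W) {
      if (x$w[i] > j) {
        m[i + 1, j + 1] <- m[i, j + 1]                   # item i does not fit
      } else {
        m[i + 1, j + 1] <- max(m[i, j + 1],              # skip item i
                               m[i, j + 1 - x$w[i]] + x$v[i])  # take item i
      }
    }
  }
  m[n + 1, W + 1]                                        # best total value
}
```

Each cell is read and written once, which is why the matrix indexing shows up as the dominant cost in the profile.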

Greedy heuristic approach

In this approach, we do not look for the best solution to this optimization problem. This algorithm returns an answer that is at least 50% of the true maximum value that can be put in the knapsack. This algorithm is faster because we do not compute all possible combinations of items; we only sort the items by their value-to-weight ratio and then add them to the knapsack until we would exceed the maximum weight W.

Since the data generated above contains only 2000 rows, we first regenerate knapsack_objects with n = 1e+06 items before running the greedy algorithm.

set.seed(42)
n <- 1e6
knapsack_objects <- data.frame(
  w=sample(1:4000, size = n, replace = TRUE),
  v=runif(n = n, 0, 10000)
)
knapsack::greedy_knapsack(x = knapsack_objects[1:1e+06,], W = 3500)

When we measure the runtime of our code for n = 1e+06, we obtain a total time of 700ms with the main bottleneck registering at 370ms when ordering the vector of ratios of value over weight in decreasing order.
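The order() call identified in the profile corresponds to the sorting step of the heuristic. A minimal sketch of the greedy procedure (illustrative; the function name is hypothetical): sort the items by value-to-weight ratio in decreasing order, then add items until the next one would exceed W.

```r
# Greedy heuristic sketch: the order() call on the ratio vector is the
# step that dominates the profile (~370ms for n = 1e6).
greedy_sketch <- function(x, W) {
  ord <- order(x$v / x$w, decreasing = TRUE)   # sort by value/weight ratio
  total_w <- 0; total_v <- 0
  elements <- integer(0)
  for (i in ord) {
    if (total_w + x$w[i] > W) break            # next item would exceed W
    total_w <- total_w + x$w[i]
    total_v <- total_v + x$v[i]
    elements <- c(elements, i)
  }
  list(value = total_v, elements = elements)
}
```

Sorting is O(n log n) and the fill loop is O(n), so the heuristic handles one million items in well under a second, in contrast to the exponential brute force search.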



senseiyukisan/732A94-Lab6 documentation built on Oct. 18, 2020, 5:39 a.m.