knitr::opts_chunk$set(collapse = TRUE, comment = "#>")
library(lab6)
library(parallel) # package used for parallel computation
To install the lab6 package, install it from GitHub with the command devtools::install_github("Marbr987/lab6", build_vignettes = TRUE).
RNGversion(min(as.character(getRversion()), "3.5.3")) # old sampler used for backward compatibility
# suppressWarnings() can be used so that the above warning is not displayed
set.seed(42, kind = "Mersenne-Twister", normal.kind = "Inversion")
n <- 2000
knapsack_objects <- data.frame(
  w = sample(1:4000, size = n, replace = TRUE),
  v = runif(n = n, 0, 10000)
)
The package includes three functions to solve the 0-1 knapsack problem. All of the functions take two inputs: x, a data frame with a column w containing the weight of each element and a column v containing the value of the respective element, and W, the maximum weight the knapsack can carry.
Furthermore, all three functions return a list containing the maximum value achievable under the given weight limit (or, in the case of greedy_knapsack, a "good" approximate value) and the elements making up this solution (see the examples below).
brute_force_knapsack(x = knapsack_objects[1:8,], W = 3500)
knapsack_dynamic(x = knapsack_objects[1:8,], W = 3500)
greedy_knapsack(x = knapsack_objects[1:8,], W = 3500)
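The returned list can then be inspected component by component. A minimal sketch, assuming the components are named value and elements (the actual component names are defined by the package):

res <- brute_force_knapsack(x = knapsack_objects[1:8,], W = 3500)
res$value    # maximum value found (component name assumed)
res$elements # indices of the chosen elements (component name assumed)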
For profiling the code, the profvis package was used. After loading the package with library(profvis), the function to be analysed was loaded with source("my_function.R") and profiled with profvis(my_function(some_input)). The interactive plot displays the computation time needed for each line of code in the my_function.R file.
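A minimal profiling session then looks as follows, where my_function and some_input are the placeholders used above:

library(profvis)
source("my_function.R")          # load the function to be analysed
profvis(my_function(some_input)) # opens the interactive profiling plot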
The first function is called brute_force_knapsack and solves the problem by testing every possible combination of elements in the knapsack, calculating the weight and value of each combination, and keeping the most valuable one that stays under the weight limit. The initial computation time for n = 16 objects was 15 seconds.
st <- system.time(brute_force_knapsack(x = knapsack_objects[1:16,], W = 2000))
print(paste0("time needed: ", toString(round(as.numeric(st)[1], 2)), "s"))
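The core idea can be sketched as follows. This is a simplified illustration, not the package implementation: the 2^n - 1 non-empty subsets are enumerated via the bit patterns of the integers 1 to 2^n - 1.

brute_force_sketch <- function(x, W) {
  n <- nrow(x)
  best <- list(value = 0, elements = integer(0))
  for (i in 1:(2^n - 1)) {
    idx <- which(intToBits(i)[1:n] == 1) # the bits of i select the subset
    if (sum(x$w[idx]) <= W) {            # keep only feasible subsets
      val <- sum(x$v[idx])
      if (val > best$value) best <- list(value = val, elements = idx)
    }
  }
  best
}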
Profiling the code with profvis showed that the initially used for-loop was by far the biggest computational bottleneck. Therefore, the for-loop was changed from
for (i in counter) {
  temp_df <- combinations[[i]] * x # takes especially long
  values[i] <- sum(temp_df$v)
  weights[i] <- sum(temp_df$w)
}
to a parallel computation:
temp_dfs <- parallel::mclapply(combinations, function(y){y * x})
values <- sapply(temp_dfs, function(x){sum(x$v)})
weights <- sapply(temp_dfs, function(x){sum(x$w)})
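Note that mclapply() relies on forking, so the speed-up is only available on Unix-alike systems; on Windows it requires mc.cores = 1 and then simply falls back to lapply(). The number of worker processes can be set via the mc.cores option, for example:

options(mc.cores = max(1, parallel::detectCores() - 1)) # one worker per core, leaving one free (a common heuristic)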
The computation time needed for 16 objects with parallel computation is 0.5 seconds, resulting in a performance gain of 14.5 seconds.
st <- system.time(brute_force_knapsack(x = knapsack_objects[1:16,], W = 2000, parallel=TRUE))
print(paste0("time needed: ", toString(round(as.numeric(st)[1], 2)), "s"))
The second function is called knapsack_dynamic and solves the problem with dynamic programming, iterating over all elements and all possible weight capacities from 0 to W. The initial computation time for n = 500 objects was 2.5 seconds.
st <- system.time(knapsack_dynamic(x = knapsack_objects[1:500,], W = 2000))
print(paste0("time needed: ", toString(round(as.numeric(st)[1], 2)), "s"))
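For reference, the loops shown below operate on a value matrix m in which m[i+1, j+1] holds the best value achievable with the first i elements and capacity j. A minimal sketch of the setup (the exact initialization in knapsack_dynamic may differ):

# m[i+1, j+1] = best value using the first i elements with capacity j
m <- matrix(0, nrow = nrow(x) + 1, ncol = W + 1)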
Profiling the code with profvis showed that most of the computation time was spent in the nested for-loops inside the function. The code was sped up by changing the data frame indexing from
for (i in 1:nrow(x)) {
  for (j in 0:W) {
    if (x$w[i] > j) {
      m[i+1, j+1] <- m[i, j+1]
    } else {
      m[i+1, j+1] <- max(m[i, j+1], m[i, j-x$w[i]+1] + x$v[i]) # takes especially long
    }
  }
}
to plain vector indexing, which resulted in a performance gain of 1.3 seconds:
w <- x$w
v <- x$v
for (i in 1:nrow(x)) {
  for (j in 0:W) {
    if (w[i] > j) {
      m[i+1, j+1] <- m[i, j+1]
    } else {
      m[i+1, j+1] <- max(m[i, j+1], m[i, j-w[i]+1] + v[i])
    }
  }
}
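As a further, hypothetical refinement that is not part of the package, the inner loop over j could be replaced by whole-row vector operations, which are usually much faster in R:

for (i in 1:nrow(x)) {
  if (w[i] > W) {
    m[i+1, ] <- m[i, ] # element i cannot fit at any capacity
  } else {
    # candidate values when element i is taken: row i shifted right by w[i]
    shifted <- c(rep(-Inf, w[i]), m[i, seq_len(W - w[i] + 1)]) + v[i]
    m[i+1, ] <- pmax(m[i, ], shifted)
  }
}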
The third and last function is called greedy_knapsack and does not always find the optimal solution to the problem but reduces the computational complexity considerably. The initial computation time for n = 1000000 objects was 0.85 seconds.
st <- system.time(greedy_knapsack(x = knapsack_objects[1:1000000,], W = 2000))
print(paste0("time needed: ", toString(round(as.numeric(st)[1], 2)), "s"))
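The underlying heuristic can be sketched as follows, assuming the package sorts the elements by value-to-weight ratio (the order() bottleneck discussed below suggests a sort of this kind):

greedy_sketch <- function(x, W) {
  ord <- order(x$v / x$w, decreasing = TRUE) # best value per unit of weight first
  total_w <- 0; value <- 0; elements <- integer(0)
  for (i in ord) {
    if (total_w + x$w[i] > W) break # the classic greedy stops at the first element that does not fit
    total_w <- total_w + x$w[i]
    value <- value + x$v[i]
    elements <- c(elements, i)
  }
  list(value = value, elements = elements)
}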
Profiling the greedy_knapsack function with profvis showed that the bottleneck is the order() function. This function is already highly optimized, so no further enhancement was needed.