In JS2387/Group8knapsack: Solves the knapsack problem with 3 approaches - Brute Force, DP & Greedy Heuristic

knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)

library(Group8knapsack)

KNAPSACK PROBLEM

This package is designed to tackle the classical knapsack problem with 3 different approaches: Brute Force Method: brute_force_knapsack() Dynamic Programming: knapsack_dynamic() * Greedy heuristic: greedy_knapsack()

But before we get into the algorithms used, we've written a data generation function that creates a set of objects to be used for the knapsack problem. The sample data can be created by the user providing the total number of objects 'n'

test_knapsack <- knapsack_objects(2000)

This returns a dataset with 2 columns 'w' which represents the weights of each object and 'v' which represents the corresponding value of each knapsack object.

Brute Force Method

The first approach is the 'naive' approach or as we call it the brute force method. It is so called because it iterates over every possible combination of all available knapsack objects and finds the most optimum solution. The function to be called for this algorithm is displayed as follows:

test_knapsack <- knapsack_objects(2000)
brute_force_knapsack(test_knapsack[1:12,], 3500)

This method results in the computation of a problem whose complexity is of the order 2^n for n objects. Hence, the computation of the problem takes an incrementally longer amount of time as we keep increasing knapsack objects

start_time <- Sys.time()
invisible(brute_force_knapsack(x = test_knapsack[1:12,], W = 3500))
end_time <- Sys.time()
print(difftime(end_time, start_time, units = "secs"))

start_time <- Sys.time()
invisible(brute_force_knapsack(x = test_knapsack[1:16,], W = 3500))
end_time <- Sys.time()
print(difftime(end_time, start_time, units = "secs"))

Dynamic Programming

We next improve the speed of solving by applying dynamic programming. In this approach we iterate over every element and check if it should be included in the knapsack or not. We achieve this by filling a matrix of the dimensions - 'n' x 'W'; where 'n' is the total number of knapsack objects and 'W' is the knapsack capacity. This approach follows that the most optimum solution will be found [n,W] cell of the resultant matrix. The function can be called as follows:

test_knapsack <- knapsack_objects(2000)
knapsack_dynamic(test_knapsack[1:12,], 3500)

The order of complexity in this approach is O(nW) as the matrix used to derive the result is of the dimension n x W. Hence the loop runs at most 'nW' times.

While this solution is significantly faster than the brute force method, the time taken to solve this problem still increases significantly as we keep increasing the number of objects in the knapsack.

start_time <- Sys.time()
invisible(knapsack_dynamic(x = test_knapsack[1:12,], W = 3500))
end_time <- Sys.time()
print(difftime(end_time, start_time, units = "secs"))

start_time <- Sys.time()
invisible(knapsack_dynamic(x = test_knapsack[1:16,], W = 3500))
end_time <- Sys.time()
print(difftime(end_time, start_time, units = "secs"))

start_time <- Sys.time()
invisible(knapsack_dynamic(x = test_knapsack[1:500,], W = 3500))
end_time <- Sys.time()
print(difftime(end_time, start_time, units = "secs"))

Greedy Heuristic

The final algorithm used in solving this problem is the greedy method. In this method, we assume that the object in the knapsack can be split and included in the knapsack by parts. Therefore, this algorithm ranks all the objects in the decreasing order of value per unit of weight of each object and start filling the knapsack with the objects with higher yield. When it reached the upper limit of the knapsack capacity, it then takes a fraction of the final object and returns the resulting value and the list of chosen elements. The usage of the function is as follows:

test_knapsack <- knapsack_objects(2000000)
greedy_knapsack(test_knapsack[1:800,], 3500)

This particular code runs significantly faster and hance can be used to find a solution for much larger data sets in a limited amount of time.

start_time <- Sys.time()
invisible(greedy_knapsack(x = test_knapsack[1:800,], W = 3500))
end_time <- Sys.time()
print(difftime(end_time, start_time, units = "secs"))

start_time <- Sys.time()
invisible(greedy_knapsack(x = test_knapsack[1:1200,], W = 3500))
end_time <- Sys.time()
print(difftime(end_time, start_time, units = "secs"))

start_time <- Sys.time()
invisible(greedy_knapsack(x = test_knapsack[1:1000000,], W = 350000))
end_time <- Sys.time()
print(difftime(end_time, start_time, units = "secs"))

While solving the greedy algorithm, we initially developed a better solution because even after filling the knapsack with objects ranked according to their benefit, we DID NOT break the loop immediately. Instead, we kept the loop going over all the objects to find any other object which may be able to fit into the knapsack. We were able to find such elements and include them into the knapsack. To demonstrate, we have created another function called greedy_knapsack_better()

test_knapsack <- knapsack_objects(2000)
greedy_knapsack(test_knapsack[1:800,], 3500)
greedy_knapsack_better(test_knapsack[1:800,], 3500)

As can be seen from the 2 results above, the better solution also takes the object #401 in the final solution!

PROFILING

In order to profile our code, we focused on the brute force method with 20 objects as a sample. We used the following method to profile our original brute_force_knapsack function

This led us to find that the bottleneck was the subset function that we were using while evaluating that particular combination. We modified the subsetting method as follows which led to a faster code

This change in the method used to subset led to a faster code with the following improvement:

suppressWarnings(RNGversion(min(as.character(getRversion()),"3.5.3")))
set.seed(42, kind = "Mersenne-Twister", normal.kind = "Inversion")
n <- 2000
knapsack_objects <- data.frame(
  w=sample(1:4000, size = n, replace = TRUE),
  v=runif(n = n, 0, 10000)
)
x <- knapsack_objects[1:20,]
W = 3500

new_bfk <- function(x, W)  { 
  n <- length(x$v) #to count number of combinations we will evaluate
  value_sum = 0

  combinations_x <- lapply(1:(2^n-1), function(x) as.integer(intToBits(x)))
    for (i in 1:(2^n - 1)) {
      comb_elements <- which(combinations_x[[i]] > 0)
      sub_df <- x[as.integer(row.names(x)) %in% comb_elements, ]
      if (sum(sub_df$w) < W && sum(sub_df$v) > value_sum) {
        value_sum = sum(sub_df$v)
        value = sum(sub_df$v)
        elements = comb_elements
      }
  }
}

old_bfk <- function(x, W){
  n <- length(x$v) #to count number of combinations we will evaluate
  value_sum = 0

  combinations_x <- lapply(1:(2^n-1), function(x) as.integer(intToBits(x)))
    for (i in 1:(2^n - 1)) {
      comb_elements <- which(combinations_x[[i]] > 0)
      sub_df <- subset(x, as.integer(row.names(x)) %in% comb_elements)
      if (sum(sub_df$w) < W && sum(sub_df$v) > value_sum) {
        value_sum = sum(sub_df$v)
        value = sum(sub_df$v)
        elements = comb_elements
      }
  }
}

old_time <- system.time(old_bfk(x, W))
new_time <- system.time(new_bfk(x, W))


time_improvement <- old_time[3] - new_time[3]

time_improvement

PARALLELIZING

We attempted to parallelize the brute force algorithm and were successfully able to implement parallelization on the formation of all possible combinations of the knapsack elements. The results of implementing the two were as follows:

suppressWarnings(RNGversion(min(as.character(getRversion()),"3.5.3")))
set.seed(42, kind = "Mersenne-Twister", normal.kind = "Inversion")
n <- 2000
knapsack_objects <- data.frame(
  w=sample(1:4000, size = n, replace = TRUE),
  v=runif(n = n, 0, 10000)
)
x <- knapsack_objects[1:20,]
W = 3500

par_bfk_time <- system.time(brute_force_knapsack(x, W, parallel = TRUE))
bfk_time <- system.time(brute_force_knapsack(x, W))
time_difference <- bfk_time - par_bfk_time
time_difference

The results were very interesting in the sense that even though the parallel computing was resulting in faster execution of the user instructions, but the system processes were taking longer and thus resulting in no significant improvement to the overall code speed.

JS2387/Group8knapsack documentation built on Dec. 18, 2021, 12:27 a.m.

rdrr.io home R language documentation Run R code online

CRAN packages Bioconductor packages R-Forge packages GitHub packages

Note that we can't provide technical support on individual packages. You should contact the package authors for that.

JS2387/Group8knapsack
Solves the knapsack problem with 3 approaches - Brute Force, DP & Greedy Heuristic

In JS2387/Group8knapsack: Solves the knapsack problem with 3 approaches - Brute Force, DP & Greedy Heuristic

KNAPSACK PROBLEM

Brute Force Method

Dynamic Programming

Greedy Heuristic

PROFILING

PARALLELIZING

R Package Documentation

Browse R Packages

We want your feedback!

JS2387/Group8knapsack Solves the knapsack problem with 3 approaches - Brute Force, DP & Greedy Heuristic

In JS2387/Group8knapsack: Solves the knapsack problem with 3 approaches - Brute Force, DP & Greedy Heuristic

KNAPSACK PROBLEM

Brute Force Method

Dynamic Programming

Greedy Heuristic

PROFILING

PARALLELIZING

R Package Documentation

Browse R Packages

We want your feedback!

JS2387/Group8knapsack
Solves the knapsack problem with 3 approaches - Brute Force, DP & Greedy Heuristic