require(microbenchmark)
require(KnapSack)
require(Rcpp)

The KnapSack problem is a well-known optimization problem. The idea is that we have knapsack that can handle a limited weight W and we want to fill the knapsack with different items. The items have different values for us. For example the toothbrush might be twice as important as the comb. To know which items to bring we make a packlist that contain the items, their weight and an assigned value which indicate how important this exact item is.

This package contain different functions to optimize the value in the knapsack for a given packlicklist. The functions all take a dataframe x with columns v (values) and w (weights) and a weight limit W.

Brute force search

The function brute_force_knapsack is the only solution that guaranties a correct answer for the knapsack problem. Brute force go through all possible combinations, $2^n$, and return the maximum value found.

Example of brute_force_knapsack:

set.seed(42)
n <- 2000
knapsack_objects <- data.frame(v=sample(1:4000,size=n,replace=TRUE),
                               w=runif(n=n,0,10000)
                               )
brute_force_knapsack(knapsack_objects[1:8,],3500)

How much time does it take to run brute_force_knapsack with $16$ objectss?

sys_time<-system.time(brute_force_knapsack(knapsack_objects[1:16,],3500))
sys_time

It takes about r sys_time[3] milliseconds to run brute_force_knapsack with $16$ objects.

Lineproof showed that the data frames takes a little bit longer to run. One way to optimize and speed up run is to minimize data by saving columns to vectors.

Dynamic programing

The function knapsack_dynamic uses dynamic programming which is an alternative when the weights are discreate. The function iterates over all possible values of W and checks what is the highest possible value for that weight.

Example of knapsack_dynamic:

set.seed(42)
n <- 2000
knapsack_objects <-
  data.frame(
    v=sample(1:4000,size=n,replace=TRUE),
    w=runif(n=n,0,10000)
  )


knapsack_dynamic(knapsack_objects[1:8,],3500)

How much time does it take to run knapsack_dynamic with $500$ objects?

sys_time<- system.time(knapsack_dynamic(knapsack_objects[1:500,],3500))
sys_time

It takes about r sys_time[3] milliseconds to run knapsack_dynamic with $500$ objects.

When the lineproof is run it turns out no tasks in the code take that long time to run. The function is already pretty fast so there is no need to optimize any further.

Greedy heuristic

Another approch is the greedy algortithm. It does not give an exact result but it will decrease the complexity in the algorithms. This algoritm calculates the value per weight and sorts the values in order according to this value. This means that the items that give mest value per weight will be brought.

set.seed(42)
n <- 1000000
knapsack_objects <-
  data.frame(
    v=sample(1:4000,size=n,replace=TRUE),
    w=runif(n=n,0,10000)
  )
system.time(
  greedy_knapsack(knapsack_objects,2000)

)

Time for greedy_knapsack() for $n = 1 000 000$

set.seed(42)
n <- 1000000
knapsack_objects <-
  data.frame(
    v=sample(1:4000,size=n,replace=TRUE),
    w=runif(n=n,0,10000)
  )
system.time(
  greedy_knapsack(knapsack_objects,2000)

)

Optimize greedy

In greedy_knapsack a for-loop is used. Since for-loops are very slow in R we try to increase speed by implementing the for-loop in C++ code instead. The c++ code is called by setting fast=TRUE in the function arguments.



ErikaAnderskar/KnapSack documentation built on May 20, 2019, 1:26 p.m.