In faridmusayev/R_lab06: Knapsack Problem

Required packages

pandoc

rmarkdown

parallel

Knapsack Problem

Introduction The Knapsack Problem is a famous Dynamic Programming Problem that falls in the optimization category. It derives its name from a scenario where, given a set of items with specific weights and assigned values, the goal is to maximize the value in a knapsack while remaining within the weight constraint. Each item can only be selected once, as we don’t have multiple quantities of any item.

There are 3 different type of Knapsack Problems.

O-1 Knapsack Problem

The most common problem being solved is the 0-1 knapsack problem, which restricts the number xi of copies of each kind of item to zero or one. Given a set of n items numbered from 1 up to n, each with a weight and a value v, along with a maximum weight capacity W,

Bounded Knapsack Problem (BKP)

The bounded knapsack problem removes the item limit. It means, more that one item can be added to the knapsack.

Unbounded Knapsack Problem (UKP)

The unbound knapsack problme places no upper bound on the number of copies of each kind of item and can be formulated as above except for that the only restriction on x is that it is a non-negative integer.

1.1.2 Brute Force Knapsack

Brute force method is a simple and direct method to solve problems, which is often directly based on the description of the problem, so the brute force method is also the easiest method to apply.

The basic technology of brute force method is ergodic, that is to use a certain strategy to deal with all elements of the problem to be solved in order to find the solution of the problem. The brute force method is a typical exponential time algorithm because it needs to enumerate the elements to be processed in turn.

Answer : It takes 28 sec to run algorithm for n = 16 objects.

brute_force_knapsack(x = knapsack_objects[1:8,], W = 3500)

RNGversion(min(as.character(getRversion()),"3.5.3"))
set.seed(42, kind = "Mersenne-Twister", normal.kind = "Inversion")
n <- 2000
knapsack_objects <-
data.frame(
w=sample(1:4000, size = n, replace = TRUE),
v=runif(n = n, 0, 10000)
)


brute_force_knapsack <- function(x,W, parallel = FALSE){

  if(!is.data.frame(x) || W < 0){
    stop("Wrong Input")
  }
  if(parallel == FALSE){
    list_comb <- list()
    for(j in 1:nrow(x))
    {
      list_comb[[j]] <- combn(rownames(x), j, paste, collapse = " ")
    }

    list_wght <- list()
    for(j in 1:nrow(x))
    {
      list_wght[[j]] <- combn(x$w, j,sum)
    }

    list_val <- list()
    for(j in 1:nrow(x) )
    {
      list_val[[j]] <- combn(x$v, j, sum)
    }

    vector_comb <- unlist(list_comb)
    vector_wght <- unlist(list_wght)
    vector_val <- round(unlist(list_val),0)

    weights_ind_cap <- which(vector_wght < W)
    value_vald <- vector_val[weights_ind_cap]
    maxvalid <- max(value_vald)

    valid_ind_vec <- which(vector_val == maxvalid)
    val_cmb <- vector_comb[valid_ind_vec]

    lead_comb_list <- list(value = maxvalid, elements = as.numeric(strsplit(val_cmb, " ")[[1]]))
  }else{

    no_of_cores <- detectCores() - 1
    cluster <- makeCluster(no_of_cores)
    clusterExport(cluster , c("x") ,envir = environment())

    list_comb <- parLapplyLB(cluster, 1:nrow(x), fun =  function(y) {
      combn(rownames(x) , y , paste0, collapse = " ")
    })
    list_wght <- parLapplyLB(cluster, 1:nrow(x), fun =  function(y) {
      combn(x$w , y, sum)
    })
    list_val <- parLapplyLB(cluster,1:nrow(x), fun =  function(y) {
      combn(x$v , y , sum)
    })

    stopCluster(cluster)

    vector_comb <- unlist(list_comb)
    vector_wght <- unlist(list_wght)
    vector_val <- round(unlist(list_val),0)

    weights_ind_cap <- which(vector_wght < W)
    value_vald <- vector_val[weights_ind_cap]
    maxvalid <- max(value_vald)

    valid_ind_vec <- which(vector_val == maxvalid)
    val_cmb <- vector_comb[valid_ind_vec]

    lead_comb_list <- list(value = maxvalid, elements = as.numeric(strsplit(val_cmb, " ")[[1]]))
  }
  return(lead_comb_list)
}





brute_force_knapsack(x = knapsack_objects[1:12,], W = 3500)

In addition, parallelism has been applied to brute_force_knapsack. This can be implemented by setting logical parameter parallel to TRUE/FALSE. Since the code has been written on the Windows machine, only socket (rather than forking) approach is available and it has been implemented with the help of parLapply functions.

1.1.3 Dynamic Knapsack

The basic idea of Knapsack dynamic programming is to use a table to store the solutions of solved sub-problems. If you face a subproblem again, you just need to take the solution in the table without having to solve it again. Therefore, the algorithms designed by dynamic programming are very effective.

Answer : It takes 0.57 sec to run algorithm for n = 500 objects.

knapsack_dynamic(x = knapsack_objects[1:800,], W = 3500)

RNGversion(min(as.character(getRversion()),"3.5.3"))
set.seed(42, kind = "Mersenne-Twister", normal.kind = "Inversion")
n <- 2000
knapsack_objects <-
data.frame(
w=sample(1:4000, size = n, replace = TRUE),
v=runif(n = n, 0, 10000)
)


knapsack_dynamic<-function(x,W){

  if (length(colnames(x))!=2 || !is.numeric(W) || !all(colnames(x)==c("w","v")) || !is.data.frame(x) || !all(x[,'w']>0) || !all(x[,'w']>0) || W<0 )  stop("Wrong input")


  v = x[,'v']
  w = x[,'w']
  n=length(w)

  #create empty matrix of zeros

  kn<-matrix(0,n+1,W+1)

  #use tabulation method to compare weight of iterated item with previous row that does not contain this item

  for (i in 2:(n+1)){

    for (c in 2:(W+1)){

      if (w[i-1]>c){

        kn[i,c]=kn[i-1,c]

      }else{

        kn[i,c]=max( v[i-1]+kn[i-1,c-w[i-1]], kn[i-1,c] )
      }
    }
  }

  #go backwards to identify selected items

  elements<-c()

  nW<-W+1

  for (j in (n+1):2){

    if (kn[j,nW] != kn[j-1,nW]){

      elements<-c(elements,j-1)
      nW<-nW-w[j-1]

    }
  }

  answer<-list(value=round(kn[n+1,W+1],digits=0),elements=elements[order(elements)])

  return(answer)

}





knapsack_dynamic(x = knapsack_objects[1:800,], W = 3500)

1.1.4 Greedy Heuristic Knapsack

Greedy algorithms are like dynamic programming algorithms that are often used to solve optimal problems (find best solutions of the problem according to a particular criterion).

Answer : It takes 1.5 sec to run algorithm for n = 1000000 objects

greedy_knapsack(x = knapsack_objects[1:800,], W = 3500)

RNGversion(min(as.character(getRversion()),"3.5.3"))
set.seed(42, kind = "Mersenne-Twister", normal.kind = "Inversion")
n <- 2000
knapsack_objects <-
data.frame(
w=sample(1:4000, size = n, replace = TRUE),
v=runif(n = n, 0, 10000)
)

greedy_knapsack<-function(x,W){

  if (length(colnames(x))!=2 || !is.numeric(W) || !all(colnames(x)==c("w","v")) || !is.data.frame(x) || !all(x[,'w']>0) || !all(x[,'w']>0) || W<0 )  stop("Wrong input")


  w=x[,'w']
  v=x[,'v']

  #calculate value per weight  for each item

  ratio<-v/w

  dec_ratio<-ratio[order(ratio,decreasing=TRUE)]

  elements<-order(ratio,decreasing=TRUE)

  #put into bag using while loop starting from item with maximum value per weight

  bag<-c()

  i=1

  while (sum(bag)<=W){


    bag<-c(bag,w[elements[i]])

    i<-i+1

  }


  answer<-list(value=round(sum(v[elements[1:(i-2)]]), digits=0), elements = elements[1:(i-2)])

  return(answer)

}




greedy_knapsack(x = knapsack_objects[1:800,], W = 3500)