#' Function to import data to train the model
#'
#' Make sure your working directory contains a directory called train and that your pictures are sorted into subdirectories of train, one subdirectory per image category
#' (e.g. all cat pictures in a subdirectory called cats and all dog pictures in a subdirectory called dogs); see the layout sketch below.
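#'
#' A minimal layout (hypothetical directory and file names), assuming homePath = "example/":
#' \preformatted{
#' example/train/cats/cat_001.jpg, cat_002.jpg, ...
#' example/train/dogs/dog_001.jpg, dog_002.jpg, ...
#' }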
#' @param homePath Path to your working directory
#' @param pixels Desired pixel size of the images
#' @keywords kerasCreateDataset
#' @export
#' @examples
#' kerasCreateDataset_2d(homePath = "keras/3categories/", pixels = 28)
kerasCreateDataset_2d <- function(
homePath = "example/",
pixels = 28){
# raise the memory limit (Windows only; deprecated in recent R versions)
if (.Platform$OS.type == "windows") memory.limit(size = 100000)
# path to image folders
train_image_files_path <- paste0(homePath,"train/")
testPics <- list()
#load pictures; make sure they are in the train dir of your homePath or in any subdirectory of it
#no other files are allowed anywhere inside your train folder
testFiles <- list.files(train_image_files_path, full.names = T, recursive = T)
for (i in 1:length(testFiles)){testPics[[i]] <- EBImage::readImage(files = testFiles[[i]])}
#resize pictures
for (i in 1:length(testFiles)){testPics[[i]] <- EBImage::resize(testPics[[i]],pixels,pixels)}
testPics <- EBImage::combine(testPics)
#reorder dimensions from (width, height, channels, n) to (n, width, height, channels)
testPics <- aperm(testPics, c(4, 1, 2, 3))
#define width and height
img_width <- pixels
img_height <- pixels
target_size <- c(img_width, img_height)
# list the category dirs your figures are arranged in
classes1 <- list.dirs(train_image_files_path, full.names = F)
# drop the first entry, which is the train dir itself
classes2 <- classes1[2:length(classes1)]
# number of output classes
output_n <- length(classes2)
# image data generator (only rescaling here; data augmentation options could be added)
train_data_gen = keras::image_data_generator(
rescale = 1/255
)
# training images generated array
train_image_array_gen <- keras::flow_images_from_directory(train_image_files_path,
train_data_gen,
target_size = target_size,
class_mode = "categorical",
classes = classes2,
seed = 42)
#summary of your generated array
tabSum <- table(factor(train_image_array_gen$classes))
y_train <-NULL
for (i in 1:length(tabSum)){y_train <- c(y_train,rep(as.numeric(names(tabSum)[i]),tabSum[i]))}
## Number of images per class:
print("pics/class in alphabetical order")
print(tabSum)
## Class label vs index mapping:
cat("\nClass label vs index mapping:\n")
print(train_image_array_gen$class_indices)
trainLabels <- keras::to_categorical(y_train,output_n) #output_n no of classes
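# Illustration (hypothetical labels): keras::to_categorical(c(0, 1, 2, 1), 3) returns a 4 x 3
# one-hot matrix with rows (1,0,0), (0,1,0), (0,0,1) and (0,1,0).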
randomOrder <- sample(1:nrow(trainLabels))
trainLabels <- trainLabels[randomOrder,]
testPics <- testPics[randomOrder, , , ]
#assign the data for the next function
assign("testPics", testPics, envir = .GlobalEnv)
assign("trainLabels", trainLabels, envir = .GlobalEnv)
assign("output_n", output_n, envir = .GlobalEnv)
}
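# Note (illustrative numbers, assuming e.g. 300 training images in 3 classes and pixels = 28):
# after kerasCreateDataset_2d() has run, the global environment holds
#   dim(testPics)    == c(300, 28, 28, 3)  # images: n, width, height, channels
#   dim(trainLabels) == c(300, 3)          # one-hot class matrix
#   output_n         == 3                  # number of classes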
#' Function used to create a 2d CNN model
#'
#' Creates and compiles a convolutional neural network (CNN) model that can be trained to recognize images
#' @param testPics Image info object generated by kerasCreateDataset_2d
#' @param trainLabels trainLabels object generated by kerasCreateDataset_2d
#' @param epochs Number of epochs in keras
#' @param batch_size Batch size in keras
#' @param dropout Keras dropout
#' @param validation_split Fraction of the dataset that will be used for validation
#' @param pixels Pixel size of the images, assigned in kerasCreateDataset_2d
#' @param optimizer Optimizer
#' @param convolutionalLoop Number of times to iterate the convolutional layers loop
#' @param denseLoop Number of times to iterate the dense layers loops
#' @param NO_pooling Number of times to perform pooling in the convolutional layers loop (max = 2)
#' @param patience EarlyStopping patience
#' @param activation Activation function for the convolutional and dense layers
#' @param activationFinal Final activation function of the model
#' @keywords kerasCreateModelCNN
#' @export
#' @import magrittr
#' @examples
#' kerasCreateModelCNN_2d(testPics = testPics,
#' trainLabels = trainLabels,
#' denseLoop = 3,
#' epochs = 100,
#' batch_size = 64,
#' dropout = 0.2,
#' patience = 0,
#' validation_split = 0.2,
#' activation = "relu",
#' activationFinal = "softmax",
#' pixels = 28,
#' optimizer = "adam",
#' convolutionalLoop = 2,
#' NO_pooling = 1
#' )
kerasCreateModelCNN_2d <- function(denseLoop = 3,
testPics,
trainLabels,
epochs,
batch_size,
dropout = 0.2,
patience = 0,
validation_split,
activation = "relu",
activationFinal = "softmax",
pixels =28,
optimizer = "adam",
convolutionalLoop = 2,
NO_pooling = 1
){
#Useful to avoid clutter from old models / layers.
keras::k_clear_session()
###Create a Model, the parts of the model are quite self-explanatory
model <- keras::keras_model_sequential()
model %>%
keras::layer_conv_2d(filters = 32, kernel_size = c(3,3),
activation = activation,
padding = "same",
input_shape = c( pixels,pixels,3),
kernel_regularizer=keras::regularizer_l1(1e-4)) %>% #batch_size, c(3,3)
keras::layer_spatial_dropout_2d(dropout) %>%
keras::layer_batch_normalization()
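# In the loop below the filter count doubles each pass (filters = 32 * 2^x), so the default
# convolutionalLoop = 2 adds conv layers with 64 and then 128 filters.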
for(x in (1:convolutionalLoop)){
depthX <- x
units1 <- 32*2^(depthX)
print(units1)
model %>%
keras::layer_conv_2d(filters = units1,
kernel_size = c(3,3),
padding = "same",
activation = activation,
kernel_regularizer=keras::regularizer_l1(1e-4))
if(NO_pooling > 0){
model %>%
keras::layer_max_pooling_2d(pool_size = c(2,2))
NO_pooling <- NO_pooling - 1
}
model %>%
keras::layer_spatial_dropout_2d(dropout) %>%
keras::layer_batch_normalization()
}
model %>%
keras::layer_flatten()%>%
keras::layer_dropout(dropout)
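# The dense layers shrink toward the output (units = 64 * 2^(denseLoop + 1 - x)), e.g. the
# default denseLoop = 3 gives layers of 512, 256 and 128 units.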
for(x in (1:denseLoop)){
depthX <- denseLoop + 1 - x
units1 <- 64*2^(depthX)
print(units1)
model %>%
keras::layer_dense(units = units1,
activation = activation,
kernel_regularizer=keras::regularizer_l2(1e-4)) %>% #l2 dense data, l1 sparse
keras::layer_dropout(dropout)
}
# output_n is assigned to the global environment by kerasCreateDataset_2d
model %>%
  keras::layer_dense(units = output_n,
                     activation = activationFinal)
summary(model)
# Compile # For a multi-class classification problem
model %>%
generics::compile(loss = 'categorical_crossentropy',
optimizer = optimizer, # e.g. "adam", "rmsprop", or an optimizer object such as keras::optimizer_sgd(lr = 0.01)
metrics = 'accuracy')
return(model)
}
#' Uses the trained model to assess if the images are true or false cleavages
#'
#'
#' @param examinePath Path to the cleavage images (windows)
#' @param model Trained model
#' @param hitsNrs Directory number of the true cleavages. Default c(1,2)
#' @param falseNrs Directory number of the false cleavages. Default 0
#' @param pixels Pixel size of the images, same as assigned in smartPARE_train
#' @param picsPerloop Number of images per loop within the function. Default 100
#' @param avoidDir Directory (or directories) that should not be assessed within the recursive examinePath
#' @param reshape1 Logical; if TRUE each image is reshaped with keras::array_reshape before prediction. Default FALSE
#' @keywords smartPARE_examineCleavages
#' @export
#' @import magrittr
#' @examples
#' smartPARE_examineCleavages(examinePath,
#' model,
#' hitsNrs = c(1,2),
#' falseNrs = c(0),
#' pixels = 28,
#' picsPerloop = 100,
#' avoidDir = vector(),
#' reshape1 =F
#' )
#'
smartPARE_examineCleavages <- function(examinePath,
model,
hitsNrs = c(1,2),
falseNrs = c(0),
pixels = 28,
picsPerloop = 100,
avoidDir = vector(),
reshape1 =F
){
###########################Explanation#####################
#Function used to validate your non-training dataset
printP <- function(...){
###########################Explanation#####################
#Function used paste and print strings
print(paste(...))
}
whichIsNotNaMatch <- function(arg1,arg2){
return(which(!is.na(match(as.character(arg1), as.character(arg2)))))
}
# raise the memory limit (Windows only; deprecated in recent R versions)
if (.Platform$OS.type == "windows") memory.limit(size = 100000)
percentageOfProcess <- function(perc=100,y,totalY){
###########################Explanation#####################
#Progress function
if(y%%(ceiling(totalY/perc))==0){
print(round(y/totalY*perc))
}
}
#Path(s) to your dirs with pictures to examine
allDirs <- list.dirs(examinePath)
xVec <- vector()
keep1 <- vector()
#Exclude dirs that already contain predictions (true/false) and any dirs listed in avoidDir
for( x in (seq_along(allDirs))){
if(length(grep(pattern = "true", x = allDirs[x])) > 0){
xVec <- c(xVec,x)
printP("taking away", allDirs[x])
}else if(length(grep(pattern = "false", x = allDirs[x])) > 0){
xVec <- c(xVec,x)
printP("taking away", allDirs[x])
}else if(length(avoidDir) > 0)
for(pat in(avoidDir)){
if(length(grep(pattern = pat, x = allDirs[x])) > 0){
xVec <- c(xVec,x)
}
}
}
if(length(xVec) > 0){
allDirs <- allDirs[-xVec]
}
#Remove parent dirs that contain other listed dirs, plus any train/valid/true/false dirs
for( x in (seq_along(allDirs))){
if( length(grep(pattern = allDirs[x],x = allDirs[-x])) > 0){
xVec <- c(xVec,x)
printP("taking away", allDirs[x])
}else if(length(grep(pattern = "train", x = allDirs[x])) > 0){
xVec <- c(xVec,x)
printP("taking away", allDirs[x])
}else if(length(grep(pattern = "valid", x = allDirs[x])) > 0){
xVec <- c(xVec,x)
printP("taking away", allDirs[x])
}else if(length(grep(pattern = "true", x = allDirs[x])) > 0){
xVec <- c(xVec,x)
printP("taking away", allDirs[x])
}else if(length(grep(pattern = "false", x = allDirs[x])) > 0){
xVec <- c(xVec,x)
printP("taking away", allDirs[x])
}else{
keep1 <- c(keep1, allDirs[x])
}
}
allDirs <- keep1
#assign how many pictures should be analyzed in each iteration of the loop
picsPerloopOrig <- picsPerloop
#Loop where the pictures are classified
for(examinePath1 in(allDirs)){
print(examinePath1)
examineCleavageFilesOrig <- list.files(paste0(examinePath1), pattern = ".jpg",include.dirs = F, full.names = T, recursive = F)
examineCleavageFiles_noPathOrig <- list.files(paste0(examinePath1),pattern = ".jpg", include.dirs = F, full.names = F, recursive = F)
#Remove old predictions if run before on same data
hitsPath <- paste0(examinePath1,"/true/")
unlink(paste0(hitsPath,"*"), recursive = T)
falsePath <- paste0(examinePath1,"/false/")
unlink(paste0(falsePath,"*"), recursive = T)
if(length(examineCleavageFilesOrig) < picsPerloop){
picsPerloop <- length(examineCleavageFilesOrig)
}
picNrMax <- picsPerloop
while (picNrMax <= length(examineCleavageFilesOrig)) {
printP(picNrMax,"/",length(examineCleavageFilesOrig))
examineCleavageFiles <- examineCleavageFilesOrig[(picNrMax - picsPerloop + 1):picNrMax]
examineCleavageFiles_noPath <- examineCleavageFiles_noPathOrig[(picNrMax - picsPerloop + 1):picNrMax]
examinePics <- list()
print("reading examine pics")
for (i in 1:length(examineCleavageFiles)){
percentageOfProcess(y = i,totalY = length(examineCleavageFiles),perc = 10)
examinePics[[i]] <- EBImage::readImage(files = examineCleavageFiles[[i]])}
print("resizing")
for (i in 1:length(examineCleavageFiles)){
percentageOfProcess(y = i,totalY = length(examineCleavageFiles),perc = 10)
examinePics[[i]] <- EBImage::resize(examinePics[[i]],pixels,pixels)}
if(reshape1 == T){
print("reshaping")
for (i in 1:length(examineCleavageFiles)) {
percentageOfProcess(y = i,totalY = length(examineCleavageFiles),perc = 10)
examinePics[[i]] <- keras::array_reshape(examinePics[[i]],c(pixels,pixels,3))} #,3
}
examinePics <- EBImage::combine(examinePics)
#reorder dimensions to (n, width, height, channels)
examinePics <- aperm(examinePics, c(4, 1, 2, 3))
ex_valid <- examinePics
if(reshape1 == T){
print("constructing validation vec")
ex_valid <- NULL
for (i in 1:length(examinePics)){
percentageOfProcess(y = i,totalY = length(examineCleavageFiles),perc = 10)
ex_valid <- rbind(ex_valid, examinePics[[i]])}
}
print("predicting")
#copy the classified pics into the corresponding category dir
predValid_ex <- model %>% keras::predict_classes(ex_valid)
hitsEx <- whichIsNotNaMatch(predValid_ex, hitsNrs)
falseEx <- whichIsNotNaMatch(predValid_ex, falseNrs)
trueCleavages <- examineCleavageFiles_noPath[hitsEx]
dir.create(hitsPath, showWarnings = F)
file.copy(examineCleavageFiles[hitsEx], paste0(hitsPath,trueCleavages))
falseCleavages <- examineCleavageFiles_noPath[falseEx]
dir.create(falsePath, showWarnings = F)
file.copy(examineCleavageFiles[falseEx], paste0(falsePath,falseCleavages))
#Keep track of how many pictures are left to analyze
if(picNrMax == length(examineCleavageFilesOrig)){
picNrMax <- picNrMax + picsPerloop
}else if(picNrMax + picsPerloop > length(examineCleavageFilesOrig)){
picsPerloop <- length(examineCleavageFilesOrig) -picNrMax
picNrMax <- picNrMax + picsPerloop
} else{
picNrMax <- picNrMax + picsPerloop
}
}
picsPerloop <- picsPerloopOrig
}
}
#' Tuning the cyclical learning rate (CLR)
#'
#' The function is run to find the minimum and maximum values of the cyclical learning rate
#' @param batch_size2 Batch size
#' @param epochs_find_LR Epochs to run to find the optimal learning rate; a low number is often sufficient
#' @param lr_max Maximum approved learning rate
#' @param optimizer2 Optimizer
#' @param validation_split2 Fraction of the data used to validate the model
#' @param rollmeanSplit Smooths the accuracy curve with a rolling mean whose window is the length of the learning-rate history divided by this number
#' @keywords tuneCLR
#' @export
#' @import magrittr
#' @examples
#' tuneCLR(batch_size2 = 64,
#' epochs_find_LR = 20,
#' lr_max = 0.1,
#' optimizer2 = keras::optimizer_rmsprop(lr = 0.1, decay = 0),
#' validation_split2 =0.2,
#' rollmeanSplit = 3
#' )
tuneCLR <- function(batch_size2 = 64,
epochs_find_LR = 20,
lr_max = 0.1,
optimizer2 = keras::optimizer_rmsprop(lr=lr_max, decay=0),
validation_split2 =0.2,
rollmeanSplit = 3
)
{
#adapted from http://thecooldata.com/2019/01/learning-rate-finder-with-cifar10-keras-r/
Cyclic_LR <- function(iteration=1:32000,
base_lr=1e-5,
max_lr=1e-3,
step_size=2000,
mode='triangular',
gamma=1,
scale_fn=NULL,
scale_mode='cycle'){
  # translated from python to R, original at: https://github.com/bckenstler/CLR/blob/master/clr_callback.py
  # This callback implements a cyclical learning rate policy (CLR).
  # The method cycles the learning rate between two boundaries with
  # some constant frequency, as detailed in this paper (https://arxiv.org/abs/1506.01186).
  # The amplitude of the cycle can be scaled on a per-iteration or per-cycle basis.
  # The function has the following built-in policies:
  # - "triangular": A basic triangular cycle w/ no amplitude scaling.
  # - "triangular2": A basic triangular cycle that scales initial amplitude by half each cycle.
  # - "exp_range": A cycle that scales initial amplitude by gamma**(cycle iterations) at each cycle iteration.
  # - "sinus": A sinusoidal form cycle
  #
  # Example
  # > clr <- Cyclic_LR(iteration=1:20000, base_lr=0.001, max_lr=0.006, step_size=2000, mode='triangular')
  # > plot(clr, cex=0.2)
  # Custom scaling functions (with a maximum output value of 1) are also supported:
  # > clr_fn <- function(x) 1/x
  # > clr <- Cyclic_LR(iteration=1:20000, base_lr=0.001, max_lr=0.006, step_size=400,
  #                    scale_fn=clr_fn, scale_mode='cycle')
  # > plot(clr, cex=0.2)
  #
  # Arguments
# iteration:
# if is a number:
# id of the iteration where: max iteration = epochs * (samples/batch)
# if "iteration" is a vector i.e.: iteration=1:10000:
# returns the whole sequence of lr as a vector
# base_lr: initial learning rate which is the
# lower boundary in the cycle.
# max_lr: upper boundary in the cycle. Functionally,
# it defines the cycle amplitude (max_lr - base_lr).
# The lr at any cycle is the sum of base_lr
# and some scaling of the amplitude; therefore
# max_lr may not actually be reached depending on
# scaling function.
# step_size: number of training iterations per
# half cycle. Authors suggest setting step_size
# 2-8 x training iterations in epoch.
# mode: one of {triangular, triangular2, exp_range, sinus}.
# Default 'triangular'.
# Values correspond to policies detailed above.
# If scale_fn is not None, this argument is ignored.
# gamma: constant in 'exp_range' scaling function:
# gamma**(cycle iterations)
# scale_fn: Custom scaling policy defined by a single
# argument lambda function, where
# 0 <= scale_fn(x) <= 1 for all x >= 0.
# mode parameter is ignored
# scale_mode: {'cycle', 'iterations'}.
# Defines whether scale_fn is evaluated on
# cycle number or cycle iterations (training
# iterations since start of cycle). Default is 'cycle'.
########
if(is.null(scale_fn)==TRUE){
if(mode=='triangular'){scale_fn <- function(x) 1; scale_mode <- 'cycle';}
if(mode=='triangular2'){scale_fn <- function(x) 1/(2^(x-1)); scale_mode <- 'cycle';}
if(mode=='exp_range'){scale_fn <- function(x) gamma^(x); scale_mode <- 'iterations';}
if(mode=='sinus'){scale_fn <- function(x) 0.5*(1+sin(x*pi/2)); scale_mode <- 'cycle';}
}
lr <- list()
if(is.vector(iteration)==TRUE){
for(iter in iteration){
cycle <- floor(1 + (iter / (2*step_size)))
x2 <- abs(iter/step_size-2 * cycle+1)
if(scale_mode=='cycle') x <- cycle
if(scale_mode=='iterations') x <- iter
lr[[iter]] <- base_lr + (max_lr-base_lr) * max(0,(1-x2)) * scale_fn(x)
}
}
lr <- do.call("rbind",lr)
return(as.vector(lr))
}
LogMetrics <- R6::R6Class("LogMetrics",
inherit = keras::KerasCallback,
public = list(
loss = NULL,
acc = NULL,
val_accuracy = NULL,
val_loss = NULL,
on_batch_end = function(batch, logs=list()) {
self$loss <- c(self$loss, logs[["loss"]])
self$acc <- c(self$acc, logs[["accuracy"]])
self$val_loss <- c(self$val_loss, logs[["val_loss"]])
self$val_accuracy <- c(self$val_accuracy, logs[["val_accuracy"]])
}
))
callback_lr_init <- function(logs){
iter <<- 0
lr_hist <<- c()
iter_hist <<- c()
}
callback_lr_set <- function(batch, logs){
iter <<- iter + 1
LR <- l_rate[iter] # if number of iterations > l_rate values, make LR constant to last value
if(is.na(LR)) LR <- l_rate[length(l_rate)]
keras::k_set_value(model$optimizer$lr, LR)
}
callback_lr_log <- function(batch, logs){
lr_hist <<- c(lr_hist, keras::k_get_value(model$optimizer$lr))
iter_hist <<- c(iter_hist, keras::k_get_value(model$optimizer$iterations))
}
signif.ceiling <- function(x, n){
pow <- floor( log10( abs(x) ) ) + 1 - n
y <- ceiling(x / 10 ^ pow) * 10^pow
# handle the x = 0 case
y[x==0] <- 0
y
}
signif.floor <- function(x, n){
pow <- floor( log10( abs(x) ) ) + 1 - n
y <- floor(x / 10 ^ pow) * 10^pow
# handle the x = 0 case
y[x==0] <- 0
y
}
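# Worked example: with n = 1 significant digit, signif.floor(0.00347, 1) returns 0.003 and
# signif.ceiling(0.00347, 1) returns 0.004; these round the learning-rate bounds found below.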
findSteepestSlopes <- function(z = accDf$acc, posNeg = "pos"){
  if(posNeg == "pos"){
    return(z[which(diff(z) == max(diff(z)))])
  }else if(posNeg == "neg"){
    return(z[which(diff(z) == min(diff(z)))])
  }else if(posNeg == "abs"){
    return(z[which(abs(diff(z)) == max(abs(diff(z))))])
  }
}
localMaxima <- function(x) {
# Use -Inf instead if x is numeric (non-integer)
y <- diff(c(-.Machine$integer.max, x)) > 0L
y <- cumsum(rle(y)$lengths)
y <- y[seq.int(1L, length(y), 2L)]
if (x[[1]] == x[[2]]) {
y <- y[-1]
}
y
}
inflect <- function(x, threshold = 1){ #threshold - how big a local maxima or minima
up <- sapply(1:threshold, function(n) c(x[-(seq(n))], rep(NA, n)))
down <- sapply(-1:-threshold, function(n) c(rep(NA,abs(n)), x[-seq(length(x), length(x) - abs(n) + 1)]))
a <- cbind(x,up,down)
list(minima = which(apply(a, 1, min) == a[,1]), maxima = which(apply(a, 1, max) == a[,1]))
}
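# Example (hypothetical vector): inflect(c(1, 3, 2, 5, 4), threshold = 1) returns
# minima = 3 and maxima = c(2, 4), i.e. the indices of local minima and maxima;
# a larger threshold keeps only the more pronounced turning points.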
betweenVector <- function(vector = nodes1,left = rangesTravList[,2],right = rangesTravList[,3]){
return(apply(sapply(vector, data.table::between,left ,right),2,which))
}
callback_lr <- keras::callback_lambda(on_train_begin=callback_lr_init, on_batch_begin=callback_lr_set)
callback_logger <- keras::callback_lambda(on_batch_end=callback_lr_log)
n_iter <- ceiling(epochs_find_LR * (NROW(testPics)/batch_size2))
growth_constant <- 15
if(n_iter < epochs_find_LR){
n_iter <- epochs_find_LR
}
# the learning-rate schedule for the range test is an exponential sweep:
l_rate <- exp(seq(0, growth_constant, length.out=n_iter))
l_rate <- l_rate/max(l_rate)
l_rate <- l_rate * lr_max
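# With growth_constant = 15, l_rate sweeps exponentially from about lr_max * exp(-15)
# (roughly 3e-7 * lr_max) up to lr_max, covering more than six orders of magnitude.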
plot(l_rate, type="b", pch=16, cex=0.1, xlab="iteration", ylab="learning rate")
plot(l_rate, type="b", log="y",pch=16, cex=0.1, xlab="iteration", ylab="learning rate (log scale)")
callback_log_acc_lr <- LogMetrics$new()
model <- kerasCreateModelCNN_2d(optimizer = optimizer2)
history <- model %>% generics::fit(
testPics,
trainLabels,
batch_size = batch_size2,
epochs = epochs_find_LR,
shuffle = TRUE,
callbacks = list(callback_lr, callback_logger, callback_log_acc_lr),
verbose = 2,
validation_split=validation_split2)
plot(lr_hist,
callback_log_acc_lr$acc,
log="x", type="l", pch=16, cex=0.3,
xlab="learning rate", ylab="accuracy: rollmean(100)")
rm2 <- round(length(lr_hist)/rollmeanSplit)
lrRM <- zoo::rollmean(lr_hist, rm2)
accRM <- zoo::rollmean(callback_log_acc_lr$acc, rm2)
accDforig <- as.data.frame(cbind(lr = lr_hist, acc= callback_log_acc_lr$acc))
accDf <- as.data.frame(cbind(lr = lrRM, acc= accRM))
plot(accDf$lr,
accDf$acc,
log="x", type="l", pch=16, cex=0.3,
xlab="learning rate", ylab="accuracy: rollmean(100)")
threshold1 <- 3
bottoms <- lapply(1:threshold1, function(x) inflect(accDf$acc, threshold = x)$minima)
tops <- lapply(1:threshold1, function(x) inflect(accDf$acc, threshold = x)$maxima)
# a threshold of 2 usually gives good minima/maxima
bottoms1 <- bottoms[[2]]
tops1 <- tops[[2]]
minMax <- suppressWarnings(sapply(seq_along(bottoms1), function(x){
top1 <- tops1[max(which(data.table::between(x = tops1,lower = bottoms1[x], upper =bottoms1[x+1])))]
cbind(bottoms1[x],top1)
}))
for(x in(1:length(minMax[1,]))){
if(is.na(minMax[1,x])){
minMax[1,x] <- 1
}
}
for(x in(1:length(minMax[1,]))){
if(is.na(minMax[2,x])){
minMax[2,x] <- length(accDf$acc)
}
}
# colSums(minMax)
# longest slope method
diffCols <- minMax[2,] - minMax[1,]
longestSlope <- which(diffCols == max(diffCols,na.rm = T))
bords <- minMax[,longestSlope]
Learning_rate_h <- signif.floor(accDf$lr[bords][2],n=1)
Learning_rate_l <- signif.ceiling(accDf$lr[bords][1],n=1)
abline(v=Learning_rate_l, col="blue")
abline(v=Learning_rate_h, col="red")
assign("Learning_rate_l", Learning_rate_l, envir = .GlobalEnv)
assign("Learning_rate_h", Learning_rate_h, envir = .GlobalEnv)
assign("accDforig", accDforig, envir = .GlobalEnv)
print("if the steepest positive slope is not the largest calculation, please assign Learning_rate_h or Learning_rate_l yourself.")
}
#' Optimizes the cyclical learning rate (CLR) based convolutional neural network (CNN) model
#'
#' The function optimizes the cyclical learning rate (CLR) based convolutional neural network (CNN) model based on the hyperparameters assigned
#' @param optimizer2 Optimizer
#' @param denseLoop2 Number of times to iterate the dense layers loops
#' @param trainLabels2 trainLabels object generated by kerasCreateDataset_2d
#' @param epochs2 Number of epochs in keras
#' @param batch_size2 Batch size in keras
#' @param dropout2 Keras dropout
#' @param patience2 EarlyStopping patience
#' @param validation_split2 Fraction of the dataset that will be used for validation
#' @param convolutionalLoop2 Number of times to iterate the convolutional layers loop
#' @param NO_pooling2 Number of times to perform pooling in the convolutional layers loop (max = 2)
#' @param testPics2 Image info object generated by kerasCreateDataset_2d
#' @param Learning_rate_l2 Minimum CLR assigned from tuneCLR
#' @param Learning_rate_h2 Maximum CLR assigned from tuneCLR
#' @param pathOut path where the generated model(s) will be saved
#' @keywords runCLR
#' @export
#' @import magrittr
#' @examples
#' runCLR(optimizer2 = keras::optimizer_sgd(lr = 0.1, decay = 0),
#' denseLoop2,
#' trainLabels2 = trainLabels,
#' epochs2 = 5,
#' batch_size2,
#' dropout2 = 0.2,
#' patience2 = 30,
#' validation_split2 = 0.2,
#' convolutionalLoop2 ,
#' NO_pooling2 = 1,
#' testPics2 = testPics,
#' Learning_rate_l2 = Learning_rate_l,
#' Learning_rate_h2 = Learning_rate_h,
#' pathOut = "keras/3categories/bayesmodels/"
#' )
runCLR <- function(optimizer2 = keras::optimizer_sgd(lr=lr_max1, decay=0),
denseLoop2,
trainLabels2 = trainLabels,
epochs2 = 5,
batch_size2,
dropout2 = 0.2,
patience2 = 30,
validation_split2 = 0.2,
convolutionalLoop2 ,
NO_pooling2 = 1,
testPics2 = testPics,
Learning_rate_l2 = Learning_rate_l,
Learning_rate_h2 = Learning_rate_h,
pathOut = paste0(homePath1, "/bayesmodels/")
){
#adapted from http://thecooldata.com/2019/01/learning-rate-finder-with-cifar10-keras-r/
###########################Explanation#####################
#Function used to run CLR
Cyclic_LR <- function(iteration=1:32000,
base_lr=1e-5,
max_lr=1e-3,
step_size=2000,
mode='triangular',
gamma=1,
scale_fn=NULL,
scale_mode='cycle'){
  # translated from python to R, original at: https://github.com/bckenstler/CLR/blob/master/clr_callback.py
  # This callback implements a cyclical learning rate policy (CLR).
  # The method cycles the learning rate between two boundaries with
  # some constant frequency, as detailed in this paper (https://arxiv.org/abs/1506.01186).
  # The amplitude of the cycle can be scaled on a per-iteration or per-cycle basis.
  # The function has the following built-in policies:
  # - "triangular": A basic triangular cycle w/ no amplitude scaling.
  # - "triangular2": A basic triangular cycle that scales initial amplitude by half each cycle.
  # - "exp_range": A cycle that scales initial amplitude by gamma**(cycle iterations) at each cycle iteration.
  # - "sinus": A sinusoidal form cycle
  #
  # Example
  # > clr <- Cyclic_LR(iteration=1:20000, base_lr=0.001, max_lr=0.006, step_size=2000, mode='triangular')
  # > plot(clr, cex=0.2)
  # Custom scaling functions (with a maximum output value of 1) are also supported:
  # > clr_fn <- function(x) 1/x
  # > clr <- Cyclic_LR(iteration=1:20000, base_lr=0.001, max_lr=0.006, step_size=400,
  #                    scale_fn=clr_fn, scale_mode='cycle')
  # > plot(clr, cex=0.2)
  #
  # Arguments
# iteration:
# if is a number:
# id of the iteration where: max iteration = epochs * (samples/batch)
# if "iteration" is a vector i.e.: iteration=1:10000:
# returns the whole sequence of lr as a vector
# base_lr: initial learning rate which is the
# lower boundary in the cycle.
# max_lr: upper boundary in the cycle. Functionally,
# it defines the cycle amplitude (max_lr - base_lr).
# The lr at any cycle is the sum of base_lr
# and some scaling of the amplitude; therefore
# max_lr may not actually be reached depending on
# scaling function.
# step_size: number of training iterations per
# half cycle. Authors suggest setting step_size
# 2-8 x training iterations in epoch.
# mode: one of {triangular, triangular2, exp_range, sinus}.
# Default 'triangular'.
# Values correspond to policies detailed above.
# If scale_fn is not None, this argument is ignored.
# gamma: constant in 'exp_range' scaling function:
# gamma**(cycle iterations)
# scale_fn: Custom scaling policy defined by a single
# argument lambda function, where
# 0 <= scale_fn(x) <= 1 for all x >= 0.
# mode parameter is ignored
# scale_mode: {'cycle', 'iterations'}.
# Defines whether scale_fn is evaluated on
# cycle number or cycle iterations (training
# iterations since start of cycle). Default is 'cycle'.
########
if(is.null(scale_fn)==TRUE){
if(mode=='triangular'){scale_fn <- function(x) 1; scale_mode <- 'cycle';}
if(mode=='triangular2'){scale_fn <- function(x) 1/(2^(x-1)); scale_mode <- 'cycle';}
if(mode=='exp_range'){scale_fn <- function(x) gamma^(x); scale_mode <- 'iterations';}
if(mode=='sinus'){scale_fn <- function(x) 0.5*(1+sin(x*pi/2)); scale_mode <- 'cycle';}
}
lr <- list()
if(is.vector(iteration)==TRUE){
for(iter in iteration){
cycle <- floor(1 + (iter / (2*step_size)))
x2 <- abs(iter/step_size-2 * cycle+1)
if(scale_mode=='cycle') x <- cycle
if(scale_mode=='iterations') x <- iter
lr[[iter]] <- base_lr + (max_lr-base_lr) * max(0,(1-x2)) * scale_fn(x)
}
}
lr <- do.call("rbind",lr)
return(as.vector(lr))
}
rib <- function(code){
  # run code asynchronously in a background process via mcparallelDo
  # (used below to save models without blocking the optimization loop)
  ga2 <- mcparallelDo::mcparallelDo(code, targetValue = "ga")
  return(ga2)
}
LogMetrics <- R6::R6Class("LogMetrics",
inherit = keras::KerasCallback,
public = list(
loss = NULL,
acc = NULL,
on_batch_end = function(batch, logs=list()) {
self$loss <- c(self$loss, logs[["loss"]])
self$acc <- c(self$acc, logs[["accuracy"]])
}
))
as_metrics_df = function(history) {
# create metrics data frame
df <- as.data.frame(history$metrics)
# pad to epochs if necessary
pad <- history$params$epochs - nrow(df)
pad_data <- list()
for (metric in history$params$metrics)
pad_data[[metric]] <- rep_len(NA, pad)
df <- rbind(df, pad_data)
# return df
df
}
callback_lr_init <- function(logs){
iter <<- 0
lr_hist <<- c()
iter_hist <<- c()
}
callback_lr_set <- function(batch, logs){
iter <<- iter + 1
LR <- l_rate[iter] # if number of iterations > l_rate values, make LR constant to last value
if(is.na(LR)) LR <- l_rate[length(l_rate)]
keras::k_set_value(model$optimizer$lr, LR)
}
callback_lr_log <- function(batch, logs){
lr_hist <<- c(lr_hist, keras::k_get_value(model$optimizer$lr))
iter_hist <<- c(iter_hist, keras::k_get_value(model$optimizer$iterations))
}
callback_lr <- keras::callback_lambda(on_train_begin=callback_lr_init, on_batch_begin=callback_lr_set)
callback_logger <- keras::callback_lambda(on_batch_end=callback_lr_log)
n_iter <- ceiling(epochs2 * (NROW(testPics2)/batch_size2))
if(n_iter < 75){
n_iter <-75
}
dir.create(pathOut,showWarnings = F)
dir.create(paste0(pathOut,"pdf/"),showWarnings = F)
l_rate <- Cyclic_LR(iteration=1:n_iter, base_lr=Learning_rate_l2, max_lr=Learning_rate_h2, step_size=floor(n_iter/75),
mode='exp_range', gamma=0.99997, scale_fn=NULL, scale_mode='cycle')
plot(l_rate, type="b", pch=16, xlab="iter", cex=0.2, ylab="learning rate", col="black")
earlyS <- kerasR::EarlyStopping(patience = patience2)
model <- kerasCreateModelCNN_2d(optimizer = optimizer2,
denseLoop = denseLoop2,
testPics = testPics2,
trainLabels = trainLabels2,
epochs = epochs2,
batch_size = batch_size2,
dropout = dropout2,
patience = patience2,
validation_split = validation_split2,
convolutionalLoop = convolutionalLoop2,
NO_pooling = NO_pooling2
)
# fit without data_augmentation ---------------------
callback_log_acc_clr2 <- LogMetrics$new()
history <- model %>% generics::fit(
testPics2,
trainLabels2,
batch_size = batch_size2,
epochs = epochs2,
validation_split = validation_split2,
shuffle = TRUE,
callbacks = list(callback_lr,callback_logger,callback_log_acc_clr2,earlyS),#
verbose = 2
)
valAcc <- history$metrics$val_accuracy[length(history$metrics$val_accuracy)]
valLoss <- history$metrics$val_loss[length(history$metrics$val_loss)]
acc <- history$metrics$accuracy[length(history$metrics$accuracy)]
loss <- history$metrics$loss[length(history$metrics$loss)]
epoCount <- length(history$metrics$accuracy)
rib(
model %>% keras::save_model_hdf5(paste0(pathOut,as.character(count1),"_va_", round(valAcc,digits = 2),
"_vl_", round(valLoss,digits = 2),
"_up_", round(denseLoop2,digits = 2),
"_ep_", round(epoCount, digits = 2),
"_bz_", round(batch_size2, digits = 2),
"_dr_", round(dropout2, digits = 2),
"_pa_", round(patience2, digits = 2),
"_vs_", round(validation_split2,digits = 2),
"_fi_", round(convolutionalLoop2, digits = 2),
"_po_", round(NO_pooling2, digits = 2), ".h5")
))
his <- as_metrics_df(history)
xrange <- 1:nrow(his)
meltedAcc <- reshape2::melt(cbind(his[,c(2,4)],xrange), id.vars=c("xrange"))
meltedLoss <- reshape2::melt(cbind(his[,c(1,3)],xrange), id.vars=c("xrange"))
p1 <- ggplot2::ggplot(meltedAcc, ggplot2::aes(x= xrange, y = value,group=variable,
colour=variable)) +
ggplot2::geom_line()+
ggplot2::geom_point()+
ggplot2::theme_bw()+
ggplot2::scale_x_continuous( name="Epoch", limits=c(0, nrow(his)+1),expand = c(0, 0),breaks=seq(0,nrow(his),50) )+
ggplot2::scale_y_continuous( name="Accuracy",expand = c(0, 0),limits=c(0,1))
p1
p2 <- ggplot2::ggplot(meltedLoss, ggplot2::aes(x= xrange,
y = value,
group=variable,
colour=variable))+
ggplot2::geom_line()+
ggplot2::geom_point()+
ggplot2::theme_bw()+
ggplot2::scale_x_continuous( name="Epoch", limits=c(0, nrow(his)+1),expand = c(0, 0), breaks=seq(0,nrow(his),50))+
ggplot2::scale_y_continuous( name="Loss",expand = c(0, 0))
p2
p3 <- cowplot::plot_grid(p1, p2, labels = c('A', 'B'))
grDevices::pdf(paste0(pathOut,"pdf/",as.character(count1),".pdf"))
print(p3)
grDevices::dev.off()
count1 <- count1+1
assign("count1", count1, envir = .GlobalEnv)
return(list(Score = 1/loss,Pred = 1/valLoss))
}
#' Constructs a list of the true cleavages
#'
#' Searches for the true cleavages in dirs called true at the depth of recursion
#' The list can be used to filter away false cleavages from the original dataset
#' @param pathToTrue Path to where to recursively look for true cleavages (dirs called true)
#' @param recursionLevel Depth of recursion at which the true dirs should be found
#' @keywords smartPARE_ListTrue
#' @export
#' @examples
#' smartPARE_ListTrue(pathToTrue = "keras/3categories/",
#' recursionLevel = 2 )
smartPARE_ListTrue <- function(pathToTrue = "data/example/", recursionLevel = 2 ){
###########################Explanation#####################
#Function used to gather info about what pictures are true hits
findNthCharFromEnd <- function(pattern,string,n){
stringBack <- rev(unlist(gregexpr(pattern, string)))[n]
return(stringBack)
}
if(recursionLevel == 0){
nam <- pathToTrue
}else{
nam <- substring(pathToTrue,findNthCharFromEnd(pattern = "/", string = pathToTrue, n = recursionLevel)+1)
}
fileToTrue <- gsub(pattern = "/", replacement = "_", x = nam)
inPotFiles <- list.files(pathToTrue,recursive = T)
inPotFilesT <- inPotFiles[grep(x = inPotFiles, pattern = "true")]
fileRes <- paste0(fileToTrue,"results/")
resultDir <- paste0(pathToTrue,fileRes)
dir.create(resultDir)
if(any(startsWith(inPotFilesT,prefix = fileRes))) inPotFilesT <- inPotFilesT[-which(startsWith(inPotFilesT,prefix = fileRes))]
if(any(startsWith(inPotFilesT,prefix = "trueCleavagesDf"))) inPotFilesT <- inPotFilesT[-which(startsWith(inPotFilesT,prefix = "trueCleavagesDf"))]
write(x = inPotFilesT, file = paste0(resultDir,"true.txt"), sep = "\t" , append = F)#, row.names = F, col.names = F)
}
#' Parses the smartPARE output file into a dataset
#'
#' Parses the smartPARE output file into a dataset based on the path of each true image
#' @param smartPAREresultFile smartPARE output filepath
#' @keywords smartPARE_parse
#' @export
#' @examples
#' smartPARE_parse(smartPAREresultFile = paste0(homePath1,"example_results/true.txt"))
#'
smartPARE_parse <- function(smartPAREresultFile = paste0(homePath1,"example_results/true.txt")){
truePath <- readLines(con = smartPAREresultFile)
#splits str in pieces
undrLines <- length(gregexpr(truePath[1],pattern = "_")[[1]])
trueCleavagesDf <- stringr::str_split_fixed(truePath, "_", undrLines+1)
transc <- stringr::str_split_fixed(trueCleavagesDf[,undrLines-1], "/", 3)
transc2 <- transc[,c(3)]
upperPath <- trueCleavagesDf[,undrLines-2]
upperPath2 <- length(gregexpr(upperPath[1],pattern = "/")[[1]])
dsName <- stringr::str_split_fixed(trueCleavagesDf[,undrLines-2], "/", upperPath2)
dsName2 <- dsName[upperPath2]
trueCleavagesDf2 <- data.frame(cbind(dsName2,
trueCleavagesDf[,c(3,4)],
transc2),
stringsAsFactors = F)
colnames(trueCleavagesDf2) <- c("ds", "posT", "ext", "transcript")
return(trueCleavagesDf2)
}
#' Train the smartPARE CNN model
#'
#'
#' Builds the CNN model based on directories with cleavage images: train/goodUp (true cleavages on the 5' strand), train/goodDown (true cleavages on the 3' strand) and train/bad (false cleavages)
#' @param homePath1 Path to the directory that contains a train directory with the subdirs goodUp (true cleavages on the 5' strand), goodDown (true cleavages on the 3' strand) and bad (false cleavages)
#' @param pixels1 Number of pixels to convert each image to
#' @param search_bound List of min and max values for the following variables: denseLoop2, epochs2, batch_size2, validation_split2, convolutionalLoop2 and NO_pooling2
#' @param n_iter Number of iterations to run the Bayesian optimization
#' @keywords smartPARE_train
#' @export
#' @examples
#' smartPARE_train(homePath1 = "example/",
#' pixels1 = 28,
#' search_bound = list(denseLoop2 = c(0,4),
#' epochs2 = c(100, 300),
#' batch_size2 = c(32,128),
#' dropout2 = c(0, 0.3),
#' validation_split2 = c(0.1,0.4),
#' convolutionalLoop2 = c(1,4),
#' NO_pooling2 = c(1,2)
#' ),
#' n_iter = 100)
smartPARE_train <- function(homePath1 = paste0(system.file("example/",package = "smartPARE"),"/"),
pixels1 = 28,
search_bound = list(denseLoop2 = c(0,4),
epochs2 = c(100, 300),
batch_size2 = c(32,128),
dropout2 = c(0, 0.3),
validation_split2 = c(0.1,0.4),
convolutionalLoop2 = c(1,4),
NO_pooling2 = c(1,2)
),
n_iter = 100){
getStartVals <- function(search_grid1 = search_grid[,4]){
digits <- function(x) nchar( trunc( abs(x) ) )
has.decimal <- function(x) return(grepl("[\\.][[:digit:]]",x))
range1 <- as.numeric(unlist(search_grid1))
if(!any(has.decimal(range1))){
startVals <- sort(sample(seq(range1[1],range1[2]), 2,replace = F))
}else{
nDigits <- digits(range1[1])
startVals <- sort(round(runif(min = range1[1], max = range1[2], n = 2 ), digits = nDigits+1))
}
return(startVals)
}
as.numeric.if.possible <- function(potentialNumber){
if(!is.na(suppressWarnings(as.numeric(potentialNumber)))) return(as.numeric(potentialNumber))
else return(potentialNumber)
}
#1
#Create the training dataset
kerasCreateDataset_2d(homePath = homePath1 ,pixels = pixels1)
#2
#Tune the cyclical learning rate
lr_max1 <- 0.1
tuneCLR(batch_size2 = 64,
epochs_find_LR = 20,
lr_max = lr_max1,
optimizer2 = keras::optimizer_sgd(lr=lr_max1, decay=0), #optimizer_rmsprop(lr=lr_max, decay=0),
validation_split2 = 0.2,
rollmeanSplit = 3
)
#3
#Double check the assignment of the Learning_rates
#If you need to change the Learning_rate_l or Learning_rate_h manually
#The algorithm is a bit shaky for non-smooth curves
#Learning_rate_l should be at minimum of the curve and Learning_rate_h at max
#Learning_rate_l = 5e-04 #1e-02
#Learning_rate_h = 1*10^-3
rm1 <- 4
lrl <- "start"
lrh <- "start"
while(lrl == "replot" | lrh == "replot" | lrl == "start" | lrh == "start" ){
plot(zoo::rollmean(accDforig$lr, rm1),
zoo::rollmean(accDforig$acc, rm1),
log="x", type="l", pch=16, cex=0.3,
xlab="learning rate", ylab="accuracy: rollmean(100)")
abline(v=Learning_rate_l, col="blue")
abline(v=Learning_rate_h, col="red")
print("Lower learning rate should be assigned the bottom value of the slope" )
print("Higher learning rate should be assigned the top value of the slope" )
print("A rollmean function smoothens out the slope,")
print("a greater rollmean width -> a smoother slope (default=4)")
print("replot to change rollmean width")
print(paste0("The lower learning rate is automatically assigned ",Learning_rate_l))
lrl <- readline("Are you fine with that? (y/numeric replacement/replot) ")
lrl <- as.numeric.if.possible(lrl)
if(tolower(lrl) == "replot"){
rm1 <- NA
while(is.na(rm1)){
rm1 <- suppressWarnings(as.numeric(readline("Replotting, please assign new rollmean width. (number, default = 4) ")))
}
next
}
if(!any(lrl == "replot" | lrl == "y" | lrl == "" |
is.numeric(lrl))
){
print(paste0("You asigned lrl ", lrl, " only y/numeric/replot are ok"))
lrl <- "replot"
cat("n/")
next
}
print(paste0("The higher learning rate is automatically assigned ",Learning_rate_h))
lrh <- readline("Are you fine with that? (y/Numeric replacement/replot) ")
lrh <- as.numeric.if.possible(lrh)
if(tolower(lrh) == "replot"){
rm1 <- NA
while(is.na(rm1)){
rm1 <- suppressWarnings(as.numeric(readline("Replotting, please assign new rollmean width. (number) ")))
}
next
}
if(!any(lrh == "replot" | lrh == "y" | lrh == "" | is.numeric(lrh))){
print(paste0("You asigned lrh ", lrh, " only y/numeric/replot are ok"))
lrh <- "replot"
cat("n/")
next
}
}
if(is.numeric(lrl)) Learning_rate_l <- lrl
if(is.numeric(lrh)) Learning_rate_h <- lrh
#4
#automatic generation of start grid
search_grid <- lapply(search_bound,getStartVals)
#5
#Initiate the model counting, it is run in the global environment
#and run the Bayesian optimization
#Make sure to set any other input-variable defaults of your choice
#in the function definition of runCLR,
#for instance pathOut - the dir of your output
count1 <-1
assign("homePath1", homePath1, envir = .GlobalEnv)
assign("lr_max1", lr_max1, envir = .GlobalEnv)
assign("count1", count1, envir = .GlobalEnv)
bayes_ucb <-
rBayesianOptimization::BayesianOptimization(FUN = runCLR,
bounds = search_bound,
init_grid_dt = search_grid,
init_points = 0,
n_iter = n_iter,
acq = "ucb" #"ei" "ucb"
)
#6
#Check which model you prefer
order(bayes_ucb$Pred, decreasing = T)
bayes_ucb[which(bayes_ucb$Pred == max(bayes_ucb$Pred))]
1/max(bayes_ucb$Pred)
return(bayes_ucb)
}
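# Typical end-to-end usage (a sketch with hypothetical paths and settings; the "examine"
# directory name and the saved-model filename below are placeholders, not package defaults):
# bayes_ucb <- smartPARE_train(homePath1 = paste0(system.file("example/", package = "smartPARE"), "/"),
#                              pixels1 = 28, n_iter = 100)
# model <- keras::load_model_hdf5("example/bayesmodels/<your_best_model>.h5")
# smartPARE_examineCleavages(examinePath = "example/examine/", model = model,
#                            hitsNrs = c(1, 2), falseNrs = c(0), pixels = 28)
# smartPARE_ListTrue(pathToTrue = "example/examine/", recursionLevel = 2)
# trueDf <- smartPARE_parse(smartPAREresultFile = "example/examine_results/true.txt")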