DeepG4: DeepG4 main function to predict a probability to form a G4,...

DeepG4 R Documentation

DeepG4 main function to predict a probability to form a G4, given a DNA sequence.

Description

The main function of DeepG4: given a DNA sequence, it predicts the probability that the sequence forms a G4.

Usage

DeepG4(
  X = NULL,
  X.atac = NULL,
  Y = NULL,
  model = NULL,
  lower.case = F,
  treshold = 0.5,
  seq.size = 201,
  retrain = FALSE,
  retrain.path = "",
  log_odds = F
)

Arguments

X

An object of class character, list, DNAStringSet or DNAStringSetList containing DNA sequences.

X.atac

A numeric vector of average accessibility, one value per sequence, with the same length as X (defaults to NULL).

Y

A numeric vector of 1 and 0 values, one per sequence in X, used for evaluation and retraining (defaults to NULL).

model

A path to a keras model in hdf5 format (defaults to NULL). Don't change it unless you want to use this function with a custom model.

lower.case

A boolean. Set to TRUE if the elements of X are in lower case (defaults to FALSE).

treshold

A numeric value defining the threshold used to compute the confusion matrix (defaults to 0.5).

seq.size

A numeric value giving the sequence size accepted by the model (defaults to 201). Don't change it unless you want to use this function with a custom model.

retrain

A boolean. Set to TRUE to retrain the model with your own dataset; Y must be provided (defaults to FALSE).

retrain.path

Path of the file where the retrained model will be saved (defaults to "").

log_odds

A boolean. If set to TRUE, return the logarithm of the odds (the output of the layer before the sigmoid activation) instead of the probability. Use it only to compute a deltaScore between two sequences. Defaults to FALSE.
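
The relationship between a probability and its log-odds can be sketched in base R, without DeepG4 itself. The numbers below are invented for illustration, and defining the delta score as a difference of log-odds between two sequences is an assumption about what "deltaScore" means here.

```r
# Hypothetical probabilities for a reference and an altered sequence
# (made-up numbers, not DeepG4 output).
p_ref <- 0.90
p_alt <- 0.60

# The log-odds (logit) is the inverse of the sigmoid: log(p / (1 - p)).
logit_ref <- qlogis(p_ref)
logit_alt <- qlogis(p_alt)

# Assumed definition of a delta score: difference in log-odds.
delta_score <- logit_alt - logit_ref

# Applying the sigmoid (plogis) recovers the original probabilities.
recovered <- plogis(c(logit_ref, logit_alt))
```

Working on the log-odds scale avoids the compression of the sigmoid near 0 and 1, which is presumably why it is suggested for comparing two sequences.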

Details

This function is a wrapper that returns predictions from our DeepG4 model for any DNA sequence(s) over the alphabet ACGTN. You don't have to use it to get a DeepG4 prediction: if you're familiar with keras and tensorflow, you can access our model, stored in hdf5 format in the package, using system.file("extdata", "", package = "DeepG4"). In addition, DNAToNumerical can help you produce the one-hot encoding that our model needs as input. Sequences longer than seq.size are cropped, and sequences shorter than seq.size are filled with zero padding.
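
To make the cropping, zero padding and one-hot encoding concrete, here is a minimal base R sketch. The helper one_hot_fixed is hypothetical and is not the package's DNAToNumerical. Which end of a long sequence is cropped, where the padding goes, and how N is encoded are all assumptions made for illustration; the package may behave differently.

```r
# Hypothetical helper (not part of DeepG4): one-hot encode a DNA string
# into a seq.size x 4 matrix with columns A, C, G, T.
one_hot_fixed <- function(dna, seq.size = 201) {
  bases <- strsplit(toupper(dna), "")[[1]]
  # Assumption: crop by keeping the first seq.size bases.
  if (length(bases) > seq.size) bases <- bases[seq_len(seq.size)]
  alphabet <- c("A", "C", "G", "T")
  mat <- matrix(0, nrow = seq.size, ncol = 4, dimnames = list(NULL, alphabet))
  # Assumption: N (or any other symbol) is left as an all-zero row.
  hit <- match(bases, alphabet)
  keep <- !is.na(hit)
  mat[cbind(which(keep), hit[keep])] <- 1
  # Rows beyond length(bases) stay zero: this is the zero padding.
  mat
}

short <- one_hot_fixed("ACGTN", seq.size = 8)  # padded up to 8 rows
long  <- one_hot_fixed("GGGGAAAA", seq.size = 4)  # cropped down to 4 rows
dim(short)
rowSums(short)
```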

Value

If Y = NULL, returns the DeepG4 prediction for each element of X. If Y is provided, returns a list containing the prediction for each element of X, a ggplot2 object representing the AUC, a ggplot2 object representing the confusion matrix, and some metrics.
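
As a rough illustration of how a threshold turns probabilities into a confusion matrix, the base R sketch below uses invented probabilities and labels. It is not the code DeepG4 runs internally, and treating values strictly above the threshold as class 1 is an assumption.

```r
# Made-up predicted probabilities and true labels (not DeepG4 output).
prob <- c(0.92, 0.15, 0.67, 0.40, 0.81, 0.05)
Y    <- c(1,    0,    1,    1,    0,    0)

treshold <- 0.5  # spelled as in the DeepG4 argument name

# Assumption: a probability strictly above the threshold counts as class 1.
predicted <- as.integer(prob > treshold)

# Confusion matrix: rows are predicted classes, columns are true classes.
conf_mat <- table(
  predicted = factor(predicted, levels = c(0, 1)),
  truth     = factor(Y, levels = c(0, 1))
)
conf_mat

# Fraction of sequences whose predicted class matches the label.
accuracy <- sum(diag(conf_mat)) / sum(conf_mat)
```

Raising the threshold trades false positives for false negatives, so the value is worth revisiting when the classes in Y are unbalanced.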

Examples

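library(Biostrings)
library(DeepG4)

# Locate the example FASTA file shipped with the package and read it
sequences <- system.file("extdata", "test_G4_data.fa", package = "DeepG4")
sequences <- readDNAStringSet(sequences)

# Predict a G4 probability for each sequence
predictions <- DeepG4(sequences)
head(predictions)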

morphos30/DeepG4 documentation built on June 11, 2022, 10:38 p.m.