nanotatoR_main_Trio_SE: Annotation and visualisation of Bionano SV, of DLE Trio...

View source: R/nanotatoRmain_Trio_SE.r

nanotatoR_main_Trio_SE    R Documentation

Annotation and visualisation of Bionano SV, of DLE Trio samples.

Description

Annotation and visualisation of Bionano SV, of DLE Trio samples.

Usage

nanotatoR_main_Trio_SE(
  smap,
  bed,
  inputfmtBed = c("bed", "BNBed"),
  n = 3,
  buildBNInternalDB = TRUE,
  mergedFiles,
  smappath,
  buildSVInternalDB = FALSE,
  path,
  pattern,
  win_indel_INF = 10000,
  win_inv_trans_INF = 50000,
  perc_similarity_INF = 0.5,
  indelconf = 0.5,
  invconf = 0.01,
  transconf = 0.1,
  perc_similarity_INF_parents = 0.9,
  hgpath,
  win_indel_DGV = 10000,
  win_inv_trans_DGV = 50000,
  perc_similarity_DGV = 0.5,
  method_entrez = c("Single", "Multiple", "Text"),
  termPath,
  term,
  thresh = 5,
  limsize = 1000,
  EnzymeType = c("SVmerge", "SE"),
  labelType = c("SVMerge", "SE", "Both"),
  SVMerge_path,
  SVMerge_pattern,
  SE_path,
  SE_pattern,
  Samplecodes,
  mergeKey,
  mergedKeyoutpath,
  mergedKeyFname,
  RNAseqcombo = TRUE,
  RNASeqDir,
  returnMethod = "dataFrame",
  RNASeqData,
  RNASeqPATH,
  pattern_Proband = NA,
  pattern_Mother = NA,
  pattern_Father = NA,
  outpath,
  outputFilename = "",
  termListPresent = TRUE,
  internalBNDB,
  clinvar,
  InternaldatabasePresent = TRUE,
  RNASeqDatasetPresent = TRUE,
  geneListPresent = TRUE,
  omim,
  gtr,
  removeClinvar = FALSE,
  removeGTR = FALSE,
  downloadClinvar = FALSE,
  downloadGTR = FALSE,
  url_gtr,
  omimID,
  RZIPpath,
  directoryName,
  fileprefix,
  datGeneListPath,
  decipherpath,
  indexfile,
  primaryGenesPresent = TRUE,
  outputType = c("Excel", "csv")
)

Arguments

smap

character. File name for the smap.

bed

Text Bionano Bed file.

inputfmtBed

character. Whether the bed input is UCSC bed or Bionano bed. Choices are "bed" and "BNBed".

n

numeric. Number of genes nearest to the breakpoint to report. Default is 3.

buildBNInternalDB

boolean. Checks whether the merged BNDB file database exists.

mergedFiles

character. Path to the merged SV files.

smappath

character. Path and file name for the text file.

buildSVInternalDB

boolean. Checks whether the merged solo file database exists.

path

character. Path to the solo file database.

pattern

character. Pattern of the file names to merge.

win_indel_INF

Numeric. Insertion and deletion error window.

win_inv_trans_INF

Numeric. Inversion and translocation error window.

perc_similarity_INF

Numeric. Threshold percentage similarity of the query SV and reference SV.

indelconf

Numeric. Threshold for insertion and deletion confidence.

invconf

Numeric. Threshold for inversion confidence.

transconf

Numeric. Threshold for translocation confidence.

perc_similarity_INF_parents

Numeric. Threshold percentage similarity for parent zygosity calculation. Default is 0.9.

hgpath

character. Path to the Database of Genomic Variants (DGV) text file.

win_indel_DGV

Numeric. Insertion and deletion error window for DGV.

win_inv_trans_DGV

Numeric. Inversion and translocation error window for DGV.

perc_similarity_DGV

Numeric. Threshold percentage similarity of the query SV and reference SV, for DGV.

method_entrez

character. Input method for terms. Choices are "Single", "Multiple" and "Text".

termPath

character. Path and file name for the text file.

term

character. Single or multiple terms.

thresh

integer. Threshold for the number of terms sent to Entrez. Note that large lists sent to NCBI might fail to be processed. Default is 5.

limsize

Numeric. Minimum size for SV. Default is 1000.

EnzymeType

character. Type of enzyme. Choices are "SVmerge" (dual-enzyme) and "SE" (single-enzyme, DLE).

labelType

character. Type of labels used for mapping. Choices are "SVMerge" (dual), "SE" (DLE) and "Both".

SVMerge_path

character. Path for the dual-labelled cmap.

SVMerge_pattern

character. Pattern of the dual files.

SE_path

character. Path for the single-enzyme (DLE) labelled cmap.

SE_pattern

character. Pattern of the single-enzyme (DLE) files.

Samplecodes

character. File containing relations and the IDs associated with them.

mergeKey

character. File containing sample ID and relation.

mergedKeyoutpath

character. File path storing sample name and nanoID key information.

mergedKeyFname

character. File name storing sample name and nanoID key information.

RNAseqcombo

boolean. Whether the RNASeq datasets are combined or not.

RNASeqDir

character. Directory for RNASeq.

returnMethod

character. Whether the output is returned as text or as a data frame. Default is "dataFrame".

RNASeqData

dataFrame. RNAseq data with gene names.

RNASeqPATH

character. RNAseq dataset path.

pattern_Proband

character. Pattern to identify the proband reads.

pattern_Mother

character. Pattern to identify the mother reads.

pattern_Father

character. Pattern to identify the father reads.

outpath

character. Directory for the output file.

outputFilename

character. Output file name.

termListPresent

logical. Checks whether a term list is provided by the user.

internalBNDB

character. Internal Bionano merged database.

clinvar

character. ClinVar file name and location.

InternaldatabasePresent

boolean. Checks whether the internal database is present.

RNASeqDatasetPresent

boolean. Checks whether the RNASeq dataset is present.

geneListPresent

logical. Checks whether a gene list is provided by the user.

omim

character. omim2gene file name and location.

gtr

character. GTR file name and location.

removeClinvar

logical. Deletes the ClinVar database if TRUE.

removeGTR

logical. Deletes the GTR database if TRUE.

downloadClinvar

logical. Downloads the ClinVar database if TRUE.

downloadGTR

logical. Downloads the GTR database if TRUE.

url_gtr

character. URL for GTR.

omimID

character. OMIM ID.

RZIPpath

character. Path to the zip executable (for example, zip.exe).

directoryName

character. Directory name where individual SV files will be stored.

fileprefix

character. Prefix to use for each of the files in the directory.

datGeneListPath

character. Path for the gene list.

decipherpath

character. DECIPHER database path.

indexfile

character. Index file containing nano ID and sample relation.

primaryGenesPresent

logical. Checks whether a primary gene list is provided by the user.

outputType

character. Whether variants are written to tabs of one Excel file or to separate csv files. Choices are "Excel" and "csv".
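
The window and similarity arguments (win_indel_INF, perc_similarity_INF and their DGV counterparts) are easier to reason about with a small example. The sketch below is illustrative only: sv_matches is a hypothetical helper, not a nanotatoR function, and it assumes that a query SV matches a reference SV when both breakpoints lie within the window and the ratio of the smaller to the larger SV size meets the similarity threshold. The package's internal rule may differ.

```r
# Hypothetical helper, not part of nanotatoR: illustrates one plausible
# reading of the "window" and "percentage similarity" thresholds.
sv_matches <- function(q_start, q_end, r_start, r_end,
                       window = 10000, perc_similarity = 0.5) {
  # Both breakpoints must lie within `window` bp of the reference breakpoints.
  within_window <- abs(q_start - r_start) <= window &
    abs(q_end - r_end) <= window
  # Size similarity (assumed definition): smaller size divided by larger size.
  q_size <- q_end - q_start
  r_size <- r_end - r_start
  similarity <- pmin(q_size, r_size) / pmax(q_size, r_size)
  within_window & similarity >= perc_similarity
}

# Breakpoints 2 kb and 3 kb apart, sizes 5 kb vs 6 kb: passes both checks.
sv_matches(100000, 105000, 102000, 108000)
# End breakpoints 15 kb apart: fails the 10 kb window.
sv_matches(100000, 105000, 100500, 120000)
# Sizes 1 kb vs 4 kb: fails the 0.5 similarity threshold.
sv_matches(100000, 101000, 100000, 104000)
```

Under these assumptions, widening the window or lowering the similarity threshold makes more database SVs count as matches for a query SV.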

Value

Excel file containing the annotated SV map, tabs divided based on type of SVs.

Text files containing the gene list and the terms associated with the genes are also written.

Examples

# Input files shipped with the package
smapName <- "GM24385_Ason_DLE1_VAP_trio5.smap"
smap <- system.file("extdata", smapName, package = "nanotatoR")
bedFile <- system.file("extdata", "HomoSapienGRCH19_lift37.bed", package = "nanotatoR")
hgpath <- system.file("extdata", "GRCh37_hg19_variants_2016-05-15.txt", package = "nanotatoR")
decipherpath <- system.file("extdata", "population_cnv.txt", package = "nanotatoR")
omim <- system.file("extdata", "mim2gene.txt", package = "nanotatoR")
clinvar <- system.file("extdata", "localPDB/", package = "nanotatoR")
gtr <- system.file("extdata", "gtrDatabase.txt", package = "nanotatoR")
mergedFiles <- system.file("extdata", "nanotatoRControl.txt", package = "nanotatoR")
indexfile <- system.file("extdata", "Sample_index.csv", package = "nanotatoR")
RNASeqDir <- system.file("extdata", "NA12878_P_Blood_S1.genes.results", package = "nanotatoR")
path <- system.file("extdata", "Bionano_config/", package = "nanotatoR")
pattern <- "_hg19.txt"
# Output location and helper executable
outputFilename <- "NA12878_DLE1_VAP_solo5_out"
outpath <- system.file("extdata", smapName, package = "nanotatoR")
RZIPpath <- system.file("extdata", "zip.exe", package = "nanotatoR")
# Annotate the trio SVs and write the result as an Excel file
nanotatoR_main_Trio_SE(
  smap = smap, bed = bedFile, inputfmtBed = c("bed"),
  n = 3, EnzymeType = c("SE"),
  buildBNInternalDB = TRUE,
  path = path, pattern = pattern,
  buildSVInternalDB = FALSE,
  decipherpath = decipherpath,
  win_indel_INF = 10000, win_inv_trans_INF = 50000,
  perc_similarity_INF = 0.5, indelconf = 0.5, invconf = 0.01,
  transconf = 0.1, perc_similarity_INF_parents = 0.9,
  hgpath = hgpath, win_indel_DGV = 10000,
  win_inv_trans_DGV = 50000,
  perc_similarity_DGV = 0.5, limsize = 1000,
  method_entrez = c("Single"),
  term = "Liver cirrhosis", RZIPpath = RZIPpath,
  omim = omim, clinvar = clinvar, gtr = gtr,
  removeClinvar = TRUE, removeGTR = TRUE,
  downloadClinvar = FALSE, downloadGTR = FALSE,
  RNASeqDatasetPresent = TRUE,
  RNAseqcombo = TRUE, geneListPresent = FALSE,
  RNASeqDir = RNASeqDir, returnMethod = "dataFrame",
  pattern_Proband = "*_P_*",
  outpath = outpath,
  indexfile = indexfile,
  primaryGenesPresent = FALSE,
  outputFilename = outputFilename,
  termListPresent = FALSE,
  InternaldatabasePresent = TRUE,
  outputType = c("Excel")
)
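
Because system.file() returns an empty string when a file is not found, a missing input only becomes apparent once the annotation run is under way. The following is a minimal sketch of a pre-flight check; missing_inputs is a hypothetical helper, not part of nanotatoR, and only two of the example inputs are shown.

```r
# Hypothetical helper, not part of nanotatoR: returns the names of
# inputs whose paths are empty or do not exist on disk.
missing_inputs <- function(paths) {
  # system.file() yields "" for files it cannot find, so treat "" as missing.
  is_missing <- !nzchar(paths) | !file.exists(paths)
  names(paths)[is_missing]
}

inputs <- c(
  smap = system.file("extdata", "GM24385_Ason_DLE1_VAP_trio5.smap",
                     package = "nanotatoR"),
  hgpath = system.file("extdata", "GRCh37_hg19_variants_2016-05-15.txt",
                       package = "nanotatoR")
)

not_found <- missing_inputs(inputs)
if (length(not_found) > 0) {
  message("Missing input files: ", paste(not_found, collapse = ", "))
}
```

Extending the named vector with the remaining paths (bed, omim, gtr and so on) checks every input before nanotatoR_main_Trio_SE is called.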

VilainLab/Nanotator documentation built on Aug. 2, 2024, 8:45 p.m.