runNGSAnalysis: NGS linkage workflow

View source: R/workflow_NGS.r

runNGSAnalysisR Documentation

NGS linkage workflow

Description

performs the NGS filtering workflow to get high quality FlyCode and Nanobody sequences linkage.

Usage

runNGSAnalysis(file, param)

Arguments

file

sequence file path

param

list of input parameters, explained in details paragraph below.

Details

The elements of the parameter list object is described as follows:

  • NB_Linker1 nucleotide sequence of the linker left to the nanobody.

  • NB_Linker2 nucleotide sequence of the linker right to the nanobody.

  • ProteaseSite nucleotide sequence left to the flycode.

  • FC_Linker nucleotide sequence right to the flycode.

  • knownNB known nanobody sequences in the experiment.

  • nReads number of Reads from the start of fastq file to process.

  • minRelBestHitFreq minimal fraction of the dominant nanobody for a specific flycode.

  • minConsensusScore minimal fraction per sequence position in nanabody consensus sequence calculation.

  • maxMismatch number of accepted mismatches for all pattern search steps.

  • minNanobodyLength minimal nanobody length in [nt].

  • minFlycodeLength minimal flycode length in [nt].

  • FCminFreq minimal number of subreads for a specific flycode to keep it in the analysis.

missing elements are replace by the example provided values.

Value

uniqNB2FC dataframe

Author(s)

Lennart Opitz <lopitz@fgcz.ethz.ch>, 2019

Examples

library(ExperimentHub)
eh <- ExperimentHub()
expFile <- query(eh, c("NestLink", "NL42_100K.fastq.gz"))[[1]]
knownNB_File <- query(eh, c("NestLink", "knownNB.txt"))[[1]]    
knownNB_data <- read.table(knownNB_File, sep='\t', header = TRUE,
    row.names = 1, stringsAsFactors = FALSE)
knownNB <- Biostrings::translate(DNAStringSet(knownNB_data$Sequence))
names(knownNB) <- rownames(knownNB_data)
knownNB <- sapply(knownNB, toString)
param <- list()
param[['NB_Linker1']] <- "GGCCggcggGGCC"
param[['NB_Linker2']] <- "GCAGGAGGA"
param[['ProteaseSite']] <- "TTAGTCCCAAGA"
param[['FC_Linker']] <- "GGCCaaggaggcCGG"
param[['knownNB']] <- knownNB
param[['nReads']] <- 10000
param[['minRelBestHitFreq']] <- 0.8
param[['minConsensusScore']] <- 0.9
param[['maxMismatch']] <- 1
param[['minNanobodyLength']] <- 348
param[['minFlycodeLength']] <- 33
param[['FCminFreq']] <- 1
runNGSAnalysis(file = expFile[1], param)

cpanse/NestLink documentation built on May 16, 2022, 2:33 a.m.