In cpanse/NestLink: NestLink an R data package to guide through Engineered Peptide Barcodes for In-Depth Analyzes of Binding Protein Ensembles

knitr::opts_chunk$set(echo = TRUE)

The following content is described in more detail in @NestLink (under review, NMETH-A35040).

Load Package

library(NestLink)

Load Data

library(ExperimentHub)

eh <- ExperimentHub()

query(eh, "NestLink")

Define Input & Output-Folder

# dataFolder <- file.path(path.package(package = 'NestLink'), 'extdata')
# expFile <- list.files(dataFolder, pattern='*.fastq.gz', full.names = TRUE)

expFile <- query(eh, c("NestLink", "NL42_100K.fastq.gz"))[[1]]
scratchFolder <- tempdir()
setwd(scratchFolder)

Load known NanoBodies (NB) for QC Checks

For data QC, some known nanobodies (NB) can be spiked in. Here we load the NB DNA sequences and translate them into the corresponding amino acid (AA) sequences.

# knownNB_File <- list.files(dataFolder,
#      pattern='knownNB.txt', full.names = TRUE)
knownNB_File <- query(eh, c("NestLink", "knownNB.txt"))[[1]]

knownNB_data <- read.table(knownNB_File, sep='\t',
      header = TRUE, row.names = 1, stringsAsFactors = FALSE)

knownNB <- Biostrings::translate(Biostrings::DNAStringSet(knownNB_data$Sequence))
names(knownNB) <- rownames(knownNB_data)
knownNB <- sapply(knownNB, toString)
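
As a small illustration of the translation step above, the following sketch translates two made-up DNA sequences with Biostrings. The sequences and the names `NB_toy1` and `NB_toy2` are hypothetical and only serve to show the shape of the result, a named character vector of AA sequences.

```r
# Minimal sketch with toy sequences (not real nanobody sequences).
library(Biostrings)

toyDNA <- DNAStringSet(c("ATGGCCAAA", "ATGGGCTGG"))

# Translate each sequence codon by codon using the standard genetic code.
toyAA <- translate(toyDNA)

# Name the entries and convert to a plain character vector.
names(toyAA) <- c("NB_toy1", "NB_toy2")
toyAA <- as.character(toyAA)
print(toyAA)
```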

Set Example Parameters

To keep the processing time short, this example workflow uses only the first 100 reads.

param <- list()
param[['nReads']] <- 100 # number of reads to process, counted from the start of the FASTQ file
param[['maxMismatch']] <- 1 # number of accepted mismatches for all pattern search steps
param[['NB_Linker1']] <- "GGCCggcggGGCC" # linker sequence left of the nanobody
param[['NB_Linker2']] <- "GCAGGAGGA" # linker sequence right of the nanobody
param[['ProteaseSite']] <- "TTAGTCCCAAGA" # sequence next to the flycode
param[['FC_Linker']] <- "GGCCaaggaggcCGG" # linker sequence next to the flycode
param[['knownNB']] <- knownNB
param[['minRelBestHitFreq']] <- 0.8 # minimal fraction of the dominant nanobody for a specific flycode
param[['minConsensusScore']] <- 0.9 # minimal fraction per sequence position in the nanobody consensus sequence calculation
param[['minNanobodyLength']] <- 348 # minimal nanobody length in [nt]
param[['minFlycodeLength']] <- 33 # minimal flycode length in [nt]
param[['FCminFreq']] <- 1 # minimal number of subreads for a specific flycode to keep it in the analysis
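
To make the meaning of `maxMismatch` more concrete, here is a minimal base R sketch of a mismatch-tolerant pattern search. The helper `findWithMismatch` and the toy read are hypothetical and are not part of NestLink; the sketch counts substitutions only and ignores letter case, which is an assumption made for illustration and may differ from what the package does internally.

```r
# Hypothetical helper (not part of NestLink): start positions at which
# `pattern` occurs in `subject` with at most `maxMismatch` substitutions.
findWithMismatch <- function(pattern, subject, maxMismatch = 1) {
  p <- strsplit(toupper(pattern), "")[[1]]
  s <- strsplit(toupper(subject), "")[[1]]
  nWindows <- length(s) - length(p) + 1
  if (nWindows < 1) return(integer(0))
  starts <- seq_len(nWindows)
  # Count mismatching positions for every window of the subject.
  mismatches <- vapply(starts, function(i) {
    sum(s[i:(i + length(p) - 1)] != p)
  }, integer(1))
  starts[mismatches <= maxMismatch]
}

linker <- "GCAGGAGGA"          # same value as NB_Linker2 above
toyRead <- "TTTGCAGGTGGACCC"   # toy read; the linker carries one substitution

hitsOneMismatch <- findWithMismatch(linker, toyRead, maxMismatch = 1)
hitsExact <- findWithMismatch(linker, toyRead, maxMismatch = 0)
print(hitsOneMismatch)
print(hitsExact)
```

With one accepted mismatch the linker is found in the toy read; with zero accepted mismatches it is not.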

Run NGS Workflow

The following steps are included:

1. read FASTQ
2. filter
3. extract
4. translate into AA sequences
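
The filter and extract steps listed above can be pictured with a short base R sketch. The helper `extractBetween` and the toy read are hypothetical; the sketch uses exact, case-insensitive linker matching and a simple length cutoff, and is not the implementation used by `runNGSAnalysis`.

```r
# Hypothetical helper (not part of NestLink): return the sequence between two
# exactly matching flanking linkers, or NA if a linker is missing or the
# extracted insert is shorter than `minLength`.
extractBetween <- function(read, left, right, minLength = 1) {
  read <- toupper(read)
  left <- toupper(left)
  right <- toupper(right)
  leftPos <- as.integer(regexpr(left, read, fixed = TRUE))
  if (leftPos < 0) return(NA_character_)
  # Everything downstream of the left linker.
  rest <- substring(read, leftPos + nchar(left))
  rightPos <- as.integer(regexpr(right, rest, fixed = TRUE))
  if (rightPos < 0) return(NA_character_)
  insert <- substr(rest, 1, rightPos - 1)
  if (nchar(insert) < minLength) return(NA_character_)
  insert
}

linker1 <- "GGCCggcggGGCC"   # same value as NB_Linker1 above
linker2 <- "GCAGGAGGA"       # same value as NB_Linker2 above
toyRead <- paste0("TT", toupper(linker1), "ATGGCCAAA", linker2, "TT")

keptInsert <- extractBetween(toyRead, linker1, linker2, minLength = 9)
droppedInsert <- extractBetween(toyRead, linker1, linker2, minLength = 348)
print(keptInsert)
print(droppedInsert)
```

The toy insert passes a cutoff of 9 nt but is rejected with the cutoff of 348 nt used for `minNanobodyLength`.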

system.time(NB2FC <- runNGSAnalysis(file = expFile[1], param))
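
The roles of `minRelBestHitFreq` and `FCminFreq` can be illustrated on toy data. The following base R sketch uses hypothetical flycode and nanobody labels, and it assumes inclusive comparisons (`>=`); it is meant to convey the idea of keeping only flycodes with a clearly dominant nanobody, not to reproduce the package's filtering code.

```r
# Toy flycode-to-nanobody read assignments (hypothetical data).
toyReads <- data.frame(
  flycode  = c(rep("FC1", 10), rep("FC2", 3)),
  nanobody = c(rep("NB_A", 9), "NB_B", "NB_A", "NB_B", "NB_B"),
  stringsAsFactors = FALSE
)

# Per flycode: number of reads, dominant nanobody, and its relative frequency.
groups <- split(toyReads$nanobody, toyReads$flycode)
perFlycode <- do.call(rbind, lapply(names(groups), function(fc) {
  counts <- table(groups[[fc]])
  data.frame(
    flycode = fc,
    nReads = length(groups[[fc]]),
    bestNanobody = names(counts)[which.max(counts)],
    relBestHitFreq = max(counts) / length(groups[[fc]]),
    stringsAsFactors = FALSE
  )
}))

# Keep flycodes with enough reads and a sufficiently dominant nanobody.
minRelBestHitFreq <- 0.8
FCminFreq <- 1
kept <- perFlycode[perFlycode$relBestHitFreq >= minRelBestHitFreq &
                     perFlycode$nReads >= FCminFreq, ]
print(kept)
```

Here `FC1` is kept because 9 of its 10 reads point to the same nanobody, while `FC2` is dropped because its dominant nanobody accounts for only 2 of 3 reads.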

Create Input FASTA File for Proteomics Analysis

head(NB2FC, 2)

head(nanobodyFlycodeLinking.as.fasta(NB2FC))

Mass spectrometry is used to analyze the expressed flycodes. The FASTA file containing the nanobody-flycode linkage can be written to a file using, e.g., cat.
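
Writing FASTA text with cat could look like the sketch below. The records in `fastaLines` and the file name `toy_linkage.fasta` are hypothetical stand-ins; in the real workflow the text would come from `nanobodyFlycodeLinking.as.fasta(NB2FC)`, whose exact output format is not assumed here.

```r
# Hypothetical FASTA content: one header line and one sequence line per entry.
fastaLines <- c(">toy_NB1", "MAKGSW", ">toy_NB2", "MGWGSR")

# Write into the session's temporary directory.
fastaFile <- file.path(tempdir(), "toy_linkage.fasta")

# Terminate every line with a newline so the file ends cleanly.
cat(paste0(fastaLines, "\n"), file = fastaFile, sep = "")

# Read the file back to check what was written.
writtenLines <- readLines(fastaFile)
print(writtenLines)
```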

The exec directory of the package provides alternative shell scripts that use GNU command-line tools and AWK.

References

cpanse/NestLink documentation built on May 16, 2022, 2:33 a.m.
