estimateOffTargets | R Documentation |
Functions to use BLAST to align TAP-seq primers against a genome and chromosome reference to estimate potential off-target binding sites.
createBLASTDb(
genome,
annot,
blastdb,
standard_chromosomes = TRUE,
tx_id = "transcript_id",
tx_name = "transcript_name",
gene_name = "gene_name",
gene_id = "gene_id",
title = "TAP-seq_GT_DB",
verbose = FALSE,
makeblastdb = getOption("TAPseq.makeblastdb")
)
blastPrimers(
object,
blastdb,
max_mismatch = 0,
min_aligned = 0.75,
primer_targets = c("transcript_id", "transcript_name", "gene_id", "gene_name"),
tmpdir = tempdir(),
blastn = getOption("TAPseq.blastn")
)
## S4 method for signature 'TsIO'
blastPrimers(
object,
blastdb,
max_mismatch = 0,
min_aligned = 0.75,
primer_targets = c("transcript_id", "transcript_name", "gene_id", "gene_name"),
tmpdir = tempdir(),
blastn = getOption("TAPseq.blastn")
)
## S4 method for signature 'TsIOList'
blastPrimers(
object,
blastdb,
max_mismatch = 0,
min_aligned = 0.75,
primer_targets = c("transcript_id", "transcript_name", "gene_id", "gene_name"),
tmpdir = tempdir(),
blastn = getOption("TAPseq.blastn")
)
genome |
A |
annot |
A |
blastdb |
TAP-seq BLAST database created with
|
standard_chromosomes |
(logical) Specifies whether only standard chromosomes should be included in output genome sequences (e.g. chr1-22, chrX, chrY, chrM for homo sapiens). |
tx_id, tx_name, gene_name, gene_id |
(character) Column names in annot metadata containing transcript id, transcript name, gene name and gene id information. |
title |
Optional title for BLAST database. |
verbose |
(logical) If |
makeblastdb |
Path to the |
object |
A |
max_mismatch |
Maximum number of mismatches allowed for off-target hits (default: 0). |
min_aligned |
Minimum portion of the primer sequence starting from the 3' end that must align for off-target hits (default: 0.75). |
primer_targets |
Specifies what should be used to identify primer targets for off-target
identification. I.e. to what does the |
tmpdir |
Directory needed to store temporary files. |
blastn |
Path (character) to the |
createBLASTDb
creates a BLAST database containing genome and transcriptome sequences,
which is required by blastPrimers
. The created database contains both sequence files for
BLAST and annotations to process the results.
Use blastPrimers
to align designed TAP-seq primers against the created database to
estimate off-target priming potential. Only hits where at least a specified portion of the
sequence involving the 3' end of the primer aligns with not more than a certain number of
mismatches are considered.
blastPrimers
counts the number of genes in which a primer has 1) exonic hits or 2)
intronic hits, or 3) the number of hits in intergenic regions of the genome. The exonic and
intronic counts should be interptreted as: "In how many genes does a primer have exonic
(or intronic) hits?".
If a BLAST hit falls in both intronic and exonic regions of a given gene (i.e. exonic for one transcript, intronic for another transcript), only the exonic hit is counted for that gene. If a primer has for instance 3 BLAST hits in one gene, 2 exonic and 1 intronic, then one exonic hit and one intronic hit is counted for that gene.
If sequence IDs of the designed primers (sequence_id
) refer to
the target gene/transcripts and can be found in the BLAST database annotations via
primer_targets
, then only off-target hits are counted. This is usually the case if input
for primer design was produced from target gene annotations.
For createBLASTDb
a directory containing the BLAST database. For
blastPrimers
a TsIO
or
TsIOList
object with the number of potential off-targets
added to the TAP-seq primer metadata.
createBLASTDb()
: Create a genome and transcriptome TAP-seq BLAST database
blastPrimers(TsIO)
: BLAST primers in a TsIO
object
blastPrimers(TsIOList)
: BLAST primers in a TsIOList
object
## Not run:
library(BSgenome)
# human genome (hg38) BSgenome object
hg38 <- getBSgenome("BSgenome.Hsapiens.UCSC.hg38")
# get annotations for BLAST
annot_url <- paste0("ftp://ftp.ebi.ac.uk/pub/databases/gencode/Gencode_human/release_32/",
"gencode.v32.annotation.gtf.gz")
annot <- import(annot_url, format = "gtf")
blast_exons <- annot[annot$type == "exon" & annot$gene_type == "protein_coding"]
# build BLAST database
blastdb <- file.path(tempdir(), "blastdb")
createBLASTDb(genome = hg38, annot = blast_exons, blastdb = blastdb)
# chr11 primers example data (already contains off-targets, but we can overwrite them)
data("chr11_primers")
chr11_primers <- chr11_primers[1:3] # only use a small subset for this example
# run blast to identify potential off-targets
chr11_primers <- blastPrimers(chr11_primers, blastdb = blastdb)
tapseq_primers(chr11_primers)
# allow 1 mismatch between primer and off-target
chr11_primers <- blastPrimers(chr11_primers, blastdb = blastdb, max_mismatch = 1)
tapseq_primers(chr11_primers)
## End(Not run)
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.