offTargetAnalysisOfPeakRegions: Offtarget Analysis of GUIDE-seq peaks

offTargetAnalysisOfPeakRegions    R Documentation

Offtarget Analysis of GUIDE-seq peaks

Description

Finds potential offtargets around peaks from GUIDE-seq or around any given genomic regions.

Usage

offTargetAnalysisOfPeakRegions(
  gRNA,
  peaks,
  format = c("fasta", "bed"),
  peaks.withHeader = FALSE,
  BSgenomeName,
  overlap.gRNA.positions = c(17, 18),
  upstream = 25L,
  downstream = 25L,
  PAM.size = 3L,
  gRNA.size = 20L,
  PAM = "NGG",
  PAM.pattern = "NNN$",
  max.mismatch = 6L,
  outputDir,
  allowed.mismatch.PAM = 2L,
  overwrite = TRUE,
  weights = c(0, 0, 0.014, 0, 0, 0.395, 0.317, 0, 0.389, 0.079, 0.445, 0.508, 0.613,
    0.851, 0.732, 0.828, 0.615, 0.804, 0.685, 0.583),
  orderOfftargetsBy = c("predicted_cleavage_score", "n.mismatch"),
  descending = TRUE,
  keepTopOfftargetsOnly = TRUE,
  scoring.method = c("Hsu-Zhang", "CFDscore"),
  subPAM.activity = hash(AA = 0, AC = 0, AG = 0.259259259, AT = 0, CA = 0, CC = 0, CG =
    0.107142857, CT = 0, GA = 0.069444444, GC = 0.022222222, GG = 1, GT = 0.016129032, TA
    = 0, TC = 0, TG = 0.038961039, TT = 0),
  subPAM.position = c(22, 23),
  PAM.location = "3prime",
  mismatch.activity.file = system.file("extdata",
    "NatureBiot2016SuppTable19DoenchRoot.csv", package = "CRISPRseek"),
  n.cores.max = 1
)

Arguments

gRNA

gRNA input file path or a DNAStringSet object that contains gRNA plus PAM sequences used for genome editing

peaks

peak input file path or a GenomicRanges object that contains genomic regions to be searched for potential offtargets

format

Formats of the gRNA and peak input files. Currently, fasta is supported for the gRNA input file and bed for the peak input file.

peaks.withHeader

Indicates whether the peak input file contains a header line, default FALSE

BSgenomeName

BSgenome object. Please refer to available.genomes in BSgenome package. For example, BSgenome.Hsapiens.UCSC.hg19 for hg19, BSgenome.Mmusculus.UCSC.mm10 for mm10, BSgenome.Celegans.UCSC.ce6 for ce6, BSgenome.Rnorvegicus.UCSC.rn5 for rn5, BSgenome.Drerio.UCSC.danRer7 for Zv9, and BSgenome.Dmelanogaster.UCSC.dm3 for dm3

overlap.gRNA.positions

The required overlap positions of gRNA and restriction enzyme cut site, default 17 and 18 for SpCas9.

upstream

upstream offset from the peak start to search for off targets, default 25

downstream

downstream offset from the peak end to search for off targets, default 25
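
As a rough illustration of how upstream and downstream widen the search window, the sketch below extends two hypothetical peaks by the 25 bp shown in the Usage section, using base R only. The data frame, its column names, and the clamping at position 1 are assumptions made for illustration; the function itself may treat strand and chromosome ends differently.

```r
# Hypothetical peaks (coordinates invented for illustration)
peaks_df <- data.frame(
    chr   = c("chr1", "chr2"),
    start = c(1000L, 10L),
    end   = c(1020L, 30L)
)
upstream   <- 25L
downstream <- 25L

# Extend each peak; clamp the start at 1 so the window stays on the chromosome
window_start <- pmax(peaks_df$start - upstream, 1L)
window_end   <- peaks_df$end + downstream

search_windows <- data.frame(chr = peaks_df$chr,
                             start = window_start,
                             end = window_end)
print(search_windows)
```

The second peak shows why clamping matters: its start would otherwise become negative.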

PAM.size

PAM length, default 3

gRNA.size

The size of the gRNA, default 20

PAM

PAM sequence after the gRNA, default NGG

PAM.pattern

Regular expression of the protospacer-adjacent motif (PAM), default NNN$, which matches any PAM. Set it to (NAG|NGG|NGA)$ to output only offtargets with a NAG, NGA or NGG PAM
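
To make the effect of a restrictive PAM.pattern concrete, the sketch below filters three hypothetical 23-nt sites with base R. Plain regular expressions treat N as a literal letter, so the sketch expands N to [ACGT] by hand; that expansion, the variable names, and the example sites are assumptions for illustration, not a description of how the function processes the pattern internally.

```r
# Hypothetical 23-nt candidate sites: 20-nt protospacer followed by a 3-nt PAM
sites <- c(
    "GACCCCCTCCACCCCGCCTCCGG",  # PAM CGG
    "GACCCCCTCCACCCCGCCTCTAG",  # PAM TAG
    "GACCCCCTCCACCCCGCCTCTTT"   # PAM TTT
)

# Expand the IUPAC code N into a character class so base R regex can use it
pam_pattern <- "(NAG|NGG|NGA)$"
pam_regex   <- gsub("N", "[ACGT]", pam_pattern, fixed = TRUE)

# TRUE for sites whose last three bases match one of the allowed PAMs
keep <- grepl(pam_regex, sites)
print(data.frame(site = sites, pam = substring(sites, 21, 23), keep = keep))
```

With the default NNN$ every site would be kept, since any three trailing bases match.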

max.mismatch

Maximum mismatch allowed in off target search, default 6

outputDir

the directory where the off target analysis and reports will be written to

allowed.mismatch.PAM

Number of degenerate bases in the PAM.pattern sequence, default 2

overwrite

overwrite the existing files in the output directory or not, default TRUE

weights

a numeric vector whose length equals the gRNA length, default c(0, 0, 0.014, 0, 0, 0.395, 0.317, 0, 0.389, 0.079, 0.445, 0.508, 0.613, 0.851, 0.732, 0.828, 0.615, 0.804, 0.685, 0.583) for the SpCas9 system, as used in Hsu et al., 2013, cited in the References section. Please make sure that the number of elements in this vector is the same as gRNA.size, e.g., by padding 0s at the beginning of the vector.
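
The padding advice can be sketched as follows. The 23-nt guide length is a hypothetical value chosen only to show the mechanics, and the sketch assumes gRNA.size is at least as large as the number of weights supplied.

```r
# Default 20 position weights from the Usage section (Hsu et al., 2013)
weights <- c(0, 0, 0.014, 0, 0, 0.395, 0.317, 0, 0.389, 0.079, 0.445, 0.508,
             0.613, 0.851, 0.732, 0.828, 0.615, 0.804, 0.685, 0.583)

gRNA.size <- 23L  # hypothetical longer guide, for illustration only

# Prepend zeros so the existing weights stay aligned with the end of the guide
padded <- c(rep(0, gRNA.size - length(weights)), weights)

# The vector passed as weights must have exactly gRNA.size elements
stopifnot(length(padded) == gRNA.size)
print(padded)
```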

orderOfftargetsBy

criteria to order the offtargets by; the top one will be kept if keepTopOfftargetsOnly is set to TRUE. If set to predicted_cleavage_score (descending order), the offtarget with the highest predicted cleavage score for each peak will be kept. If set to n.mismatch (ascending order), the offtarget with the smallest number of mismatches to the target sequence for each peak will be kept.

descending

No longer used. Whether to sort in descending or ascending order. Defaults to ordering by predicted cleavage score in descending order and by number of mismatches in ascending order. When altering the orderOfftargetsBy order, please also modify descending accordingly

keepTopOfftargetsOnly

Output all offtargets or the top offtarget per peak using the orderOfftargetsBy criteria, default to the top offtarget
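
The selection of one offtarget per peak can be pictured with the base R sketch below. The table, its values, and the peak column name are hypothetical; only predicted_cleavage_score and n.mismatch are names taken from this page, and the tie-breaking rule is an assumption rather than the package's documented behavior.

```r
# Hypothetical offtarget table; values and the peak column are illustrative only
offtargets <- data.frame(
    peak = c("p1", "p1", "p2", "p2"),
    predicted_cleavage_score = c(12.5, 80.1, 3.3, 3.3),
    n.mismatch = c(4L, 1L, 3L, 2L),
    stringsAsFactors = FALSE
)

# Sort by score (descending), then break ties by fewer mismatches (ascending)
ord <- order(-offtargets$predicted_cleavage_score, offtargets$n.mismatch)
sorted <- offtargets[ord, ]

# Keep the first (best) row for each peak
top <- sorted[!duplicated(sorted$peak), ]
print(top)
```

Setting keepTopOfftargetsOnly to FALSE corresponds to skipping the last step and keeping every row.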

scoring.method

Indicates which method to use for offtarget cleavage rate estimation, currently two methods are supported, Hsu-Zhang and CFDscore

subPAM.activity

Applicable only when scoring.method is set to CFDscore. A hash representing the cleavage rate of each alternative sub PAM sequence relative to the preferred PAM sequence

subPAM.position

Applicable only when scoring.method is set to CFDscore. The start and end positions of the sub PAM. Defaults to 22 and 23 for SpCas9 with a 20 bp gRNA and NGG as the preferred PAM
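
How subPAM.position and subPAM.activity relate can be sketched in base R. A named numeric vector stands in for the hash object here, and the example site is hypothetical; this shows only the lookup, not the full CFD score calculation.

```r
# Named vector standing in for the hash of sub PAM activities (values from Usage)
subPAM.activity <- c(AA = 0, AC = 0, AG = 0.259259259, AT = 0,
                     CA = 0, CC = 0, CG = 0.107142857, CT = 0,
                     GA = 0.069444444, GC = 0.022222222, GG = 1, GT = 0.016129032,
                     TA = 0, TC = 0, TG = 0.038961039, TT = 0)
subPAM.position <- c(22, 23)

# Hypothetical site: 20-nt protospacer followed by a TAG PAM (positions 21-23)
site <- "GACCCCCTCCACCCCGCCTCTAG"

# Positions 22-23 skip the N of NGG and pick out the two informative PAM bases
sub_pam  <- substr(site, subPAM.position[1], subPAM.position[2])
activity <- unname(subPAM.activity[sub_pam])
print(c(sub_pam = sub_pam, activity = activity))
```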

PAM.location

PAM location relative to gRNA. Defaults to 3prime, as for the SpCas9 PAM. Please set to 5prime for the Cpf1 PAM, since its PAM is located at the 5 prime end

mismatch.activity.file

Applicable only when scoring.method is set to CFDscore. A comma-separated (csv) file containing the cleavage rates for all possible types of single nucleotide mismatch at each position of the gRNA. By default, Supplementary Table 19 from Doench et al., Nature Biotechnology 2016 is used

n.cores.max

Maximum number of cores to use in multicore mode, i.e., parallel processing. Default 1, which disables multicore processing and is suitable for small datasets.

Value

a tab-delimited file offTargetsInPeakRegions.tsv, containing all input peaks with potential gRNA binding sites, mismatch number and positions, alignment to the input gRNA and predicted cleavage score.

Author(s)

Lihua Julie Zhu

References

Patrick D Hsu, David A Scott, Joshua A Weinstein, F Ann Ran, Silvana Konermann, Vineeta Agarwala, Yinqing Li, Eli J Fine, Xuebing Wu, Ophir Shalem, Thomas J Cradick, Luciano A Marraffini, Gang Bao & Feng Zhang (2013) DNA targeting specificity of RNA-guided Cas9 nucleases. Nature Biotechnology 31:827-834

Lihua Julie Zhu, Benjamin R. Holmes, Neil Aronin and Michael Brodsky. CRISPRseek: a Bioconductor package to identify target-specific guide RNAs for CRISPR-Cas9 genome-editing systems. Plos One Sept 23rd 2014

Lihua Julie Zhu (2015). Overview of guide RNA design tools for CRISPR-Cas9 genome editing technology. Frontiers in Biology August 2015, Volume 10, Issue 4, pp 289-296

See Also

GUIDEseq

Examples


#### the following example is also part of annotateOffTargets.Rd
if (interactive())
{
    library("BSgenome.Hsapiens.UCSC.hg19")
    library(GUIDEseq)
    peaks <- system.file("extdata", "T2plus100OffTargets.bed",
        package = "CRISPRseek")
    gRNAs <- system.file("extdata", "T2.fa",
        package = "CRISPRseek")
    outputDir <- getwd()
    offTargets <- offTargetAnalysisOfPeakRegions(gRNA = gRNAs, peaks = peaks,
        format=c("fasta", "bed"),
        peaks.withHeader = TRUE, BSgenomeName = Hsapiens,
        upstream = 25L, downstream = 25L, PAM.size = 3L, gRNA.size = 20L,
        orderOfftargetsBy = "predicted_cleavage_score",
        PAM = "NGG", PAM.pattern = "(NGG|NAG|NGA)$", max.mismatch = 2L,
        outputDir = outputDir,
        allowed.mismatch.PAM = 3, overwrite = TRUE
   )
}

LihuaJulieZhu/GUIDEseq documentation built on March 27, 2024, 9:42 p.m.