searchHits: Search for off targets in a sequence as DNAString

Description

Search for off targets for given gRNAs, sequence and maximum mismatches

Usage

searchHits(gRNAs, seqs, seqname, max.mismatch = 3, PAM.size = 3,
                 gRNA.size = 20, PAM = "NGG", PAM.pattern = "NNN$",
                 allowed.mismatch.PAM = 2, PAM.location = "3prime",
                 outfile,
                 baseEditing = FALSE, targetBase = "C", editingWindow = 4:8)

Arguments

gRNAs

DNAStringSet object containing a set of gRNAs. Note that each sequence must include the PAM appended to the gRNA, e.g., ATCGAAATTCGAGCCAATCCCGG, where ATCGAAATTCGAGCCAATCC is the gRNA and CGG is the PAM.

seqs

DNAString object containing a DNA sequence.

seqname

Specify the name of the sequence

max.mismatch

Maximum number of mismatches allowed in the off target search, default 3. Warning: the search will be considerably slower if this is set to greater than 3.

PAM.size

Size of PAM, default 3

gRNA.size

Size of gRNA, default 20

PAM

PAM as a regular expression for appending to the gRNA, default NGG for spCas9. Change to TTTN for cpf1.

PAM.pattern

Regular expression of the PAM. The Usage section shows NNN$ as the default value; N[A|G]G$ is the pattern for spCas9. For cpf1, use ^TTTN since it is a 5 prime PAM sequence.

allowed.mismatch.PAM

Maximum number of mismatches to the PAM sequence allowed in the off targets, default 2 for the NGG PAM.

PAM.location

PAM location relative to the gRNA, default 3prime. For example, the spCas9 PAM is located at the 3 prime end while the cpf1 PAM is located at the 5 prime end.

outfile

File path to temporarily store the search results

baseEditing

Indicate whether to design gRNAs for base editing, default FALSE. If TRUE, please set targetBase and editingWindow accordingly.

targetBase

Applicable only when baseEditing is set to TRUE. It is used to indicate the target base for base editing systems, default to C for converting C to T in the CBE system. Please change it to A if you intend to use the ABE system.

editingWindow

Applicable only when baseEditing is set to TRUE. It is used to indicate the effective editing window to consider for the off target search only, default 4 to 8, which is for the original CBE system. Please change it accordingly if the system you use has a different editing window, or if you would like to include off targets with the target base in a larger editing window.
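
The conventions described in the arguments above can be illustrated with a short base R sketch. It is not the package's implementation: expand_N and has_target_base are hypothetical helpers introduced only for this illustration, and the assumption that N stands for any base and must be expanded before regular expression matching is ours, not something stated on this page.

```r
# Example sequence from the gRNAs argument: a 20-nt gRNA followed by a 3-nt PAM.
gRNA.plus.PAM <- "ATCGAAATTCGAGCCAATCCCGG"
gRNA.size <- 20
PAM.size <- 3

# With PAM.location = "3prime", the PAM is the last PAM.size bases.
gRNA <- substr(gRNA.plus.PAM, 1, gRNA.size)
pam <- substr(gRNA.plus.PAM, gRNA.size + 1, gRNA.size + PAM.size)

# Hypothetical helper (assumption): expand the IUPAC code N into a
# character class so the pattern can be used with grepl().
expand_N <- function(pattern) gsub("N", "[ACGT]", pattern, fixed = TRUE)

# A 3 prime pattern is anchored at the end, a 5 prime pattern at the start.
pam.ok.3prime <- grepl(expand_N("N[A|G]G$"), gRNA.plus.PAM)
pam.ok.5prime <- grepl(expand_N("^TTTN"), paste0("TTTA", gRNA))

# Hypothetical helper (assumption): does the gRNA carry the target base
# at any position inside the editing window?
has_target_base <- function(gRNA, targetBase = "C", editingWindow = 4:8) {
    bases <- strsplit(gRNA, "")[[1]]
    any(bases[editingWindow] == targetBase)
}

in.default.window <- has_target_base(gRNA)                       # C in positions 4 to 8?
in.larger.window <- has_target_base(gRNA, editingWindow = 3:8)   # widened window
in.ABE.window <- has_target_base(gRNA, targetBase = "A")         # ABE system
```

For this example sequence, positions 4 to 8 contain no C, so widening the window to 3:8 or switching targetBase to A changes the result of the check.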

Value

A data frame containing IsMismatch.posX (indicator variables for whether position X is a mismatch, 1 means yes and 0 means no, with X = 1 to gRNA.size covering all positions in the gRNA), strand (strand of the match, + for plus and - for minus strand), chrom (chromosome of the off target), chromStart (start position of the off target), chromEnd (end position of the off target), name (gRNA name), gRNAPlusPAM (gRNA sequence with PAM sequence concatenated), OffTargetSequence (the genomic sequence of the off target), n.mismatch (number of mismatches between the off target and the gRNA), forViewInUCSC (string for viewing in the UCSC genome browser, e.g., chr14:31665685-31665707), and score (set to 100, to be updated in getOfftargetScore).
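
To make the column layout concrete, the following base R sketch builds one row with the mismatch indicators, n.mismatch and forViewInUCSC for a toy gRNA and off target pair. The off target sequence is made up for illustration, the coordinates are borrowed from the forViewInUCSC example above, and the code only mimics the described columns; it is not how searchHits computes them.

```r
gRNA <- "ATCGAAATTCGAGCCAATCC"
offTarget <- "ATCGTAATTCGAGCCAATGC"   # made up: differs at positions 5 and 19
gRNA.size <- nchar(gRNA)

# One 0/1 indicator per gRNA position, named IsMismatch.pos1 ... IsMismatch.pos20.
is.mismatch <- as.integer(strsplit(gRNA, "")[[1]] != strsplit(offTarget, "")[[1]])
names(is.mismatch) <- paste0("IsMismatch.pos", seq_len(gRNA.size))

# Assemble a single-row data frame with a subset of the described columns.
hit <- data.frame(as.list(is.mismatch),
    strand = "+",
    chrom = "chr14",
    chromStart = 31665685L,
    chromEnd = 31665707L,
    n.mismatch = sum(is.mismatch),
    stringsAsFactors = FALSE)

# UCSC-style region string: chrom:chromStart-chromEnd.
hit$forViewInUCSC <- paste0(hit$chrom, ":", hit$chromStart, "-", hit$chromEnd)
```

Note that the example region spans 23 bases (chromEnd - chromStart + 1), which matches gRNA.size plus PAM.size for the default settings.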

Author(s)

Lihua Julie Zhu

See Also

offTargetAnalysis

Examples

 if(interactive())
 {
    library(CRISPRseek)
    library("BSgenome.Hsapiens.UCSC.hg19")

    all.gRNAs <- findgRNAs(inputFilePath = 
        system.file("extdata", "inputseq.fa", package = "CRISPRseek"),
        pairOutputFile = "pairedgRNAs.xls",
        findPairedgRNAOnly = TRUE)

    ### searchHits takes a single DNAString (seqs) and its name (seqname),
    ### as listed in Usage, so extract chrX from the BSgenome object first
    chrX <- Hsapiens[["chrX"]]

    ### for speed reasons, use max.mismatch = 0 for finding all targets with 
    ### all variants of PAM
    hits <- searchHits(all.gRNAs[1], seqs = chrX, seqname = "chrX",
        max.mismatch = 0, outfile = tempfile())
    colnames(hits)

    ### test PAM located at 5 prime
    all.gRNAs <- findgRNAs(inputFilePath = 
        system.file("extdata", "inputseq.fa", package = "CRISPRseek"),
        pairOutputFile = "pairedgRNAs.xls",
        findPairedgRNAOnly = FALSE,
        PAM = "TGT", PAM.location = "5prime")

    ### again use max.mismatch = 0 for speed
    hits <- searchHits(all.gRNAs[1], seqs = chrX, seqname = "chrX",
        PAM.size = 3, max.mismatch = 0, PAM.location = "5prime",
        PAM = "^T[A|G]N", allowed.mismatch.PAM = 2, outfile = tempfile())
    colnames(hits)
}

Example output

Loading required package: BiocGenerics
Loading required package: parallel

Attaching package: 'BiocGenerics'

The following objects are masked from 'package:parallel':

    clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
    clusterExport, clusterMap, parApply, parCapply, parLapply,
    parLapplyLB, parRapply, parSapply, parSapplyLB

The following objects are masked from 'package:stats':

    IQR, mad, sd, var, xtabs

The following objects are masked from 'package:base':

    Filter, Find, Map, Position, Reduce, anyDuplicated, append,
    as.data.frame, basename, cbind, colMeans, colSums, colnames,
    dirname, do.call, duplicated, eval, evalq, get, grep, grepl,
    intersect, is.unsorted, lapply, lengths, mapply, match, mget,
    order, paste, pmax, pmax.int, pmin, pmin.int, rank, rbind,
    rowMeans, rowSums, rownames, sapply, setdiff, sort, table, tapply,
    union, unique, unsplit, which, which.max, which.min

Loading required package: Biostrings
Loading required package: S4Vectors
Loading required package: stats4

Attaching package: 'S4Vectors'

The following object is masked from 'package:base':

    expand.grid

Loading required package: IRanges
Loading required package: XVector

Attaching package: 'Biostrings'

The following object is masked from 'package:base':

    strsplit

Warning messages:
1: multiple methods tables found for 'organism' 
2: multiple methods tables found for 'species' 
3: multiple methods tables found for 'seqinfo' 
4: multiple methods tables found for 'seqinfo<-' 
5: multiple methods tables found for 'seqnames' 
6: replacing previous import 'BiocGenerics::organism' by 'BSgenome::organism' when loading 'CRISPRseek' 
7: replacing previous import 'BiocGenerics::species' by 'BSgenome::species' when loading 'CRISPRseek' 

CRISPRseek documentation built on Jan. 14, 2021, 2:50 a.m.