searchSeq-methods: searchSeq method

Description

Scans a nucleotide sequence with the pattern represented by a PWMatrix (or each PWMatrix in a PWMatrixList) and identifies putative transcription factor binding sites.

Usage

  searchSeq(x, subject, seqname="Unknown", strand="*", min.score="80%",
            mc.cores=1L)

Arguments

x

PWMatrix or PWMatrixList object.

subject

A DNAStringSet, DNAString, XStringViews or MaskedDNAString object that will be scanned.

seqname

The name of the target sequence. If subject is a DNAStringSet, the names of the DNAStringSet object are used instead.

strand

The strand to search: "+" for the positive strand, "-" for the negative strand. When strand is "*", both strands are searched and the results are reported in positive-strand coordinates.

min.score

The minimum score for a hit. It can be given as a character string in the format "80%" or as a single absolute value between 0 and 1. When given as a percentage, it represents the quantile between the minimal and the maximal possible score of the PWM.

mc.cores

integer(1): The number of cores to use. It is only used when x is a PWMatrixList object, and it is not available on Windows.
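The percentage form of min.score described above can be sketched in base R. This is only an illustration of the stated definition (a quantile between the minimal and maximal possible PWM score), not the package's internal code; the toy matrix and the helper name percent_to_cutoff are hypothetical.

```r
# Hypothetical toy PWM: rows are bases, columns are motif positions.
# matrix() fills column by column, so each line below is one position.
pwm <- matrix(c( 1.0, -1.0, -1.0, -2.0,
                -2.0,  2.0, -1.0, -1.0,
                -1.0, -1.0,  1.5, -2.0),
              nrow = 4,
              dimnames = list(c("A", "C", "G", "T"), NULL))

# Hypothetical helper (not part of TFBSTools): convert a string such as
# "80%" into an absolute score cutoff for the given PWM.
percent_to_cutoff <- function(pwm, min.score) {
  frac <- as.numeric(sub("%$", "", min.score)) / 100
  lowest  <- sum(apply(pwm, 2, min))  # worst base at every position
  highest <- sum(apply(pwm, 2, max))  # best base at every position
  lowest + frac * (highest - lowest)
}

cutoff <- percent_to_cutoff(pwm, "80%")
cutoff
```

Under this reading, "0%" corresponds to the lowest possible score of the matrix and "100%" to the highest, so only a perfect match would pass a "100%" threshold.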

Value

A SiteSet object is returned when x is a PWMatrix object. A SiteSetList object is returned when x is a PWMatrixList or subject is a DNAStringSet.
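The coordinate convention of the returned sites (negative-strand hits reported in positive-strand coordinates) can be illustrated with a small base-R sketch. It uses exact string matching in place of PWM scoring to keep the example short; the functions revcomp and find_sites and the toy sequences are hypothetical and are not part of TFBSTools.

```r
# Reverse complement of a DNA string (hypothetical helper).
revcomp <- function(s) {
  paste(rev(strsplit(chartr("ACGT", "TGCA", s), "")[[1]]), collapse = "")
}

# Hypothetical scanner: exact matches stand in for PWM scoring.
# A negative-strand site is found by matching the motif's reverse
# complement against the positive strand, so start/end are always
# positive-strand coordinates.
find_sites <- function(motif, subject) {
  w <- nchar(motif)
  starts <- seq_len(nchar(subject) - w + 1L)
  windows <- substring(subject, starts, starts + w - 1L)
  plus  <- starts[windows == motif]
  minus <- starts[windows == revcomp(motif)]
  data.frame(start  = c(plus, minus),
             end    = c(plus, minus) + w - 1L,
             strand = c(rep("+", length(plus)), rep("-", length(minus))),
             stringsAsFactors = FALSE)
}

hits <- find_sites("GCAAT", "TTGCAATGGATTGCCA")
hits
```

Both rows of hits give positions counted from the 5' end of the input sequence, regardless of which strand the site lies on.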

Author(s)

Ge Tan

References

Wasserman, W. W., & Sandelin, A. (2004). Applied bioinformatics for the identification of regulatory elements. Nature Reviews Genetics, 5(4), 276-287. doi:10.1038/nrg1315

See Also

searchAln, matchPWM

Examples

  library(TFBSTools)
  data(MA0003.2)
  data(MA0004.1)
  pwm1 <- toPWM(MA0003.2)
  pwm2 <- toPWM(MA0004.1)
  pwmList <- PWMatrixList(pwm1=pwm1, pwm2=pwm2)
  seq1 <- "GAATTCTCTCTTGTTGTAGCATTGCCTCAGGGCACACGTGCAAAATG"
  seq2 <- "GTTTCACCATTGCCTCAGGGCATAAATATATAAAAAAATATAATTTTCATC"
  
  # PWMatrix, character
  ## Only scan the positive strand of the input sequence
  siteset <- searchSeq(pwm1, seq1, seqname="seq1", strand="+", min.score="80%")
  siteset <- searchSeq(pwm1, seq1, seqname="seq1", strand="+", min.score=0.8)
  ## Only scan the negative strand of the input sequence
  siteset <- searchSeq(pwm1, seq1, seqname="seq1", strand="-", min.score="80%")
  ## Scan both strands of the input sequence
  siteset <- searchSeq(pwm1, seq1, seqname="seq1", strand="*", min.score="80%")
  ## Convert the SiteSet object into other R objects
  as(siteset, "data.frame")
  as(siteset, "DataFrame")
  as(siteset, "GRanges")
  writeGFF3(siteset)
  writeGFF2(siteset)
  
  # PWMatrixList, character
  sitesetList <- searchSeq(pwmList, seq1, seqname="seq1", strand="*", 
                           min.score="80%")
  sitesetList <- searchSeq(pwmList, seq1, seqname="seq1", strand="*", 
                           min.score="80%", mc.cores=1L)
  
  ## Convert the SiteSetList object into other R objects
  as(sitesetList, "data.frame")
  as(sitesetList, "DataFrame")
  as(sitesetList, "GRanges")
  writeGFF3(sitesetList)
  writeGFF2(sitesetList)

  # PWMatrix, DNAStringSet
  library(Biostrings)
  seqs <- DNAStringSet(c(seq1=seq1, seq2=seq2))
  sitesetList <- searchSeq(pwm1, seqs, min.score="80%")

  # PWMatrixList, DNAStringSet
  sitesetList <- searchSeq(pwmList, seqs, min.score="80%")

TFBSTools documentation built on Nov. 8, 2020, 8:14 p.m.