summarizePatternInPeaks: Output a summary of the occurrence and enrichment of each...

View source: R/summarizePatternInPeaks.R

summarizePatternInPeaks    R Documentation

Output a summary of the occurrence and enrichment of each pattern in the sequences.

Description

Output a summary of the occurrence and enrichment of each pattern in the sequences.

Usage

summarizePatternInPeaks(
  patternFilePath,
  format = "fasta",
  BSgenomeName,
  peaks,
  revcomp = TRUE,
  method = c("binom.test", "permutation.test"),
  expectFrequencyMethod = c("Markov", "Naive"),
  MarkovOrder = 3L,
  bgdForPerm = c("shuffle", "chromosome"),
  chromosome = c("asPeak", "random"),
  nperm = 1000,
  alpha = 0.05,
  ...
)

Arguments

patternFilePath

Character value. The path to the file that contains the pattern.

format

Character value. The format of the file containing the oligonucleotide pattern, either "fasta" (default) or "fastq".

BSgenomeName

A BSgenome object. Please refer to available.genomes in the BSgenome package for details.

peaks

A GRanges object containing the peaks.

revcomp

Logical value. If TRUE, also search for the reverse complement of the pattern. Default is TRUE.

method

Character value. Method for pattern enrichment test, 'binom.test' (default) or 'permutation.test'.

expectFrequencyMethod

Character value. Method for calculating the expected probability of pattern occurrence, 'Markov' (default) or 'Naive'.

MarkovOrder

Integer value. The order of Markov chain. Default is 3.

bgdForPerm

Character value. The method for obtaining the background sequence. 'shuffle' (default) obtains the background sequence by shuffling k-mers in the peak sequences, see '...'; 'chromosome' selects the background sequence from chromosomes, see the 'chromosome' parameter.

chromosome

Character value. Relevant only if bgdForPerm = 'chromosome'. 'asPeak' uses the same chromosomes as the peaks; 'random' uses all chromosomes at random. Default is 'asPeak'.

nperm

Integer value. The number of permutations for the permutation test. Default is 1000.

alpha

Numeric value. The significance level for the permutation test. Default is 0.05.

...

Additional parameters passed to the function shuffle_sequences.
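To make the 'Naive' option of expectFrequencyMethod more concrete, the base-R sketch below estimates the expected probability of a pattern as the product of single-base frequencies in a background sequence. It is an illustration under an independence assumption, not the package's implementation; the helper naive_expected_prob and the toy sequence are hypothetical, and only the bases A, C, G and T are handled.

```r
# Hypothetical helper (not part of ChIPpeakAnno): expected probability of
# `pattern` at a single position, assuming independent bases drawn with the
# frequencies observed in the background sequence `bg`.
naive_expected_prob <- function(pattern, bg) {
  bg_bases <- strsplit(toupper(bg), "")[[1]]
  counts <- table(factor(bg_bases, levels = c("A", "C", "G", "T")))
  # Per-base frequencies, named by base so they can be looked up by letter.
  freq <- setNames(as.numeric(counts) / sum(counts), names(counts))
  pat_bases <- strsplit(toupper(pattern), "")[[1]]
  prod(freq[pat_bases])
}

bg <- "AACCGGTTAACCGGTT"  # toy background with equal base frequencies
p <- naive_expected_prob("ACGT", bg)
print(p)
```

A Markov model of order k would instead condition each base on the k preceding bases, which is why MarkovOrder only matters for the 'Markov' option.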

Details

Please see shuffle_sequences for more information about the 'shuffle' method.
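As a rough picture of how a shuffle-based permutation test can work, the base-R sketch below shuffles a toy peak sequence, recomputes the pattern rate for each shuffle, and takes an empirical cut-off. Several things here are assumptions rather than documented behaviour: the cut-off is taken as the (1 - alpha) quantile of the shuffled rates, single bases are shuffled with sample() instead of calling shuffle_sequences, and the helpers count_pattern and shuffle_bases are hypothetical.

```r
set.seed(1)  # make the shuffles reproducible

# Hypothetical helper: count (possibly overlapping) occurrences of `pattern`.
count_pattern <- function(pattern, seq) {
  hits <- gregexpr(paste0("(?=", pattern, ")"), seq, perl = TRUE)[[1]]
  if (hits[1] == -1) 0L else length(hits)
}

# Hypothetical helper: shuffle single bases to build one background sequence.
shuffle_bases <- function(seq) {
  paste(sample(strsplit(seq, "")[[1]]), collapse = "")
}

peak_seq <- "ACGTACGTTTGACGTAACGT"  # toy peak sequence
pattern  <- "ACGT"
nperm    <- 200
alpha    <- 0.05

# Rate = matches divided by the number of windows of the pattern's length.
n_windows <- nchar(peak_seq) - nchar(pattern) + 1
observed_rate <- count_pattern(pattern, peak_seq) / n_windows

# Pattern rates in shuffled backgrounds, and the assumed empirical cut-off.
perm_rates <- replicate(
  nperm,
  count_pattern(pattern, shuffle_bases(peak_seq)) / n_windows
)
cutoff <- unname(quantile(perm_rates, probs = 1 - alpha))

print(c(observed = observed_rate, cutoff = cutoff))
```

Under this reading, a pattern would be called enriched when its observed rate exceeds the cut-off.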

Value

A list including two data frames named 'motif_enrichment' and 'motif_occurrence'. The 'motif_enrichment' has six columns:

  • "patternNum": number of matched patterns

  • "totalNumPatternWithSameLen": total number of patterns with the same length

  • "expectedRate": expected rate of the pattern for the 'binom.test' method

  • "patternRate": real rate of the pattern for the 'permutation.test' method

  • "pValueBinomTest": p value of the binomial test for the 'binom.test' method

  • "cutOffPermutationTest": cut-off of the permutation test for the 'permutation.test' method
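The columns above suggest how the binomial test might be assembled. The sketch below uses binom.test from base R with toy numbers standing in for "patternNum", "totalNumPatternWithSameLen" and "expectedRate". The one-sided "greater" alternative is an assumption made for illustration; the function may configure the test differently.

```r
# Toy values, not taken from any real run of summarizePatternInPeaks.
pattern_num        <- 12L    # stands in for "patternNum"
total_num_same_len <- 2000L  # stands in for "totalNumPatternWithSameLen"
expected_rate      <- 0.002  # stands in for "expectedRate"

# Test whether the pattern occurs more often than expected by chance.
# Assumption: a one-sided test for enrichment.
bt <- binom.test(
  x = pattern_num,
  n = total_num_same_len,
  p = expected_rate,
  alternative = "greater"
)
p_value <- bt$p.value
print(p_value)
```

With these toy numbers the expected count is 4, so observing 12 matches gives a small p value.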

The 'motif_occurrence' has 14 columns:

  • "motifChr": Chromosome of motif

  • "motifStartInChr": motif start position in chromosome

  • "motifEndInChr": motif end position in chromosome

  • "motifName": motif name

  • "motifPattern": motif pattern

  • "motifStartInPeak": motif start position in peak

  • "motifEndInPeak": motif end position in peak

  • "motifFound": specific motif found in peak

  • "motifFoundStrand": strand of the specific motif found in peak; "-" means the reverse complement of the motif was found in the peak

  • "peakChr": Chromosome of peak

  • "peakStart": peak start position

  • "peakEnd": peak end position

  • "peakWidth": peak width

  • "peakStrand": peak strand

Author(s)

Lihua Julie Zhu, Junhui Li, Kai Hu

Examples

                            
library(ChIPpeakAnno)
library(BSgenome.Hsapiens.UCSC.hg19)
# Example pattern file shipped with ChIPpeakAnno
filepath <- system.file("extdata", "examplePattern.fa", 
                        package = "ChIPpeakAnno")
# Four example peaks on hg19
peaks <- GRanges(seqnames = c("chr17", "chr3", "chr12", "chr8"),
                 IRanges(start = c(41275784, 10076141, 4654135, 31024288),
                         end = c(41276382, 10076732, 4654728, 31024996),
                         names = paste0("peak", 1:4)))
# Run with the default settings for all other arguments
result <- summarizePatternInPeaks(patternFilePath = filepath, peaks = peaks,
                                  BSgenomeName = Hsapiens)


jianhong/ChIPpeakAnno documentation built on Nov. 1, 2024, 8:55 a.m.