Call the findMotifsGenome.pl script from Homer directly from R.
call_homer(pos_file, genome, output_dir = tempdir(), mask = NULL,
bg = NULL, chopify = NULL, len = NULL, size = NULL, S = NULL,
mis = NULL, norevopp = NULL, rna = NULL, mset = NULL, bits = NULL,
mcheck = NULL, mknown = NULL, gc = NULL, cpg = NULL,
noweight = NULL, h = NULL, N = NULL, local = NULL, redundant = NULL,
maxN = NULL, maskMotif = NULL, rand = NULL, ref = NULL,
oligo = NULL, dumpFasta = NULL, preparse = NULL, preparsedDir = NULL,
keepFiles = NULL, fdr = NULL, nlen = NULL, nmax = NULL,
neutral = NULL, olen = NULL, p = NULL, e = NULL, cache = NULL,
quickMask = NULL, minlp = NULL)
pos_file |
<#> (Genomic Ranges object) |
genome |
<#> (Installed Homer genome) or (path to FASTA) |
output_dir |
Path to output dir for Homer analysis. Defaults to tempdir() |
mask |
(mask repeats/lower case sequence, can also add 'r' to genome, i.e. mm9r) |
bg |
<background position file> (genomic positions to be used as background, default=automatic) removes background positions overlapping with target positions |
chopify |
(chop up large background regions to the avg size of target regions) |
len |
<#>[,<#>,<#>...] (motif length, default=8,10,12) [NOTE: values greater than 12 may cause the program to run out of memory - in these cases decrease the number of sequences analyzed (-N), or try analyzing shorter sequence regions (i.e. -size 100)] |
size |
<#> (fragment size to use for motif finding, default=200) or (i.e. -size -100,50 will get sequences from -100 to +50 relative from center) or given (uses the exact regions you give it) |
S |
<#> (Number of motifs to optimize, default: 25) |
mis |
<#> (global optimization: searches for strings with # mismatches, default: 2) |
norevopp |
(don't search reverse strand for motifs) |
rna |
(output RNA motif logos and compare to RNA motif database, automatically sets -norevopp) |
mset |
<vertebrates|insects|worms|plants|yeast|all> (check against motif collections, default: auto) |
bits |
(scale sequence logos by information content, default: doesn't scale) |
mcheck |
<motif file> (known motifs to check against de novo motifs) |
mknown |
<motif file> (known motifs to check for enrichment) |
gc |
(use GC-percentage for sequence content normalization, now the default) |
cpg |
(use CpG-percentage instead of GC-percentage for sequence content normalization) |
noweight |
(no CG correction) |
h |
(use hypergeometric for p-values, binomial is default) |
N |
<#> (Number of sequences to use for motif finding, default=max(50k, 2x input)) |
local |
<#> (use local background, # of equal size regions around peaks to use i.e. 2) |
redundant |
<#> (Remove redundant sequences matching greater than # percent, i.e. -redundant 0.5) |
maxN |
<#> (maximum percentage of N's in sequence to consider for motif finding, default: 0.7) |
maskMotif |
<motif file1> [motif file 2]... (motifs to mask before motif finding) |
rand |
(randomize target and background sequences labels) |
ref |
<peak file> (use file for target and background - first argument is list of peak ids for targets) |
oligo |
(perform analysis of individual oligo enrichment) |
dumpFasta |
(Dump fasta files for target and background sequences for use with other programs) |
preparse |
(force new background files to be created) |
preparsedDir |
<directory> (location to search for preparsed file and/or place new files) |
keepFiles |
(keep temporary files) |
fdr |
<#> (Calculate empirical FDR for de novo discovery #=number of randomizations) |
nlen |
<#> (length of lower-order oligos to normalize in background, default: -nlen 3) |
nmax |
<#> (Max normalization iterations, default: 160) |
neutral |
(weight sequences to neutral frequencies, i.e. 25-percentage, 6.25-percentage, etc.) |
olen |
<#> (lower-order oligo normalization for oligo table, use if -nlen isn't working well) |
p |
<#> (Number of processors to use, default: 1) |
e |
<#> (Maximum expected motif instance per bp in random sequence, default: 0.01) |
cache |
<#> (size in MB for statistics cache, default: 500) |
quickMask |
(skip full masking after finding motifs, similar to original homer) |
minlp |
<#> (stop looking for motifs when seed logp score gets above #, default: -10) |
Simple R-wrapper for Homer's findMotifsGenome.pl. Instead of flags, it uses R-arguments which are pasted to a Homer command. Flags that modify output format are not implemented: -nomotif, -find, -enhancers, -enhancersOnly, -basic, -nocheck, -noknown, -nofacts, -opt, -peaks, -homer2.
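The idea of pasting R arguments into a Homer command can be illustrated with a minimal sketch. The helper `build_homer_flags()` is hypothetical and is not part of this package; the convention that `TRUE` marks a value-less switch and that `NULL` or `FALSE` omits a flag is an assumption made for illustration, and the actual wrapper may handle these cases differently. The file names `peaks.bed`, `hg19` and `homer_out` are placeholders.

```r
# Hypothetical helper (not part of the package): turn a named list of R
# arguments into findMotifsGenome.pl-style flags.
build_homer_flags <- function(args) {
  # Assumption: NULL or FALSE means "flag not set", so drop those entries.
  args <- Filter(function(a) !is.null(a) && !identical(a, FALSE), args)
  if (length(args) == 0) return("")
  flags <- vapply(names(args), function(nm) {
    val <- args[[nm]]
    if (isTRUE(val)) {
      paste0("-", nm)                                   # switch, e.g. -mask
    } else {
      paste0("-", nm, " ", paste(val, collapse = ","))  # flag with value(s)
    }
  }, character(1))
  paste(flags, collapse = " ")
}

flags <- build_homer_flags(
  list(size = 200, len = c(8, 10, 12), mask = TRUE, bg = NULL)
)

# Positional arguments come first: position file, genome, output directory.
cmd <- paste("findMotifsGenome.pl", "peaks.bed", "hg19", "homer_out", flags)
print(cmd)
```

The sketch only builds the command string and does not run Homer, so it can be tried without a Homer installation.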
Saves all temporary files to output_dir. With the default output_dir = tempdir(), these files are only deleted when the R session ends, so files from a previous run can in some cases be reloaded.
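One way to reduce the risk of picking up files from an earlier run is to give each run its own empty directory. The sketch below uses only base R; the helper `new_run_dir()` is hypothetical and not part of this package, and whether it is needed depends on how the wrapper reads its output.

```r
# Hypothetical helper (not part of the package): create a unique, empty
# sub-directory of tempdir() to use for a single run.
new_run_dir <- function(prefix = "homer_run_") {
  run_dir <- tempfile(pattern = prefix)  # unique path under tempdir()
  dir.create(run_dir)
  run_dir
}

run_dir <- new_run_dir()

# The directory could then be passed on, for example:
#   call_homer(pos_file, genome, output_dir = run_dir)
# A freshly created directory contains no files from previous runs.
stopifnot(length(list.files(run_dir)) == 0)

# Optionally remove the directory once the results have been read into R.
unlink(run_dir, recursive = TRUE)
```

Because the directory sits under tempdir(), it is still removed at the end of the session if the explicit `unlink()` call is skipped.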
List with output: command line used, known motifs, Homer motifs (de-novo) and Homer PWMs.
Malte Thodberg