call_homer: Motif Enrichment with Homer
In MalteThodberg/homeR: Run Homer from inside R


Call the findMotifsGenome.pl script from Homer directly from R.

call_homer(pos_file, genome, output_dir = tempdir(), mask = NULL,
  bg = NULL, chopify = NULL, len = NULL, size = NULL, S = NULL,
  mis = NULL, norevopp = NULL, rna = NULL, mset = NULL, bits = NULL,
  mcheck = NULL, mknown = NULL, gc = NULL, cpg = NULL,
  noweight = NULL, h = NULL, N = NULL, local = NULL, redundant = NULL,
  maxN = NULL, maskMotif = NULL, rand = NULL, ref = NULL,
  oligo = NULL, dumpFasta = NULL, preparse = NULL, preparsedDir = NULL,
  keepFiles = NULL, fdr = NULL, nlen = NULL, nmax = NULL,
  neutral = NULL, olen = NULL, p = NULL, e = NULL, cache = NULL,
  quickMask = NULL, minlp = NULL)

`pos_file`	(Genomic Ranges object)
`genome`	(name of an installed Homer genome) or (path to a FASTA file)
`output_dir`	Path to output dir for Homer analysis. Defaults to tempdir()
`mask`	(mask repeats/lower case sequence, can also add 'r' to genome, i.e. mm9r)
`bg`	<background position file> (genomic positions to be used as background, default=automatic) removes background positions overlapping with target positions
`chopify`	(chop up large background regions to the avg size of target regions)
`len`	<#>[,<#>,<#>...] (motif length, default=8,10,12) [NOTE: values greater than 12 may cause the program to run out of memory - in these cases decrease the number of sequences analyzed (-N), or try analyzing shorter sequence regions (e.g. -size 100)]
`size`	<#> (fragment size to use for motif finding, default=200) or <#,#> (e.g. -size -100,50 will get sequences from -100 to +50 relative to the center) or given (uses the exact regions you give it)
`S`	<#> (Number of motifs to optimize, default: 25)
`mis`	<#> (global optimization: searches for strings with # mismatches, default: 2)
`norevopp`	(don't search reverse strand for motifs)
`rna`	(output RNA motif logos and compare to RNA motif database, automatically sets -norevopp)
`mset`	<vertebrates|insects|worms|plants|yeast|all> (check against motif collections, default: auto)
`bits`	(scale sequence logos by information content, default: doesn't scale)
`mcheck`	<motif file> (known motifs to check against de novo motifs)
`mknown`	<motif file> (known motifs to check for enrichment)
`gc`	(use GC-percentage for sequence content normalization, now the default)
`cpg`	(use CpG-percentage instead of GC-percentage for sequence content normalization)
`noweight`	(no CG correction)
`h`	(use hypergeometric for p-values, binomial is default)
`N`	<#> (Number of sequences to use for motif finding, default=max(50k, 2x input))
`local`	<#> (use local background, # of equal size regions around peaks to use i.e. 2)
`redundant`	<#> (Remove redundant sequences matching greater than # percent, i.e. -redundant 0.5)
`maxN`	<#> (maximum percentage of N's in sequence to consider for motif finding, default: 0.7)
`maskMotif`	<motif file1> [motif file 2]... (motifs to mask before motif finding)
`rand`	(randomize target and background sequences labels)
`ref`	<peak file> (use file for target and background - first argument is list of peak ids for targets)
`oligo`	(perform analysis of individual oligo enrichment)
`dumpFasta`	(Dump fasta files for target and background sequences for use with other programs)
`preparse`	(force new background files to be created)
`preparsedDir`	<directory> (location to search for preparsed file and/or place new files)
`keepFiles`	(keep temporary files)
`fdr`	<#> (Calculate empirical FDR for de novo discovery #=number of randomizations)
`nlen`	<#> (length of lower-order oligos to normalize in background, default: -nlen 3)
`nmax`	<#> (Max normalization iterations, default: 160)
`neutral`	(weight sequences to neutral frequencies, i.e. 25%, 6.25%, etc.)
`olen`	<#> (lower-order oligo normalization for oligo table, use if -nlen isn't working well)
`p`	<#> (Number of processors to use, default: 1)
`e`	<#> (Maximum expected motif instance per bp in random sequence, default: 0.01)
`cache`	<#> (size in MB for statistics cache, default: 500)
`quickMask`	(skip full masking after finding motifs, similar to original homer)
`minlp`	<#> (stop looking for motifs when seed logp score gets above #, default: -10)

Simple R wrapper for Homer's findMotifsGenome.pl. Instead of command-line flags, it takes R arguments, which are pasted together into a Homer command. Flags that modify the output format are not implemented: -nomotif, -find, -enhancers, -enhancersOnly, -basic, -nocheck, -noknown, -nofacts, -opt, -peaks, -homer2.
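
The idea of pasting R arguments into a Homer command can be illustrated with a minimal sketch. This is not the package's actual implementation: `build_flags` is a hypothetical helper, the file names are placeholders, and it assumes that NULL or FALSE means "flag not set", TRUE means a switch without a value, and vectors are joined with commas.

```r
# Hypothetical helper (not part of homeR): turn a named list of R arguments
# into Homer-style command-line flags.
build_flags <- function(args) {
  # Drop arguments left at NULL (or set to FALSE), i.e. flags that are not set
  keep <- function(x) !is.null(x) && !identical(x, FALSE)
  args <- Filter(keep, args)

  flags <- vapply(names(args), function(name) {
    value <- args[[name]]
    if (isTRUE(value)) {
      # Switch-style flag without a value, e.g. -mask
      paste0("-", name)
    } else {
      # Flag with a value; vectors become comma-separated, e.g. -len 8,10
      paste0("-", name, " ", paste(value, collapse = ","))
    }
  }, character(1))

  paste(flags, collapse = " ")
}

# Placeholder position file, genome and output directory
cmd <- paste("findMotifsGenome.pl peaks.bed hg19 out_dir",
             build_flags(list(mask = TRUE, len = c(8, 10), size = 200, bg = NULL)))
print(cmd)
# "findMotifsGenome.pl peaks.bed hg19 out_dir -mask -len 8,10 -size 200"
```

Because unset arguments default to NULL and are dropped, only the flags the user explicitly sets end up in the command, and Homer's own defaults apply to everything else.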

Saves all temporary files to output_dir. Note that these files are only deleted when the R session is closed, which can in some cases lead to files from previous runs being reloaded.
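
One possible way to avoid picking up files from an earlier run is to give each run its own empty directory. The sketch below uses only base R; the directory prefix is arbitrary, and the commented-out call with `peaks` and "hg19" is a placeholder, not a tested invocation.

```r
# Create a unique, empty directory for this run inside the session temp dir
run_dir <- tempfile(pattern = "homer_run_")
dir.create(run_dir)

# The directory starts empty, so no files from earlier runs can be picked up
stopifnot(dir.exists(run_dir), length(list.files(run_dir)) == 0)

# It could then be passed on, for example:
# res <- call_homer(peaks, "hg19", output_dir = run_dir)

# Optional clean-up once the results have been read into R
unlink(run_dir, recursive = TRUE)
```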

List with output: command line used, known motifs, Homer motifs (de novo) and Homer PWMs.

Malte Thodberg

GR_to_BED, tempdir

MalteThodberg/homeR documentation built on May 7, 2019, 2:09 p.m.