runTomTom: Run TomTom on target motifs

View source: R/tomtom.R

runTomTom    R Documentation

Run TomTom on target motifs

Description

TomTom compares input motifs to a database of known, user-provided motifs to identify matches.

Usage

runTomTom(
  input,
  database = NULL,
  outdir = "auto",
  thresh = 10,
  min_overlap = 5,
  dist = "ed",
  evalue = TRUE,
  silent = TRUE,
  meme_path = NULL,
  ...
)

Arguments

input

path to .meme format file of motifs, a list of universalmotifs, or a universalmotif data.frame object (such as the output of runDreme())

database

path to .meme format file to use as reference database (or list of universalmotifs). NOTE: p-value estimates are inaccurate when the database has fewer than 50 entries.

outdir

directory to store tomtom results (will be overwritten if it exists). Default: location of the input file, or a temporary location if using universalmotif input.

thresh

report matches with a significance value less than or equal to this value. If evalue = TRUE (default), this sets an E-value threshold (default = 10). If evalue = FALSE, this sets a q-value threshold between 0 and 1 (default = 0.5).

min_overlap

only report matches that overlap by this value or more, unless input motif is shorter, in which case the shorter length is used as the minimum value

dist

distance metric. Valid arguments: allr | ed | kullback | pearson | sandelin | blic1 | blic5 | llr1 | llr5. Default: ed (euclidean distance).

evalue

whether to use E-value as significance threshold (default: TRUE). If evalue = FALSE, uses q-value instead.

silent

suppress printing stderr to console (default: TRUE).

meme_path

path to "meme/bin/" (optional). If unset, will check R environment variable "MEME_DB" (set in .Renviron), or option "meme_db" (set with options(meme_db = "path/to/meme/bin"))

...

additional flags passed to tomtom using cmdfun formatting (see table below for details)
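
The rules stated above for min_overlap and thresh can be sketched in base R. This is only an illustration of the documented behaviour, not code from memes or TomTom: the helper names effective_min_overlap() and check_thresh() are hypothetical, and treating the 0 to 1 range as inclusive is an assumption.

```r
# Hypothetical helpers (not part of memes or TomTom) mirroring the documented rules.

# Overlap required for a match: min_overlap, capped at the query motif width.
effective_min_overlap <- function(min_overlap, query_width) {
  min(min_overlap, query_width)
}

# Check a threshold against the documented ranges.
# Assumption: with evalue = FALSE the q-value range is treated as [0, 1].
check_thresh <- function(thresh, evalue = TRUE) {
  stopifnot(is.numeric(thresh), length(thresh) == 1, thresh >= 0)
  if (!evalue && thresh > 1) {
    stop("with evalue = FALSE, thresh is a q-value and must lie between 0 and 1")
  }
  invisible(thresh)
}

effective_min_overlap(5, query_width = 7)  # query longer than min_overlap
effective_min_overlap(5, query_width = 3)  # query shorter: its width is used
check_thresh(10, evalue = TRUE)            # E-value threshold, may exceed 1
check_thresh(0.5, evalue = FALSE)          # q-value threshold
```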

Details

runTomTom will rank matches by significance and return a best match motif for each input (whose properties are stored in the best_match_* columns) as well as a ranked list of all possible matches stored in the tomtom list column.
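
As a minimal sketch of how the two kinds of output might be inspected, the snippet below reuses the motif and database from the Examples section. It assumes the result behaves like an ordinary data.frame; the can_run guard is a hypothetical convenience so the code is skipped when memes or the MEME Suite is unavailable, and the best_match_* columns are found by pattern rather than by assumed names.

```r
# Guard (hypothetical convenience): only run when memes and the MEME Suite are available.
can_run <- requireNamespace("memes", quietly = TRUE) && memes::meme_is_installed()

if (can_run) {
  motif <- universalmotif::create_motif("CCRAAAW")
  database <- system.file("extdata", "flyFactorSurvey_cleaned.meme", package = "memes")

  res <- memes::runTomTom(motif, database)

  # Columns describing the best match for each input motif
  best_cols <- grep("^best_match_", names(res), value = TRUE)
  print(best_cols)

  # Ranked list of all reported matches for the first input motif
  all_matches <- res$tomtom[[1]]
  print(head(all_matches))
}
```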

Additional arguments

runTomTom() can accept all valid tomtom arguments passed to ... as described in the tomtom command-line reference. For convenience, below is a table of valid arguments, their default values, and their descriptions.

| TomTom Flag | allowed values | default | description |
|---|---|---|---|
| bfile | file path | NULL | path to background model for converting frequency matrix to log-odds score (not used when dist is set to "ed", "kullback", "pearson", or "sandelin") |
| motif_pseudo | numeric | 0.1 | pseudocount to add to motifs |
| xalph | logical | FALSE | convert alphabet of target database to alphabet of query database |
| norc | logical | FALSE | Do not score reverse complements of motifs |
| incomplete_scores | logical | FALSE | Compute scores using only aligned columns |
| thresh | numeric | 0.5 | only report matches with significance values <= this value. Unless evalue = TRUE, this value must be < 1. |
| internal | logical | FALSE | forces the shorter motif to be completely contained in the longer motif |
| min_overlap | integer | 1 | only report matches that overlap by this number of positions or more. If query motif is smaller than this value, its width is used as the min overlap for that query |
| time | integer | NULL | Maximum runtime in CPU seconds (default: no limit) |
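
A short sketch of passing flags from the table through ..., assuming logical flags are supplied as named TRUE/FALSE arguments in the cmdfun style described above. The motif and database are the ones from the Examples section, and the can_run guard is a hypothetical convenience, not part of the package.

```r
# Guard (hypothetical convenience): only run when memes and the MEME Suite are available.
can_run <- requireNamespace("memes", quietly = TRUE) && memes::meme_is_installed()

if (can_run) {
  motif <- universalmotif::create_motif("CCRAAAW")
  database <- system.file("extdata", "flyFactorSurvey_cleaned.meme", package = "memes")

  # norc and internal are forwarded to tomtom through `...`:
  # skip reverse-complement scoring and require the shorter motif
  # to be fully contained in the longer one.
  res <- memes::runTomTom(motif, database, norc = TRUE, internal = TRUE)
}
```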

Value

data.frame of match results. Contains a best_match_motif column of universalmotif objects with the matched PWM from the database, a series of best_match_* columns describing the TomTom results of the match, and a tomtom list column storing the ranked list of possible matches to each motif. If a universalmotif data.frame is used as input, these columns are appended to the data.frame. If no matches are returned, the tomtom and best_match_motif columns will be set to NA and a message indicating this will be printed.

Citation

If you use runTomTom() in your analysis, please cite:

Shobhit Gupta, JA Stamatoyannopoulos, Timothy Bailey and William Stafford Noble, "Quantifying similarity between motifs", Genome Biology, 8(2):R24, 2007.

Licensing

The MEME Suite is free for non-profit use, but for-profit users should purchase a license. See the MEME Suite Copyright Page for details.

Examples

if (meme_is_installed()) {
  # Build a query motif from a consensus sequence
  motif <- universalmotif::create_motif("CCRAAAW")
  # Reference database shipped with the package
  database <- system.file("extdata", "flyFactorSurvey_cleaned.meme", package = "memes")

  runTomTom(motif, database)
}

snystrom/dremeR documentation built on Oct. 13, 2024, 10:48 p.m.