process_corecmotifs: Filter a list of CoRecMotifs and match them to reference...

View source: R/00_run_full_analysis.R

process_corecmotifs    R Documentation

Filter a list of CoRecMotifs and match them to reference motifs

Description

Removes CoRecMotifs with low z-scores or motif strengths and CoRecMotifs that do not replicate, then compares the remaining CoRecMotifs to a database of reference motifs to identify the best match. This convenience function calls, in order: filter_corecmotifs() and check_replicates(); find_match(); filter_corecmotifs() and check_replicates() a second time; summarize_corecmotifs(); and, optionally, compare_conditions().

Usage

process_corecmotifs(
  corecmotifs,
  reference_motifs_file,
  cluster_assignments = NULL,
  meme_path = NULL,
  motif_strength = 1,
  rolling_ic = 1,
  n_replicates = 2,
  eucl_distance = 0.4,
  min_overlap = 5,
  match_pvalue = 0.01,
  pbm_condition_groups = NULL,
  output_directory = NULL,
  output_base_name = NULL
)

Arguments

corecmotifs

list. The CoRecMotifs to process.

reference_motifs_file

character(1). The path to the MEME format file of reference motifs to match to.

cluster_assignments

data.frame or NULL. A table mapping the reference motifs to motif clusters or NULL to skip the cluster assignment step. See motif_clusters for expected columns. (Default: NULL)

meme_path

character(1) or NULL. The path to "meme/bin/" or NULL to rely on memes::runTomTom() to find it. (Default: NULL)

motif_strength

numeric(1) or NULL. The minimum motif strength to keep or NULL to skip filtering by motif strength. (Default: 1)

rolling_ic

numeric(1) or NULL. The minimum rolling IC to keep or NULL to skip filtering by rolling IC. (Default: 1)

n_replicates

integer(1). The minimum number of replicates to require. Set this to 1 to skip filtering by number of replicates. (Default: 2)

eucl_distance

numeric(1) or NULL. The maximum allowable Euclidean distance between replicate motifs or NULL to skip the replicate comparison step. (Default: 0.4)

min_overlap

integer(1). The minimum amount of overlap to require when comparing a CoRecMotif to a reference motif. (Default: 5)

match_pvalue

numeric(1) or NULL. The maximum match p-value to keep or NULL to skip filtering by match p-value. (Default: 0.01)

pbm_condition_groups

list(character) or NULL. The names of the PBM conditions to compare. Each element of the list should contain a group of PBM conditions to compare to each other. If the list elements are named, the names will be passed to the group_name parameter of compare_conditions(). (Default: NULL)
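As a minimal sketch of the expected shape, pbm_condition_groups is a list in which each element is a character vector naming one group of conditions. The condition names and group names below are hypothetical placeholders, not values from the package; substitute the PBM conditions present in your own CoRecMotifs.

```r
# Hypothetical PBM condition names; replace with the conditions in your data.
pbm_condition_groups <- list(
  # Named elements: per the argument description, each name is passed to
  # the group_name parameter of compare_conditions().
  cofactor_a = c("untreated_cofactor_a", "treated_cofactor_a"),
  cofactor_b = c("untreated_cofactor_b", "treated_cofactor_b")
)

# Each element is one group of conditions that are compared to each other.
for (group_name in names(pbm_condition_groups)) {
  cat(
    group_name, ": ",
    paste(pbm_condition_groups[[group_name]], collapse = " vs "),
    "\n",
    sep = ""
  )
}
```

Leaving the list elements unnamed is also allowed by the description above; in that case no names are passed on as group_name.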

output_directory

character(1) or NULL. The path to the directory where output files will be saved. No output files will be created unless output_directory or output_base_name is provided. (Default: NULL)

output_base_name

character(1) or NULL. The base name for output files. No output files will be created unless output_directory or output_base_name is provided. (Default: NULL)

Details

By default no output files are created. To save output files, you must provide output_directory, output_base_name, or both. The output files are named according to the following rules.

  • If output_directory is provided but output_base_name is not, the files will be saved in the provided directory using the base name "output".

  • If output_base_name is provided but output_directory is not, the files will be saved in the working directory using the provided base name.

  • The individual output file names will be identified by a suffix appended to the output base name.

  • The output after the first call to filter_corecmotifs() and check_replicates() will have the suffix "filtered_corecmotifs.rds".

  • The output of find_match() will have the suffix "matched_corecmotifs.rds".

  • The output after the second call to filter_corecmotifs() and check_replicates() will have the suffix "significant_corecmotifs.rds".

  • The output of summarize_corecmotifs() will have the suffix "significant_corecmotifs_summary.tsv".

  • The output of summarize_corecmotifs() with by_cluster = TRUE will have the suffix "significant_corecmotifs_summary_by_cluster.tsv".

  • The output of compare_conditions() will have the suffix "condition_comparisons.tsv".
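The naming rules above can be sketched as a small helper. This is an illustration only: build_output_paths() is a hypothetical function that is not part of the package, and the "_" separator between the base name and each suffix is an assumption, since the rules above do not state how the suffix is joined.

```r
# Hypothetical helper (not part of the package) that mirrors the naming rules.
build_output_paths <- function(output_directory = NULL,
                               output_base_name = NULL,
                               sep = "_") {  # separator is an assumption
  # Rule: no files are created unless a directory or base name is provided.
  if (is.null(output_directory) && is.null(output_base_name)) {
    return(NULL)
  }
  # Rule: a missing base name falls back to "output".
  if (is.null(output_base_name)) output_base_name <- "output"
  # Rule: a missing directory falls back to the working directory.
  if (is.null(output_directory)) output_directory <- "."

  # Suffixes copied from the list above, in pipeline order.
  suffixes <- c(
    "filtered_corecmotifs.rds",
    "matched_corecmotifs.rds",
    "significant_corecmotifs.rds",
    "significant_corecmotifs_summary.tsv",
    "significant_corecmotifs_summary_by_cluster.tsv",
    "condition_comparisons.tsv"
  )
  file.path(output_directory, paste(output_base_name, suffixes, sep = sep))
}

# Only a directory is given, so the base name defaults to "output".
print(build_output_paths(output_directory = "results"))
```

Calling build_output_paths() with neither argument returns NULL, matching the default of writing no files.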

Value

A filtered list of replicated CoRecMotifs that match a reference motif.

See Also

filter_corecmotifs(), check_replicates(), find_match(), summarize_corecmotifs(), compare_conditions()

Examples

print("FILL THIS IN")
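Until the placeholder above is filled in, a call might look like the following sketch. It assumes the package is already loaded, that an object named corecmotifs (a list of CoRecMotifs) exists in the session, and that "reference_motifs.meme" is a MEME format file; both the object and the path are hypothetical. The argument values shown are the documented defaults. The call is guarded so the snippet does nothing when those inputs are absent.

```r
# Hypothetical path to a MEME format file of reference motifs.
reference_motifs_file <- "reference_motifs.meme"

# Documented defaults, spelled out for illustration. No output_directory or
# output_base_name is given, so no output files would be written.
args <- list(
  reference_motifs_file = reference_motifs_file,
  motif_strength = 1,
  rolling_ic = 1,
  n_replicates = 2L,
  eucl_distance = 0.4,
  min_overlap = 5L,
  match_pvalue = 0.01
)

# Run only when the function, the input list, and the file are all available.
if (exists("process_corecmotifs") &&
    exists("corecmotifs") &&
    file.exists(reference_motifs_file)) {
  significant_corecmotifs <- do.call(
    process_corecmotifs,
    c(list(corecmotifs = corecmotifs), args)
  )
}
```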

Siggers-Lab/hTF_array documentation built on Feb. 7, 2024, 11:25 p.m.