calculate_motif_enrichment: Binding Site Enrichment Value Calculation
In transite: RNA-binding protein motif analysis


Calculates binding site enrichment or depletion scores between predefined foreground and background sequence sets. The significance of each enrichment value is estimated with a Monte Carlo test.

calculate_motif_enrichment(
  foreground_scores_df,
  background_scores_df,
  background_total_sites,
  background_absolute_hits,
  n_transcripts_foreground,
  max_fg_permutations = 1e+06,
  min_fg_permutations = 1000,
  e = 5,
  p_adjust_method = "BH"
)

`foreground_scores_df`	result of `score_transcripts` on the foreground sequence set (the foreground sequence set must be a subset of the background sequence set)
`background_scores_df`	result of `score_transcripts` on background sequence set
`background_total_sites`	number of potential binding sites per sequence (returned by `score_transcripts`)
`background_absolute_hits`	number of putative binding sites per sequence (returned by `score_transcripts`)
`n_transcripts_foreground`	number of sequences in the foreground set
`max_fg_permutations`	maximum number of foreground permutations performed in Monte Carlo test for enrichment score
`min_fg_permutations`	minimum number of foreground permutations performed in Monte Carlo test for enrichment score
`e`	integer-valued stop criterion for the enrichment score Monte Carlo test: the permutation process is aborted after observing `e` random enrichment values that are more extreme than the actual enrichment value
`p_adjust_method`	method used to adjust the p-values from the Monte Carlo tests for multiple testing (to avoid alpha error accumulation), see `p.adjust`
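The interplay of `max_fg_permutations`, `min_fg_permutations`, and `e` can be illustrated with a minimal base R sketch. This is not the transite implementation: the helper `mc_enrichment_test`, the enrichment statistic (hit density in the sampled subset relative to the full set), the two-sided comparison on the log scale, and the `(extreme + 1) / (n + 1)` p-value estimate are all assumptions made for illustration, and the toy hit counts are invented.

```r
# Hypothetical helper, not part of transite: a Monte Carlo test with an
# early-stopping rule modeled on the `e` argument described above.
mc_enrichment_test <- function(hits, sites, fg_idx,
                               max_perm = 1e4, min_perm = 1000, e = 5) {
  # Assumed statistic: hit density of a subset relative to the full set.
  # Requires positive site counts in every sequence.
  stat <- function(idx) {
    (sum(hits[idx]) / sum(sites[idx])) / (sum(hits) / sum(sites))
  }
  observed <- stat(fg_idx)
  n_fg <- length(fg_idx)
  extreme <- 0
  n <- 0
  while (n < max_perm) {
    n <- n + 1
    # Draw a random foreground of the same size from the background
    random_stat <- stat(sample.int(length(hits), n_fg))
    # Two-sided on the log scale: enrichment and depletion both count
    if (abs(log(random_stat)) >= abs(log(observed))) extreme <- extreme + 1
    # Stop early once enough extreme values were seen (assumed rule)
    if (n >= min_perm && extreme >= e) break
  }
  list(enrichment = observed,
       p_value = (extreme + 1) / (n + 1),
       p_value_n = n)
}

# Toy data (invented): 10 background sequences, the first and ninth
# form the foreground set.
set.seed(1)
hits <- c(4, 1, 2, 1, 1, 2, 1, 1, 3, 1)
sites <- rep(20, 10)
res <- mc_enrichment_test(hits, sites, fg_idx = c(1, 9))
```

In this sketch, a clearly non-significant motif stops after `min_perm` permutations, while a motif with few extreme random values keeps running until `max_perm` is reached, which is where the additional resolution of the p-value estimate is needed.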

A data frame with the following columns:

`motif_id`	the motif identifier that is used in the original motif library
`motif_rbps`	the gene symbol of the RNA-binding protein(s)
`enrichment`	binding site enrichment between foreground and background sequences
`p_value`	unadjusted p-value from Monte Carlo test
`p_value_n`	number of Monte Carlo test permutations
`adj_p_value`	adjusted p-value from Monte Carlo test (usually FDR)
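As a small illustration of how `adj_p_value` relates to `p_value`, the following base R sketch applies `p.adjust` to a handful of p-values. The p-values are invented for the example; only `p.adjust` and its `method` argument come from base R's stats package.

```r
# Invented unadjusted p-values, one per motif
p_values <- c(0.001, 0.012, 0.030, 0.200, 0.750)

# Benjamini-Hochberg adjustment (the default p_adjust_method, "BH")
adj_bh <- p.adjust(p_values, method = "BH")

# A more conservative alternative accepted by p.adjust
adj_bonferroni <- p.adjust(p_values, method = "bonferroni")

# Indices of motifs that pass a 5% threshold after BH adjustment
significant <- which(adj_bh <= 0.05)
```

Any method name listed in `p.adjust.methods` can be passed as `p_adjust_method`; with `"BH"` the adjusted values control the false discovery rate, which is why the column is described as "usually FDR".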

Other matrix functions: run_matrix_spma(), run_matrix_tsma(), score_transcripts_single_motif(), score_transcripts()

library(transite)

# Foreground set; it must be a subset of the background set
foreground_seqs <- c("CAGUCAAGACUCC", "AAUUGGUGUCUGGAUACUUCCCUGUACAU",
                     "AGAU", "CCAGUAA")
background_seqs <- c(foreground_seqs, "CAACAGCCUUAAUU", "CUUUGGGGAAU",
                     "UCAUUUUAUUAAA", "AUCAAAUUA", "GACACUUAAAGAUCCU",
                     "UAGCAUUAACUUAAUG", "AUGGA", "GAAGAGUGCUCA",
                     "AUAGAC", "AGUUC")

# Score both sequence sets with score_transcripts
foreground_scores <- score_transcripts(foreground_seqs, cache = FALSE)
background_scores <- score_transcripts(background_seqs, cache = FALSE)

# Calculate enrichment values; max_fg_permutations is capped at 1000
# here instead of the default 1e+06
enrichments_df <- calculate_motif_enrichment(
  foreground_scores$df,
  background_scores$df,
  background_scores$total_sites,
  background_scores$absolute_hits,
  length(foreground_seqs),
  max_fg_permutations = 1000
)