wrapPWMEnrich: Transcription factor binding site enrichment

View source: R/SYB_wrapPWMEnrich.R

Description

This function uses the PWMEnrich package for transcription factor binding site (TFBS) enrichment.

Usage

wrapPWMEnrich(
  sequences,
  newheader = NULL,
  annoColumn = NULL,
  name.organism = "hsapiens",
  projectfolder = "GEX/TFBS",
  projectname = "",
  figure.res = 300,
  applyFilter = FALSE,
  filtercat1 = "adj.P.Val",
  filtercat1.decreasing = FALSE,
  filtercat1.function = abs,
  filtercat1.threshold = 0.05,
  filtercat2 = "logFC",
  filtercat2.decreasing = TRUE,
  filtercat2.function = abs,
  filtercat2.threshold = log2(1.5),
  PromLookup = TRUE,
  id.type = "ENTREZID",
  id.column = "ENTREZID",
  PromSeqUpstreamTSS = 2000,
  PromSeqDownstreamTSS = 200,
  SearchSelMotifs = NULL,
  motif.min.score = 0.9
)

Arguments

sequences

Data frame, character file path to a data frame, or named list of data frames. Each data frame must contain either a column with Entrez IDs (if PromLookup == TRUE) or sequence coordinates (if PromLookup == FALSE). The columns required in the latter case are described under PromLookup below.

newheader

NULL if sequences is already supplied with a header. Otherwise a character vector with the new header.

annoColumn

Character vector. Column name(s) of the sequences object containing sequence annotation to keep.

name.organism

currently human data only (hg19).

projectfolder

output directory.

projectname

character prefix for output name.

figure.res

numeric resolution for png.

applyFilter

(boolean) If TRUE, sequences are filtered by the categories and thresholds given below. With the default filter functions (abs), filter values are converted to ABSOLUTE values. The following filtering criteria are optional and ignored if applyFilter = FALSE:

filtercat1

column name of first category to filter sequences (e.g. p-values).

filtercat1.decreasing

(boolean) direction to order filtercat1.

filtercat1.function

Function used to transform filter category 1 (given without quotes), e.g. abs for absolute values or identity for no transformation.

filtercat1.threshold

Threshold for filtercat1, or a string such as 'top123' to select the top hits.

filtercat2

column name of second category to filter sequences (e.g. effect size).

filtercat2.decreasing

(boolean) direction to order filtercat2.

filtercat2.function

Function used to transform filter category 2 (given without quotes), e.g. abs for fold changes.

filtercat2.threshold

Threshold for filtercat2, or a string such as 'top123' to select the top hits.
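
The filter arguments can be read as one rule per category: transform the column, then either compare it with a numeric threshold or keep the top-ranked rows. The base R sketch below illustrates that reading only. filter_by_category is a hypothetical helper, not part of the package, and it assumes that decreasing = FALSE keeps values at or below the threshold, that decreasing = TRUE keeps values at or above it, and that 'top123' means the 123 top-ranked rows. The statistics in the toy table are made up.

```r
# Hypothetical helper illustrating one filter category; not the package's code.
filter_by_category <- function(df, column, fun = abs, threshold, decreasing = FALSE) {
  values <- fun(df[[column]])
  if (is.character(threshold) && grepl("^top[0-9]+$", threshold)) {
    # Assumed meaning of "topN": keep the first N rows after ordering.
    n <- as.integer(sub("^top", "", threshold))
    ord <- order(values, decreasing = decreasing)
    return(df[head(ord, n), , drop = FALSE])
  }
  # Assumed direction: decreasing = TRUE keeps large values, FALSE keeps small ones.
  keep <- if (decreasing) values >= threshold else values <= threshold
  df[which(keep), , drop = FALSE]
}

# Toy differential expression table (made-up values).
toy <- data.frame(
  ENTREZID  = c("7157", "672", "1956", "2064"),
  logFC     = c(1.2, -0.9, 0.1, -2.0),
  adj.P.Val = c(0.01, 0.03, 0.20, 0.001),
  stringsAsFactors = FALSE
)

# Apply both categories with the default settings listed under Usage.
step1 <- filter_by_category(toy, "adj.P.Val", abs, 0.05, decreasing = FALSE)
step2 <- filter_by_category(step1, "logFC", abs, log2(1.5), decreasing = TRUE)
step2$ENTREZID

# Alternative: keep the two rows with the largest absolute fold change.
top2 <- filter_by_category(toy, "logFC", abs, "top2", decreasing = TRUE)
top2$ENTREZID
```

With the defaults, filtercat1 keeps rows with small adjusted p-values and filtercat2 then keeps rows with a large absolute log fold change, which is why the two categories use opposite ordering directions.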

PromLookup

(boolean) If TRUE, all promoter sequences corresponding to the genes in sequences are downloaded. This requires a column with Entrez IDs (column name given in id.column). If FALSE, sequences are downloaded according to the coordinates given in sequences. This requires columns with chromosome, start, stop and strand information! Additional meta columns are allowed.

id.type

Character with the identifier type from the annotation package ("ENTREZID" or "SYMBOL"). Gene symbols will be converted to Entrez IDs prior to enrichment analysis.

id.column

character with column name for identifier variable in sequences.

PromSeqUpstreamTSS

Extent of the promoter region to download upstream of the TSS.

PromSeqDownstreamTSS

Extent of the promoter region to download downstream of the TSS.
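
As a rough illustration of how the two extents define a strand-aware window around a TSS, the base R sketch below computes promoter coordinates following the convention used by GenomicRanges::promoters(). Whether wrapPWMEnrich derives its regions in exactly this way is an assumption, as is the unit (bases). promoter_window is a hypothetical helper and the TSS position is a toy value.

```r
# Hypothetical helper; the package may derive promoter regions differently.
promoter_window <- function(tss, strand, upstream = 2000, downstream = 200) {
  plus <- strand == "+"
  data.frame(
    # On the minus strand, "upstream" lies at higher coordinates.
    start  = ifelse(plus, tss - upstream, tss - downstream + 1),
    end    = ifelse(plus, tss + downstream - 1, tss + upstream),
    strand = strand,
    stringsAsFactors = FALSE
  )
}

# Same toy TSS position on both strands, with the default extents from Usage.
win <- promoter_window(tss = c(10000, 10000), strand = c("+", "-"))
win$width <- win$end - win$start + 1
win
```

Under this convention each window spans upstream + downstream positions, so the defaults give regions of 2200 bases on either strand.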

SearchSelMotifs

Character vector of selected motifs to search for in sequences. The search is omitted if NULL.

motif.min.score

Minimum score for matching a motif PWM to a target sequence (ignored if SearchSelMotifs = NULL).

Details

The function takes the input genes in sequences and looks up all promoter sequences by (unique) Entrez IDs referring to human genome build hg19. If PromLookup == FALSE, the lookup of promoter sequences is omitted and the sequences of interest must be given in sequences as coordinates instead. The dataset may be filtered by the designated filter criteria if desired. The sequences are then passed to motifEnrichment from the PWMEnrich package to identify enriched transcription factor binding motifs for the input sequences. Optionally, preselected motifs given in SearchSelMotifs are looked up in sequences. All result tables and plots are stored in the project folder.
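
A minimal usage sketch for the gene-based workflow follows. It assumes that the systemsbio package is installed and exports wrapPWMEnrich, and the call is skipped otherwise. The Entrez IDs are real (TP53, BRCA1, EGFR), but the statistics, the temporary output folder and the project name are made up for illustration.

```r
# Toy input: Entrez IDs for TP53, BRCA1 and EGFR with made-up statistics.
genes <- data.frame(
  ENTREZID  = c("7157", "672", "1956"),
  logFC     = c(1.4, -1.1, 0.2),
  adj.P.Val = c(0.002, 0.04, 0.60),
  stringsAsFactors = FALSE
)

# Arguments not listed here keep the defaults shown under Usage.
args <- list(
  sequences     = genes,
  projectfolder = file.path(tempdir(), "TFBS"),  # illustrative output directory
  projectname   = "toy",                         # illustrative file prefix
  applyFilter   = TRUE,
  PromLookup    = TRUE,
  id.type       = "ENTREZID",
  id.column     = "ENTREZID"
)

# Run only if the package is available (assumption: wrapPWMEnrich is exported).
if (requireNamespace("systemsbio", quietly = TRUE)) {
  report <- do.call(systemsbio::wrapPWMEnrich, args)
}
```

Collecting the arguments in a list first keeps the input checkable without the package; the enrichment itself needs the annotation and background data mentioned under See Also.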

Value

groupReport of motifEnrichment results.

Author(s)

Frank Ruehle

See Also

MotifDb, PWMEnrich, PWMEnrich.Hsapiens.background


frankRuehle/systemsbio documentation built on Sept. 14, 2020, 1:18 a.m.