getTFBSdata | R Documentation |
getTFBSdata
Writes files encoding datasets that characterize the genomic context
around motif occurrences along the genome for the considered transcription factors
(training and/or studied TFs).
getTFBSdata(
pfm = NULL,
TFnames = NULL,
organism = NULL,
genome_sequence = "getGenome",
imported_genomic_data,
matches = NULL,
strand_as_feature = FALSE,
pval_threshold = 0.001,
short_window = 20,
medium_window = 400,
long_window = 1000
)
pfm |
Path to a file including the position frequency or weight matrices (PFMs or PWMs) of the motifs recognized
by the considered transcription factors (training and/or studied TFs). This file can be in different formats, determined based
on the file extension: raw pfm (".pfm"), jaspar (".jaspar"), meme (".meme"), transfac (".transfac"), homer (".motif") or
cis-bp (".txt"). |
TFnames |
names of the considered transcription factors among those described in the |
organism |
Binomial name of the organism. Can be set to |
genome_sequence |
"getGenome" (by default) or local path to a FASTA file encoding the genomic sequence
of the organism. The default value allows the automatic download of the genomic sequence (when |
imported_genomic_data |
An object output by |
matches |
|
strand_as_feature |
A logical. Should the orientation of the matches relative to the
direction of transcription of the closest transcript be considered as a feature? Default is FALSE. |
pval_threshold |
P-value threshold to identify the matches with the primary motif of the transcription factors. Default is set to 0.001. |
short_window |
An integer (20 by default). Sets the length of the short-range window, centered on the potential binding sites, over which the genomic features are extracted. |
medium_window |
An integer (400 by default). Sets the length of the medium-range window, centered on the potential binding sites, over which the genomic features are extracted. |
long_window |
An integer (1000 by default). Sets the length of the long-range window, centered on the potential binding sites, over which the genomic features are extracted. |
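The three window arguments can be pictured with a minimal sketch in base R. It assumes 1-based inclusive coordinates and a window split evenly around the midpoint of the match. The helper `centered_window()` and the match coordinates are hypothetical and are not part of Wimtrap; the exact boundary convention used by the package may differ.

```r
# Hypothetical helper, not part of Wimtrap: coordinates of a window of a
# given length centered on a match (1-based, inclusive coordinates).
# Chromosome ends are not handled in this sketch.
centered_window <- function(start, end, width) {
  center <- (start + end) %/% 2        # integer midpoint of the match
  win_start <- center - width %/% 2    # half of the window lies upstream
  win_end <- win_start + width - 1     # so that the window spans `width` positions
  c(start = win_start, end = win_end)
}

# A made-up 10-bp match located at positions 5001 to 5010
match_start <- 5001
match_end <- 5010

# One column per window size, using the default lengths of getTFBSdata()
windows <- sapply(
  c(short = 20, medium = 400, long = 1000),
  function(w) centered_window(match_start, match_end, w)
)
print(windows)
```

Each genomic feature would then be summarized once per window, which is why a single feature can appear at three different ranges in the output.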
A vector giving the local paths to the tab-delimited files in which the results of pattern-matching and genomic feature extraction are written for each of the transcription factors considered. The first 5 fields of these files describe the location of the potential binding sites identified by pattern-matching. The following fields contain the raw score and/or p-value of the matches, the genomic features extracted at the location of the matches on the short-, medium- and long-range centered windows and, for the training TFs, the label ('1' = "positive" = "ChIP-validated in the considered condition" or '0' = "negative").
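As a rough illustration of how one of these files might be inspected, the sketch below writes a small mock tab-delimited file and reads it back with base R. Every column name and value in the mock is invented for the example; the real files produced by getTFBSdata() will have different names and many more feature columns. Only the layout (5 location fields first, then scores, features and a label) follows the description above.

```r
# Mock file standing in for one output of getTFBSdata(). All column names
# and values are invented for illustration and will differ from real output.
mock_path <- tempfile(fileext = ".tsv")
mock <- data.frame(
  seqnames = c("1", "1", "2"),
  start = c(1001, 20501, 330),
  end = c(1010, 20510, 339),
  width = c(10, 10, 10),
  strand = c("+", "-", "+"),
  matchLogPval = c(-3.2, -4.1, -3.6),   # hypothetical score column
  DHS_400bp = c(0.8, 0.1, 0.5),         # hypothetical feature column
  ChIPpeak = c(1, 0, 0)                 # hypothetical label column
)
write.table(mock, mock_path, sep = "\t", quote = FALSE, row.names = FALSE)

# Read the tab-delimited file back, as one would with a path returned
# by getTFBSdata()
tfbs <- read.delim(mock_path, stringsAsFactors = FALSE)

location <- tfbs[, 1:5]      # first 5 fields: location of the potential sites
features <- tfbs[, -(1:5)]   # remaining fields: scores, features, label

# Count negative ('0') and positive ('1') sites in the mock data
print(table(tfbs$ChIPpeak))
```

With real output, `mock_path` would be replaced by one element of the vector returned by getTFBSdata().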
importGenomicData() for importing genomic data and
buildTFBSmodel() to train a predictive model of transcription factor binding sites.
# Paths to the example genomic data files shipped with the package
genomic_data.ex <- c(CE = system.file("extdata/conserved_elements_example.bed", package = "Wimtrap"),
                     DGF = system.file("extdata/DGF_example.bed", package = "Wimtrap"),
                     DHS = system.file("extdata/DHS_example.bed", package = "Wimtrap"),
                     X5UTR = system.file("extdata/x5utr_example.bed", package = "Wimtrap"),
                     CDS = system.file("extdata/cds_example.bed", package = "Wimtrap"),
                     Intron = system.file("extdata/intron_example.bed", package = "Wimtrap"),
                     X3UTR = system.file("extdata/x3utr_example.bed", package = "Wimtrap")
                     )
# Import the genomic data from the local files (no download through biomart)
imported_genomic_data.ex <- importGenomicData(biomart = FALSE,
                                              genomic_data = genomic_data.ex,
                                              tss = system.file("extdata/tss_example.bed", package = "Wimtrap"),
                                              tts = system.file("extdata/tts_example.bed", package = "Wimtrap"))
# Scan the example genome with the motifs of PIF3 and TOC1 and extract the features
TFBSdata.ex <- getTFBSdata(pfm = system.file("extdata/pfm_example.pfm", package = "Wimtrap"),
                           TFnames = c("PIF3", "TOC1"),
                           organism = NULL,
                           genome_sequence = system.file("extdata/genome_example.fa", package = "Wimtrap"),
                           imported_genomic_data = imported_genomic_data.ex)