buildTFBSmodel | R Documentation |
buildTFBSmodel
Learns (and optionally validates) a predictive model of transcription factor (TF) binding sites by
integrating genomic features extracted at the locations of potential binding sites identified by
a prior analysis (such as pattern-matching). The function implements a supervised machine
learning algorithm, extreme gradient boosting (XGBoost), which is trained on a dataset of
potential binding sites of the training TFs, each labelled as 'positive' or 'negative'
(i.e. ChIP-seq 'validated' or 'not validated' in a given condition).
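The 'positive'/'negative' labelling described above can be illustrated with a minimal base R sketch. This is not the package's implementation: the toy coordinates, the `label_sites()` helper, and the simple "any overlap with a peak" rule are all assumptions made for illustration.

```r
# Hypothetical toy data: candidate sites (e.g. from pattern-matching)
# and ChIP-seq peaks, both as chromosome/start/end intervals.
sites <- data.frame(chr   = c("1", "1", "2"),
                    start = c(100, 900, 150),
                    end   = c(110, 910, 160))
peaks <- data.frame(chr   = c("1", "2"),
                    start = c(50, 1000),
                    end   = c(450, 1400))

# Hypothetical helper (not part of Wimtrap): a site is labelled 'positive'
# when it overlaps at least one peak on the same chromosome.
label_sites <- function(sites, peaks) {
  overlaps_any <- vapply(seq_len(nrow(sites)), function(i) {
    any(peaks$chr == sites$chr[i] &
        peaks$start <= sites$end[i] &
        peaks$end >= sites$start[i])
  }, logical(1))
  ifelse(overlaps_any, "positive", "negative")
}

sites$label <- label_sites(sites, peaks)
print(sites)
```

Only the first site falls inside a peak on its chromosome, so it alone is labelled 'positive'. A labelled table of this kind, combined with genomic features, is the sort of input a supervised classifier such as XGBoost learns from.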
buildTFBSmodel(
  TFBSdata,
  ChIPpeaks,
  ChIPpeaks_length = 400,
  TFs_validation = NULL,
  model_assessment = TRUE,
  xgb_modeling = TRUE,
  balancing_only = FALSE
)
TFBSdata |
A named character vector as output by the |
ChIPpeaks |
A named character vector defining the local paths to BED files encoding the
location of ChIP-peaks. The vector is named according to the training transcription factors
that are described by the files indicated. Caution: the names of the |
ChIPpeaks_length |
An integer setting a fixed length for the ChIP-peaks, that are defined as the intervals of
|
TFs_validation |
|
model_assessment |
A logical. If |
xgb_modeling |
A logical. If |
balancing_only |
If |
An object of class xgb.Booster if xgb_modeling = TRUE. Otherwise, a list of two data.table
objects corresponding to the training and validation datasets.
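Because the return type depends on xgb_modeling, calling code may want to check which kind of object it received. The sketch below uses only base R; `describe_result()` and the stand-in objects are hypothetical and are not part of Wimtrap or xgboost.

```r
# Hypothetical helper (not part of Wimtrap): report what kind of object
# a call to buildTFBSmodel() returned.
describe_result <- function(result) {
  # Test the class first: a fitted booster may itself be a list internally.
  if (inherits(result, "xgb.Booster")) {
    "model"
  } else if (is.list(result) && length(result) == 2) {
    "datasets"
  } else {
    stop("unexpected return value")
  }
}

# Stand-ins for illustration only; real objects come from buildTFBSmodel().
fake_model    <- structure(list(), class = "xgb.Booster")
fake_datasets <- list(data.frame(x = 1), data.frame(x = 2))

describe_result(fake_model)
describe_result(fake_datasets)
```

The class test comes before the list test so that a booster object is never mistaken for the two-element list of datasets.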
getTFBSdata() for obtaining the datasets and predictTFBS() to predict transcription factor
binding site locations.
# Paths to the example genomic feature files shipped with the package
genomic_data.ex <- c(CE = system.file("extdata/conserved_elements_example.bed", package = "Wimtrap"),
                     DGF = system.file("extdata/DGF_example.bed", package = "Wimtrap"),
                     DHS = system.file("extdata/DHS_example.bed", package = "Wimtrap"),
                     X5UTR = system.file("extdata/x5utr_example.bed", package = "Wimtrap"),
                     CDS = system.file("extdata/cds_example.bed", package = "Wimtrap"),
                     Intron = system.file("extdata/intron_example.bed", package = "Wimtrap"),
                     X3UTR = system.file("extdata/x3utr_example.bed", package = "Wimtrap")
                     )
# Import the genomic data from local files (no biomart download)
imported_genomic_data.ex <- importGenomicData(biomart = FALSE,
                                              genomic_data = genomic_data.ex,
                                              tss = system.file("extdata/tss_example.bed", package = "Wimtrap"),
                                              tts = system.file("extdata/tts_example.bed", package = "Wimtrap"))
# Build the dataset of potential binding sites for the two example TFs
TFBSdata.ex <- getTFBSdata(pfm = system.file("extdata/pfm_example.pfm", package = "Wimtrap"),
                           TFnames = c("PIF3", "TOC1"),
                           organism = NULL,
                           genome_sequence = system.file("extdata/genome_example.fa", package = "Wimtrap"),
                           imported_genomic_data = imported_genomic_data.ex)
# Train the model, passing PIF3 as the validation TF
TFBSmodel.ex <- buildTFBSmodel(TFBSdata = TFBSdata.ex,
                               ChIPpeaks = c(PIF3 = system.file("extdata/PIF3_example.bed", package = "Wimtrap"),
                                             TOC1 = system.file("extdata/TOC1_example.bed", package = "Wimtrap")),
                               TFs_validation = "PIF3")