Description Usage Arguments Details Value References Examples
Fit the Romulus model.
| 1 2 3 | fitRomulus(cuts1, cuts2, anno, priors, bins1, bins2, nbound = NA,
  PriorLik = NULL, addIntercept = T, mixingDelta = 0.5, maxIter = 100,
  maxPostProbDiff = 0.001)
 | 
| cuts1 | integer matrix of forward strand DNase I cuts, with a row for each candidate binding site. The columns should correspond to the genomic locations upstream and within the candidate binding site. | 
| cuts2 | integer matrix of reverse strand DNase I cuts, with a row for each candidate binding site. The columns should correspond to the genomic locations within the candidate binding site and downstream. | 
| anno | data frame with annotations of the candidate binding sites. The numeric columns to be used are specified in  | 
| priors | character vector or list of character vectors specifying the names of columns from  | 
| bins1 | integer vector or matrix specifying how the columns of  | 
| bins2 | integer vector or matrix specifying how the columns of  | 
| nbound | optional integer, specifying the number of bound states. If not provided, the value will be guessed from the length of  | 
| PriorLik | optional numeric matrix, with a row for each candidate binding site and a column for each bound state, containing the initial prior likelihoods. If not provided, the default initialization procedure will be used. | 
| addIntercept | logical. Should an additional intercept term be included in the model? | 
| maxIter | integer. Maximal number of Expectation-Maximization iterations to perform. | 
| maxPostProbDiff | numeric. The Expectation-Maximization procedure will be terminated when the absolute differences in posterior probabilities between the iterations will become smaller than this value. | 
Fit the Romulus model using an Expectation-Maximization approach.
A list with the elements
| nstates | number of states (including the unbound state). | 
| nbound | number of bound states. | 
| bins1 | matrix specifying how the columns of  | 
| bins2 | matrix specifying how the columns of  | 
| binsizes1 | list specifying, for each state, how many columns of  | 
| binsizes2 | list specifying, for each state, how many columns of  | 
| Beta | list of estimated logistic regression coefficients β_j^{(k)} for each state. | 
| NegBinom | matrix of estimated negative binomial parameters for each state, with columns  | 
| Lambda1 | list of estimated multinomial parameters λ_b^{+(k)} for each state. | 
| Lambda2 | list of estimated multinomial parameters λ_b^{-(k)} for each state. | 
| LambdaReg1 | matrix of multinomial parameter estimates for forward strand DNase I cuts, smoothed by replacing the fixed-value bins by a piecewise linear function, with a row for each state. | 
| LambdaReg2 | matrix of multinomial parameter estimates for reverse strand DNase I cuts, as above. | 
| PriorProb | matrix of prior probabilities calculated from the logistic regression component, with a row for each candidate binding site and a column for each state. | 
| LogLikelihood | matrix of log-likelihoods calculated from the negative binomial and multinomial components, with a row for each candidate binding site and a column for each state. | 
| PostProb | matrix of posterior probabilities, calculated from the complete model, with a row for each candidate binding site and a column for each state. | 
Jankowski, A., Tiuryn, J. and Prabhakar, S. (2016). Romulus: robust multi-state identification of transcription factor binding sites from DNase-seq data. Bioinformatics 32, 2419–2426. doi: 10.1093/bioinformatics/btw209.
| 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 | # Clip the DNase-seq data for NRSF at 99.9% quantile
thresh <- max(quantile(NRSF.cuts, 0.999), 1L)
cuts <- pmin(NRSF.cuts, thresh)
# Forward strand cuts only upstream and within the candidate binding site
cuts1 <- cuts[, 1:(ncol(cuts) / 2 - NRSF.margin)]
# Reverse strand cuts only within the candidate binding site and downstream
cuts2 <- cuts[, ncol(cuts) / 2 + (NRSF.margin + 1):(ncol(cuts) / 2)]
# Take 20 bp bins outside the candidate binding site and 1 bp bins within it
NRSF.width <- (ncol(cuts) - 4 * NRSF.margin) / 2
bins1 <- c(rep(1:10, each = 20), 10 + 1:NRSF.width)
bins2 <- c(1:NRSF.width, NRSF.width + rep(1:10, each = 20))
# Fit the Romulus model
r.fit <- fitRomulus(cuts1, cuts2, NRSF.anno, list(c("score")), bins1, bins2)
# Benchmarking of Romulus and CENTIPEDE
## Not run: 
library(ROCR)
library(CENTIPEDE)
c.fit <- fitCentipede(Xlist = list(DNase = as.matrix(NRSF.cuts)),
  Y = cbind(1, NRSF.anno$score))
r.pred <- prediction(1 - r.fit$PostProb[, r.fit$nstates],
  as.integer(NRSF.anno$signalValue > 0))
r.perf <- performance(r.pred, measure = "tpr", x.measure = "fpr")
r.auc <- performance(r.pred, measure = "auc")
c.pred <- prediction(c.fit$PostPr, as.integer(NRSF.anno$signalValue > 0))
c.perf <- performance(c.pred, measure = "tpr", x.measure = "fpr")
c.auc <- performance(c.pred, measure = "auc")
plot(r.perf, col = "red",
  main = "NRSF binding predictions benchmarked using ChIP-seq data")
lines(c.perf@x.values[[1]], c.perf@y.values[[1]], col = "blue")
legend("bottomright", col = c("red", "blue"), lty = 1,
  legend = c(sprintf("Romulus, AUC = %0.4f", r.auc@y.values[[1]]),
  sprintf("CENTIPEDE, AUC = %0.4f", c.auc@y.values[[1]])))
## End(Not run)
 | 
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.