segmentation: Multi-resolution segmentation

View source: R/segmentation.R

segmentation    R Documentation

Multi-resolution segmentation


Compute a multi-resolution segmentation based on locally optimal enrichment scores and internal consistency of the segmentation tree.


segmentation(Yi, name, wmax = 0, wmin = 3, gamma = 1, Tw = 0)



Yi = vector of statistical scores produced by the enrichmentScore function.


name = prefix used to generate output file names.


wmax = maximum window size to be scanned, in number of Yi values. The default is length(Yi).


wmin = minimum size of segmented regions, in number of Yi values. The default is 3 consecutive Yi values.


gamma = numeric value in the interval ]0;1] acting as a stringency factor that influences both the minimum size and the enrichment score of segmented regions. Lower gamma values make the segmentation more stringent. The default is 1.0 (no effect).


Tw = numeric value in the interval ]-Inf;0] giving the minimal enrichment score accepted for segmented regions. Lower Tw values make this threshold more stringent. The default is 0 (no threshold).
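
The ranges stated above for gamma and Tw can be checked before a long computation is started. The helper below is a hypothetical sketch and is not part of the package: the name check_segmentation_args is invented for illustration, and the requirement that Yi holds at least wmin values is an assumption rather than documented behaviour.

```r
# Hypothetical helper (not part of MRA.TA): check arguments against the
# ranges documented for segmentation() before calling it.
check_segmentation_args <- function(Yi, wmin = 3, gamma = 1, Tw = 0) {
  stopifnot(is.numeric(Yi))
  stopifnot(wmin >= 1, length(Yi) >= wmin)  # assumption: at least wmin values
  stopifnot(gamma > 0, gamma <= 1)          # gamma in ]0;1]
  stopifnot(is.finite(Tw), Tw <= 0)         # Tw in ]-Inf;0]
  invisible(TRUE)
}

# Passes silently for valid settings, stops with an error otherwise
check_segmentation_args(rnorm(100), wmin = 20, gamma = 0.5, Tw = -2)
```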


segmentation returns a list with the following elements:


computation time for the optimization procedure


computation time for the domain fusion procedure


total number of locally optimal segments after optimization

total number of domains after domain fusion


number of maximum resolution domains


number of maximum scale domains


result file for the optimization procedure

result file for the domain fusion procedure


result file for maximum resolution domains


result file for maximum scale domains


best statistical score (pseudo p-values from combined rank-based scores) for each scanned window size

Output files

  • Common structure

    All output files are tab delimited text files including two header lines.

     line 1          = analysis parameters
     line 2          = column names
     following lines = data table
  • file.segments

    This file contains the optimal segment data resulting from the optimization procedure, defined by columns (i, w, Piw) as follows:

       i   = index of the last Yi value within the optimal segment
       w   = size in number of consecutive Yi values
       Piw = statistical score
  • Domain fusion result file, file.maxresolution, file.maxscale

    These files contain the segmented domain data, resulting from the domain fusion procedure, and defined by columns (id, container, start, end, wmin, wmax, P, i, w) as follows:

       id        = unique identifier for each domain
       container = identifier of parent domain, 0 meaning no parent
       start     = index of the first Yi value within domain
       end       = index of the last Yi value within domain
       wmin      = minimum size of included optimal segments, in number of Yi values
       wmax      = maximum size of included optimal segments, in number of Yi values
       P         = statistical score of the locally optimal segment
       i         = index of the last Yi value within the locally optimal segment
       w         = size of the locally optimal segment in number of consecutive Yi values
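
The common file layout and the (i, w) convention can be illustrated with base R alone. The sketch below writes a mock file with invented values; the text of the parameter line is an assumption, since its exact format is not described here. It then reads the file back and derives the first index of each optimal segment.

```r
# Mock file following the documented layout (all values are invented)
f <- tempfile(fileext = ".txt")
writeLines(c(
  "wmin=3; gamma=1; Tw=0",   # line 1: analysis parameters (format assumed)
  "i\tw\tPiw",               # line 2: column names
  "12\t5\t-3.2",             # following lines: data table
  "40\t10\t-7.9"
), f)

params <- readLines(f, n = 1)                               # keep the parameter line
segs   <- read.delim(f, skip = 1, stringsAsFactors = FALSE) # skip it for the table

# i is the index of the last Yi value and w the segment size,
# so the segment covers the Yi indices (i - w + 1) to i
segs$start <- segs$i - segs$w + 1
segs$end   <- segs$i
print(segs)
```

Skipping one line makes read.delim use the second line as column names, which matches the two-header layout described above.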

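The id and container columns describe a tree of nested domains. As a minimal sketch, assuming a small mock table with invented values, the nesting depth of each domain can be obtained by following container until it reaches 0. The function domain_depth is hypothetical and not part of the package.

```r
# Mock domain table (values invented for illustration)
doms <- data.frame(
  id        = 1:5,
  container = c(0, 1, 1, 2, 0),
  start     = c(1, 10, 60, 20, 120),
  end       = c(100, 50, 90, 30, 150)
)

# Hypothetical helper: count the ancestors of a domain by following
# the container column until it reaches 0 (no parent)
domain_depth <- function(id, doms) {
  depth  <- 0
  parent <- doms$container[match(id, doms$id)]
  while (parent != 0) {
    depth  <- depth + 1
    parent <- doms$container[match(parent, doms$id)]
  }
  depth
}

doms$depth <- vapply(doms$id, domain_depth, numeric(1), doms = doms)

# Domains without a parent, and the direct children of domain 1
top_level <- doms$id[doms$container == 0]
children1 <- doms$id[doms$container == 1]
print(doms)
```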

Benjamin Leblanc


Leblanc B., Comet I., Bantignies F., and Cavalli G., Chromosome Conformation Capture on Chip (4C): data processing. Book chapter in Polycomb Group Proteins: Methods and Protocols. Lanzuolo C., Bodega B. editors, Methods in Molecular Biology (2016).

See Also

calc.Qi, enrichmentScore, domainogram, plotOptimalSegments, plotDomains


# Simulate enrichment signal
n <- 2000
Mi <- rep(0, n)
Mi <- Mi + dnorm(1:n, 2.5*n/20,  n/40) + dnorm(1:n, 4*n/20,  50)
Mi <- Mi + 4 * dnorm(1:n, 5*n/10, n/10)
Mi <- Mi + dnorm(1:n, 16*n/20,  n/40) + dnorm(1:n, 17.5*n/20,  50)
Mi <- (Mi/max(Mi))^4 + rnorm(n)/4

# Compute enrichment scores
Yi <- enrichmentScore(Mi)

# Multi-resolution segmentation
seg.c <- segmentation(Yi, name="MRA_demo", wmin=20)

# Load segmentation results
# skip=1 skips the analysis parameters line; the second line holds the column names
opts <- read.delim(seg.c$file.segments, stringsAsFactors=FALSE, skip=1)
# NOTE: the element name file.domains is an assumption; check names(seg.c)
doms <- read.delim(seg.c$file.domains, stringsAsFactors=FALSE, skip=1)
doms.maxres <- read.delim(seg.c$file.maxresolution, stringsAsFactors=FALSE, skip=1)
doms.maxscale <- read.delim(seg.c$file.maxscale, stringsAsFactors=FALSE, skip=1)

# Visualization coordinates
x.s <- 1:n - 0.5
x.e <- 1:n + 0.5
w2y <- wSize2yAxis(n, logscale=TRUE)

layout(matrix(1:2, 2, 1), heights=c(3,1)/4)
par(mar=c(3, 4, 1, 2)) # default bottom, left, top, right = c(5, 4, 4, 2)

# Plot domainogram
domainogram(Yi, x.s, x.e, w2y)
plot(Mi, type='l', xaxs = 'i')

# Visualize segmentation results
plotOptimalSegments(opts, x.s, x.e, w2y, col="black")
plot(Mi, type='l', xaxs = 'i')

# Visualize multi-resolution domains
plotDomains(doms, x.s, x.e, w2y, col=rgb(0,0,0,0.5), border=rgb(0,0,0,0))
# Visualize max. resolution (green) and max. scale (red) domains
plotDomains(doms.maxres, x.s, x.e, w2y, col=rgb(0,1,0,0.5), border=rgb(0,0,0,0), lwd=2, lty=1, add=TRUE)
plotDomains(doms.maxscale, x.s, x.e, w2y, col=rgb(1,0,0,0.5), border=rgb(0,0,0,0), lwd=2, lty=1, add=TRUE)
legend("topright", c("Max. resolution", "Max. scale", "Both"), fill=c("green", "red", "chocolate"), bty='n')
plot(Mi, type='l', xaxs = 'i')

benja0x40/MRA.TA documentation built on March 13, 2023, 5:15 a.m.