dmpDensity: Linear density of DMPs at a given genomic region
In genomaths/MethylIT.utils: MethylIT utility

dmpDensity

R Documentation

Linear density of DMPs at a given genomic region

Description

The linear density of DMPs in a given genomic region (GR) is defined according with the classical terminology in physics, i.e., as the measure of the physical quantity of any characteristic value per unit of length. In the current case, as the amount of DIMPs per nucleotide base.

Usage

dmpDensity(
  GR,
  column = 1,
  cut.col = 1,
  cutoff,
  Chr = NULL,
  start.pos = NULL,
  end.pos = NULL,
  int.size1 = NULL,
  int.size2 = NULL,
  breaks = NULL,
  scaling = TRUE,
  plot = FALSE,
  noDMP.dens = TRUE,
  xlabel = "Coordinate",
  ylabel = "Normalized density",
  col.dmp = "red",
  col.ndmp = "blue",
  yintercept = 0.25,
  col.yintercept = "magenta",
  type.yintercept = "dashed",
  dig.lab = 3
)

Arguments

`GR`	A genomic GRanges object carrying the genomic region where the estimation of the DMP density will be accomplished.
`cut.col`	Integer denoting the GR metacolumn where the decision variable about whether a position is DMP is located. Default cut.col = 1.
`cutoff`	Cut value to decide wheter the value of the variable used to estimate the density is a DMP at each position. If missing, then cutoff is estimated as the first queantile greater than zero from the values given in the GR column cut.col.
`Chr`	A character string. Default NULL. If the GR object comprises several chromosomes, then one chromosome must be specified. Otherwise the density of first chromosome will be returned.
`start.pos, end.pos`	Start and end positions, respectively, of the GR where the density of DMPs will be estimated. Default NULL. If NULL densities will be estimated for the whole GR and the specified chromosome.
`int.size1, int.size2`	The interval/window size where the density of DMP and no DMPs are computed. Default Null.
`breaks`	Integer. Number of windows/intervals to split the GR. Deafult NULL. If provided, then it is applied to compute the densities of DMPs and no-DMPs. If 'int.size1', 'int.size2', and 'breaks' are NULL, then the breaks are computed as: `breaks <- min(150, max(start(x))/nclass.FD(start(x)), na.rm = TRUE)`, where function nclass.FD (`nclass`) applies Freedman-Diaconis algorithm.
`scaling`	Logic value to deside whether to perform the scaling of the estimated density values or not. Default is TRUE.
`plot`	Logic. Whether to produce a grahic or not. Default, plot = TRUE.
`noDMP.dens`	Logic whether to produce the graphics for no-DMP density. Default is TRUE
`xlabel`	X-axis label. Default xlabel = 'Coordinate'.
`ylabel`	Y-axis label. Default ylabel = 'Normalized density'.
`col.dmp`	Color for the density of DMPs in the graphic.
`col.ndmp`	Color for the density of no DMPs in the graphic.
`yintercept`	If plot == TRUE, this is the position for an horizantal line that intercept the y-axis. Default yintercept = 0.25.
`col.yintercept`	Color for the horizantal line 'yintercept'. Default col.yintercept = 'blue'
`type.yintercept`	Line type for the horizantal line 'yintercept'. Default type.yintercept = 'dashed'.
`dig.lab`	integer which is used when labels are not given. It determines the number of digits used in formatting the break numbers.

Details

Since the number of DIMPs along the DNA sequence vary, the local density of DMPs \rho_i at a fixed interval \Delta l_i is defined by the quotient \rho_i = \Delta DMP_i/\Delta l_i is the amount of DIMPs at the fixed interval. Likewise the local density of non-DIMPs is defined as \rho_i = \Delta nonDMP_i/\Delta l_i. Notice that for a specified methylation context, e.g., CG, \Delta CG_i - \Delta DMP_i, where \Delta CG is the amount CG positions at the given interval. The linear densities are normalized as \rho_i/\rho_max, where \rho_max is the maximum of linear density found in a given GR.

Value

If plot is TRUE will return a graphic with the densities of DMPs and and no DMPs. If plot is FALSE a data frame object with the density of DMPs and not DMPs will be returned.

Author(s)

Robersy Sanchez

Examples

set.seed(349)
## An auxiliary function to generate simulated hypothetical values from a
## variable with normal distribution
hypDT <- function(mean, sd, n, num.pos, noise) {
    h <- hist(rnorm(n, mean = mean, sd = sd), breaks = num.pos, plot = FALSE)
    hyp <- h$density * 60 + runif(length(h$density)) * noise
    return(hyp)
}

## To generate a matrix of values with variations introduced by noise
hyp <- hypDT(mean = 5, sd = 30, n = 10^5, noise = 4, num.pos = 8000)
## A GRanges object is built, which will carries the previous matrix on its
## meta-columns
l <- length(hyp)
starts <- seq(0, 30000, 3)[1:l]
ends <- starts
GR <- GRanges(seqnames = 'chr1', ranges = IRanges(start = starts,
                end = ends))
mcols(GR) <- data.frame(signal = hyp)

# If plot is TRUE a grphic is printed. Otherwise data frame is returned.
p <- dmpDensity(GR, plot = FALSE)

# If ggplot2 package is installed, then graphic can customized using
# the returned data frame 'p':

# library(ggplot2)
## Auxiliar function to write scientific notation in the graphics
# fancy_scientific <- function(l) {
#   #'turn in to character string in scientific notation
#   l <- format( l, scientific = TRUE, digits = 1 )
#   l <- gsub('0e\+00','0',l)
#   #'quote the part before the exponent to keep all the digits
#   l <- gsub('^(.*)e', ''\1'e', l)
#   #'turn the 'e+' into plotmath format
#   l <- gsub('e', '%*%10^', l)
#   l <- gsub('[+]', '', l )
#   #'return this as an expression
#   parse(text=l)
# }
#
# max.pos = max(p$DMP.coordinate)
# ggplot(data=p) +
#   geom_line(aes(x=DMP.coordinate, y=DMP.density), color='red') +
#   geom_hline(aes(yintercept=0.25), linetype='dashed',
#              colour='blue', show.legend=FALSE ) +
#   geom_line(aes(x=coordinate, y=density), color='blue') +
#   xlab('Coordinate') + ylab('Normalized density') +
#   scale_y_continuous(breaks=c(0.00, 0.25, 0.50, 0.75, 1.00)) +
#   scale_x_continuous(breaks=c(0.00, 0.25 *max.pos, 0.50*max.pos,
#                               0.75*max.pos, max.pos),
#                      labels = fancy_scientific) +
#   expand_limits(y=0)

genomaths/MethylIT.utils documentation built on July 4, 2023, 12:05 a.m.