hic_loess: Perform joint loess normalization on two Hi-C datasets

View source: R/hic_loess.R

hic_loess R Documentation

Perform joint loess normalization on two Hi-C datasets




hic_loess(
  hic.table,
  degree = 1,
  span = NA,
  loess.criterion = "gcv",
  Plot = FALSE,
  Plot.smooth = TRUE,
  parallel = FALSE,
  BP_param = bpparam()
)



hic.table
A hic.table or a list of hic.tables generated by the create.hic.table function. To normalize multiple chromosomes from each cell line at once using parallel computing, enter a list of hic.tables and set parallel = TRUE.

degree
Degree of polynomial to be used for loess. Options are 0, 1, 2. The default setting is degree = 1.

span
User-set span for loess. If set to NA, the span is selected automatically according to loess.criterion. Defaults to NA so that automatic span selection is performed. If you already know the span, setting it manually skips the automatic search and significantly reduces computation time.

loess.criterion
Automatic span selection criterion. Can be either 'gcv' for generalized cross-validation or 'aicc' for the bias-corrected Akaike Information Criterion. Span selection uses a slightly modified version of the loess.as() function from the fANCOVA package. Defaults to 'gcv'.

Plot
Logical, should the MD plot showing before/after loess normalization be output? Defaults to FALSE.

Plot.smooth
Logical, defaults to TRUE, indicating the MD plot will be a smooth scatter plot. Set to FALSE for a scatter plot with discrete points.

parallel
Logical, set to TRUE to use the parallel package's parallelized computing. Only works on Unix operating systems. Only useful when entering a list of hic.tables. Defaults to FALSE.

BP_param
Parameters for BiocParallel. Defaults to bpparam(). See the BiocParallel help for more information: http://bioconductor.org/packages/release/bioc/vignettes/BiocParallel/inst/doc/Introduction_To_BiocParallel.pdf
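To give a feel for what automatic span selection by generalized cross-validation involves, here is a minimal base-R sketch. It is not the package's code or the fANCOVA implementation: the helper name gcv_span, the span grid, and the simulated data are all assumptions made for illustration, and the GCV score used is the textbook form n * RSS / (n - tr(L))^2.

```r
# Hypothetical helper (not part of HiCcompare or fANCOVA): pick the loess
# span with the smallest generalized cross-validation (GCV) score.
gcv_span <- function(x, y, spans = seq(0.2, 0.9, by = 0.1), degree = 1) {
  scores <- vapply(spans, function(s) {
    fit <- loess(y ~ x, span = s, degree = degree)
    n <- fit$n                        # number of observations used
    rss <- sum(residuals(fit)^2)      # residual sum of squares
    n * rss / (n - fit$trace.hat)^2   # textbook GCV score
  }, numeric(1))
  spans[which.min(scores)]
}

# Simulated data, for illustration only
set.seed(2)
x <- seq(0, 10, length.out = 200)
y <- sin(x) + rnorm(200, sd = 0.3)

best_span <- gcv_span(x, y)
best_span
```

Every candidate span requires a full loess fit, which is why supplying span directly avoids most of the cost.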


The function takes in a hic.table or a list of hic.table objects created with the create.hic.table function. If you wish to perform joint normalization on Hi-C data for multiple chromosomes, use a list of hic.tables. The process can be parallelized using the parallel setting. The data is first transformed into what is termed an MD plot (similar to the MA plot/Bland-Altman plot). M is the log difference log2(x/y) between the two datasets. D is the unit distance in the contact matrix. The MD plot can be visualized with the Plot option. Loess regression is then performed on the MD plot to model any biases between the two Hi-C datasets. An adjusted IF is then calculated for each dataset, along with an adjusted M. See the methods section of Stansfield & Dozmorov 2017 for more details. Note: if you receive the warning "In simpleLoess(y, x, w, span, degree = degree, parametric = parametric, ... :pseudoinverse used..." it should not affect your results; however, it can be avoided by manually setting the span to a larger value using the span option.
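The MD transformation and loess correction described above can be sketched in base R on simulated data. This is an illustration under stated assumptions, not the package's implementation: the simulated interaction frequencies, the fixed span of 0.5, the direction of the ratio (M = log2(IF2/IF1)), and splitting the fitted correction evenly between the two datasets are all choices made here for the example.

```r
set.seed(1)
n <- 500

# D: unit distance from the diagonal of the contact matrix (simulated)
D <- sample(0:50, n, replace = TRUE)

# Simulated interaction frequencies; IF2 carries a distance-dependent bias
IF1 <- rpois(n, lambda = 200 / (D + 1)) + 1
IF2 <- IF1 * 2^(0.02 * D + rnorm(n, sd = 0.2))

# M: log2 ratio between the two datasets (direction is an assumption)
M <- log2(IF2 / IF1)

# Fit loess on the MD plot; the fitted curve is the correction factor mc
fit <- loess(M ~ D, degree = 1, span = 0.5)
mc <- predict(fit, newdata = data.frame(D = D))

# Assumed adjustment: split the correction evenly on the log2 scale
adj.IF1 <- IF1 * 2^(mc / 2)
adj.IF2 <- IF2 / 2^(mc / 2)
adj.M <- log2(adj.IF2 / adj.IF1)   # equals M - mc

# The adjusted M should be centred near zero across distances
c(before = mean(M), after = mean(adj.M))
```

With the correction split this way, adj.M is simply M minus the fitted loess curve, so any distance-dependent trend captured by the fit is removed from the MD plot.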


An updated hic.table is returned with the following additional columns: adj.IF1 and adj.IF2 for the respective normalized IFs, adj.M for the adjusted M, mc for the loess correction factor, and A for the average of adj.IF1 and adj.IF2.


# Create hic.table object using included Hi-C data in sparse upper
# triangular matrix format
library(HiCcompare)
data("HMEC.chr22")
data("NHEK.chr22")
hic.table <- create.hic.table(HMEC.chr22, NHEK.chr22, chr = 'chr22')
# Plug hic.table into hic_loess()
result <- hic_loess(hic.table, Plot = TRUE)
# View result
result

dozmorovlab/HiCcompare documentation built on June 30, 2023, 3:09 a.m.