hic_loess | R Documentation |
Perform joint loess normalization on two Hi-C datasets
hic_loess(
hic.table,
degree = 1,
span = NA,
loess.criterion = "gcv",
Plot = FALSE,
Plot.smooth = TRUE,
parallel = FALSE,
BP_param = bpparam()
)
hic.table |
A hic.table or a list of hic.tables generated by the create.hic.table function. To normalize multiple chromosomes from each cell line at once using parallel computing, supply a list of hic.tables and set parallel = TRUE. |
degree |
Degree of polynomial to be used for loess. Options are 0, 1, 2. The default setting is degree = 1. |
span |
User-set span for loess. If NA (the default), the span is selected automatically according to loess.criterion. If you already know the span, setting it manually significantly reduces computation time. |
loess.criterion |
Criterion for automatic span selection. Either 'gcv' for generalized cross-validation or 'aicc' for the corrected Akaike Information Criterion.
Span selection uses a slightly modified version of the |
Plot |
Logical, should the MD plot showing before/after loess normalization be output? Defaults to FALSE. |
Plot.smooth |
Logical, defaults to TRUE indicating the MD plot will be a smooth scatter plot. Set to FALSE for a scatter plot with discrete points. |
parallel |
Logical, set to TRUE to utilize the |
BP_param |
Parameters for BiocParallel. Defaults to bpparam(), see help for BiocParallel for more information http://bioconductor.org/packages/release/bioc/vignettes/BiocParallel/inst/doc/Introduction_To_BiocParallel.pdf |
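The automatic span selection described under loess.criterion can be illustrated with a minimal base R sketch. It is not the package's internal routine: the synthetic data, the helper name gcv_score, and the candidate span grid are assumptions made for illustration. It scores each candidate span by generalized cross-validation and keeps the span with the lowest score.

```r
# Illustrative sketch only: base R on synthetic data, not HiCcompare's code.
set.seed(42)
n <- 300
D <- runif(n, 0, 50)                  # hypothetical distances
M <- 0.02 * D + rnorm(n, sd = 0.3)    # hypothetical log2 differences

# Hypothetical helper: GCV score of a loess fit for a given span.
# GCV = n * RSS / (n - tr(L))^2, where tr(L) is the trace of the smoother matrix.
gcv_score <- function(span, x, y, degree = 1) {
  fit <- loess(y ~ x, span = span, degree = degree)
  rss <- sum(residuals(fit)^2)
  fit$n * rss / (fit$n - fit$trace.hat)^2
}

spans <- seq(0.2, 0.9, by = 0.1)      # assumed candidate grid
scores <- vapply(spans, function(s) gcv_score(s, D, M), numeric(1))
best_span <- spans[which.min(scores)]
best_span
```

A grid search like this refits loess once per candidate span, which is why supplying a known span directly is faster.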
The function takes a hic.table or a list of hic.table objects created
with the create.hic.table function. To jointly normalize Hi-C data for
multiple chromosomes, use a list of hic.tables; the process can be
parallelized with the parallel setting. The data are first transformed into
what is termed an MD plot (similar to the MA plot/Bland-Altman plot). M is the
log difference log2(x/y) between the two datasets. D is the unit distance in
the contact matrix. The MD plot can be visualized with the Plot option. Loess
regression is then performed on the MD plot to model any biases between the
two Hi-C datasets. An adjusted IF is then calculated for each dataset, along
with an adjusted M. See the methods section of Stansfield & Dozmorov 2017 for
more details. Note: if you receive the warning "In simpleLoess(y, x, w, span,
degree = degree, parametric = parametric, ... :pseudoinverse used...", it
should not affect your results; however, it can be avoided by manually setting
the span to a larger value using the span option.
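The MD transformation and loess correction described above can be sketched in base R on synthetic data. This is an illustration under stated assumptions, not the package's implementation: the simulated counts, the choice of IF2/IF1 as the ratio inside the log, the fixed span of 0.5, and the even split of the correction between the two datasets are all assumptions here. The variable names mc, adj.M, adj.IF1 and adj.IF2 mirror the output columns only for readability.

```r
# Illustrative sketch only: base R on synthetic data, not HiCcompare's code.
set.seed(1)
n <- 500
D <- sample(0:50, n, replace = TRUE)            # unit distance between regions
IF1 <- rpois(n, lambda = 200 / (D + 1)) + 1     # dataset 1 interaction frequencies
# Dataset 2 carries an artificial distance-dependent bias
IF2 <- rpois(n, lambda = (200 / (D + 1)) * 2^(0.02 * D)) + 1

md <- data.frame(D = D, M = log2(IF2 / IF1))    # assumed ordering: log2(IF2/IF1)

# Model the distance-dependent bias in M with a local linear loess fit
fit <- loess(M ~ D, data = md, degree = 1, span = 0.5)
mc <- predict(fit)                              # fitted bias at each observed D

# Remove the fitted bias from M, splitting it evenly on the log2 scale
adj.M <- md$M - mc
adj.IF1 <- 2^(log2(IF1) + mc / 2)
adj.IF2 <- 2^(log2(IF2) - mc / 2)

# The adjusted IFs reproduce the adjusted M
all.equal(log2(adj.IF2 / adj.IF1), adj.M)
```

Splitting the correction symmetrically means neither dataset is treated as the reference, which is the point of normalizing the two datasets jointly.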
An updated hic.table is returned with the additional columns adj.IF1 and adj.IF2 for the respective normalized IFs, adj.M for the adjusted M, mc for the loess correction factor, and A for the average of adj.IF1 and adj.IF2.
# Create hic.table object using included Hi-C data in sparse upper
# triangular matrix format
data("HMEC.chr22")
data("NHEK.chr22")
hic.table <- create.hic.table(HMEC.chr22, NHEK.chr22, chr = 'chr22')
# Plug hic.table into hic_loess()
result <- hic_loess(hic.table, Plot = TRUE)
# View result
result
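The list input and parallel setting described in the arguments can be exercised along the following lines. This is a usage sketch, not part of the original example: it assumes the HiCcompare package is installed and that the chr10 datasets (HMEC.chr10, NHEK.chr10) ship with it alongside the chr22 data. The requireNamespace guard is there only so the snippet does nothing when the package is missing.

```r
# Usage sketch; the body runs only if HiCcompare is installed.
normalized <- NULL
if (requireNamespace("HiCcompare", quietly = TRUE)) {
  library(HiCcompare)
  data("HMEC.chr22")
  data("NHEK.chr22")
  data("HMEC.chr10")   # assumed to be bundled with the package
  data("NHEK.chr10")   # assumed to be bundled with the package

  # One hic.table per chromosome, collected in a list
  chr22.table <- create.hic.table(HMEC.chr22, NHEK.chr22, chr = "chr22")
  chr10.table <- create.hic.table(HMEC.chr10, NHEK.chr10, chr = "chr10")
  hic.list <- list(chr10.table, chr22.table)

  # Normalize each chromosome's table, distributing the work with BiocParallel
  normalized <- hic_loess(hic.list, parallel = TRUE)
}
```

If the span is already known, passing it through the span argument in the same call skips the automatic selection for every table in the list.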