Models Hi-C contacts using (robust) Negative Binomial regression, or Poisson regression when the data are underdispersed. Because Hi-C data suffer from contact decay bias, this method is intended to model each diagonal separately. By default, the function uses robust Negative Binomial regression to model interaction dependencies.
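To illustrate what "modelling each diagonal separately" means, here is a minimal sketch in base R. The long-format table and its column names (i, j, count) are hypothetical and are not taken from the package; the sketch only shows how contacts could be grouped by genomic distance before a model is fitted to each group.

```r
# Hypothetical long-format contact table: bin indices i, j and a raw count.
# Column names are illustrative assumptions, not the package's data format.
contacts <- data.frame(
  i     = c(1, 1, 1, 2, 2, 3),
  j     = c(1, 2, 3, 2, 3, 3),
  count = c(50, 20, 5, 45, 18, 52)
)

# The diagonal of a contact is the distance (in bins) between its two loci.
contacts$diagonal <- abs(contacts$j - contacts$i)

# One data frame per diagonal; each could then be modelled on its own.
by_diagonal <- split(contacts, contacts$diagonal)

# Number of contacts and mean count per diagonal.
sizes      <- vapply(by_diagonal, nrow, integer(1))
mean_count <- tapply(contacts$count, contacts$diagonal, mean)
```

In this toy table the mean count drops as the diagonal index grows, which is the contact decay that motivates fitting a separate model per diagonal.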
constructGLM(
  df,
  robust.nb = TRUE,
  overdisp.test.pval = 0.01,
  max.nobs = 20000,
  nrep = 10
)
df | data frame with predictor, response and outlier columns
robust.nb | logical; whether to use the robust fitting procedure (see details)
overdisp.test.pval | numeric; significance threshold for the overdispersion test
max.nobs | numeric; maximum number of observations (points), i.e. the sample size used for robust NB regression estimation (see details)
nrep | numeric; number of repetitions for subsampling (see details)
If robust.nb is TRUE, this function uses the robust Negative Binomial estimation method developed by Aeberhard et al. (2014). It relies on the code of the glmrob.nb function written by William Aeberhard, which is available at: https://github.com/williamaeberhard/glmrob.nb.
First, an overdispersion test is performed to decide whether Negative Binomial or Poisson regression should be used. If robust.nb is TRUE, the estimation may consume large amounts of memory for large sample sizes (for example, 400000 points). To prevent this, whenever the sample size exceeds max.nobs, the data are subsampled to max.nobs observations and the model is estimated on the subsample. This procedure is repeated nrep times, and the final parameter estimate is the average of the subsample estimates.
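The subsample-and-average procedure described above can be sketched as follows. This is not the package's implementation: the ordinary maximum likelihood fit from MASS::glm.nb stands in for the robust glmrob.nb estimator, the data are simulated, sampling without replacement is an assumption, and the helper fit_subsample and the column names x and y are hypothetical.

```r
library(MASS)  # glm.nb; used here as a non-robust stand-in for glmrob.nb

set.seed(1)

# Simulated overdispersed counts (illustrative data, not Hi-C contacts).
n  <- 5000
x  <- runif(n)
df <- data.frame(x = x, y = rnbinom(n, mu = exp(1 + 0.5 * x), size = 2))

max.nobs <- 1000  # cap on the sample size used for one fit
nrep     <- 5     # number of subsampling repetitions

# Hypothetical helper: fit the model on one random subsample and return
# the regression coefficients together with the dispersion parameter.
fit_subsample <- function(df, max.nobs) {
  # Assumption: subsample without replacement.
  idx <- sample.int(nrow(df), size = min(max.nobs, nrow(df)))
  fit <- glm.nb(y ~ x, data = df[idx, ])
  c(coef(fit), theta = fit$theta)
}

# One column of estimates per repetition.
estimates <- replicate(nrep, fit_subsample(df, max.nobs))

# Final estimate: average over the subsample estimates.
avg <- rowMeans(estimates)
```

Each fit only ever sees max.nobs rows, so memory use is bounded by the subsample size rather than by the full data set, at the cost of nrep separate fits.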
object of class glm or MASS::glm.nb
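Aeberhard, W. H., Cantoni, E. and Heritier, S. (2014). Robust inference in the negative binomial regression model with an application to falls data. Biometrics.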
glm, glm.nb to see how GLMs are constructed; dispersiontest to see how overdispersion is tested
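As a rough illustration of how an overdispersion test can drive the choice between Poisson and Negative Binomial regression, the sketch below hand-rolls a regression-based test in base R. It assumes that dispersiontest refers to the function in the AER package and follows the same general idea (a z-test on whether the variance exceeds the mean under a Poisson fit), but it is not guaranteed to match that function or what constructGLM does internally. The function name overdispersion_pvalue and the simulated data are hypothetical.

```r
# Hypothetical helper: one-sided z-test for overdispersion of a Poisson GLM.
# Under the Poisson assumption, E[(y - mu)^2 - y] = 0 for every observation.
overdispersion_pvalue <- function(fit) {
  y   <- fit$y
  mu  <- fitted(fit)
  aux <- ((y - mu)^2 - y) / mu
  z   <- sqrt(length(aux)) * mean(aux) / sd(aux)
  pnorm(z, lower.tail = FALSE)  # small p-value suggests overdispersion
}

set.seed(42)
n  <- 2000
x  <- runif(n)
mu <- exp(1 + 0.5 * x)

y_pois <- rpois(n, mu)                  # equidispersed counts
y_nb   <- rnbinom(n, mu = mu, size = 1) # strongly overdispersed counts

# Fit a Poisson GLM to both responses and test each for overdispersion.
p_pois <- overdispersion_pvalue(glm(y_pois ~ x, family = poisson))
p_nb   <- overdispersion_pvalue(glm(y_nb ~ x, family = poisson))

# Decision rule, with the threshold playing the role of overdisp.test.pval.
overdisp.test.pval <- 0.01
choose_model <- function(p, threshold) {
  if (p < threshold) "negative binomial" else "poisson"
}
model_for_nb_data <- choose_model(p_nb, overdisp.test.pval)
```

A lower threshold makes the procedure more reluctant to leave the Poisson model, so overdisp.test.pval controls how much evidence of overdispersion is required before the Negative Binomial fit is used.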