estimate_idr2d_hic: Estimates IDR for Genomic Interactions measured by Hi-C experiments

View source: R/hic.R

estimate_idr2d_hic R Documentation

Estimates IDR for Genomic Interactions measured by Hi-C experiments

Description

This method estimates Irreproducible Discovery Rates (IDR) of genomic interactions between two replicates of Hi-C experiments.

Before calling this method, parse the Juicer .hic contact matrix with parse_juicer_matrix, or the HiC-Pro .matrix and .bed files with parse_hic_pro_matrix.

The contact matrix is subdivided into blocks, where the block size is determined by the resolution. Blocks are ranked by their read counts, and each block is matched to its counterpart in the other replicate by genomic location.
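
As a rough illustration of the matching and ranking step, the following base R sketch pairs blocks from two replicates by their location key and ranks them by read count. The toy data are hypothetical, the column names are borrowed from the Value section, and giving rank 1 to the highest read count is an assumption made for this example; it is not a copy of the package's internal code.

```r
# Hypothetical toy data: one row per contact-matrix block, keyed by location.
rep1 <- data.frame(
  interaction = c("chr1:0-1000000", "chr1:1000000-2000000",
                  "chr1:2000000-3000000"),
  value = c(120, 15, 60),
  stringsAsFactors = FALSE
)
rep2 <- data.frame(
  interaction = c("chr1:1000000-2000000", "chr1:0-1000000",
                  "chr1:2000000-3000000"),
  value = c(10, 140, 75),
  stringsAsFactors = FALSE
)

# Match replicate blocks by genomic location (the shared key column).
matched <- merge(rep1, rep2, by = "interaction", suffixes = c("", "_rep"))
names(matched)[names(matched) == "value_rep"] <- "rep_value"

# Rank blocks within each replicate by read count.
# Assumption for this sketch: the highest count receives rank 1.
matched$rank <- rank(-matched$value, ties.method = "first")
matched$rep_rank <- rank(-matched$rep_value, ties.method = "first")

print(matched)
```

Because both replicates are binned at the same resolution, the location string is enough to pair blocks; no overlap search is needed.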

Usage

estimate_idr2d_hic(
  rep1_df,
  rep2_df,
  combined_min_value = 30,
  combined_max_value = Inf,
  min_value = -Inf,
  max_value = Inf,
  max_factor = 1.5,
  jitter_factor = 1e-04,
  mu = 0.1,
  sigma = 1,
  rho = 0.2,
  p = 0.5,
  eps = 0.001,
  max_iteration = 30,
  local_idr = TRUE
)

Arguments

rep1_df

data frame of either parsed .hic file from Juicer (output of parse_juicer_matrix) or parsed .matrix and .bed files from HiC-Pro (output of parse_hic_pro_matrix) for replicate 1

rep2_df

data frame of either parsed .hic file from Juicer (output of parse_juicer_matrix) or parsed .matrix and .bed files from HiC-Pro (output of parse_hic_pro_matrix) for replicate 2

combined_min_value

exclude blocks with a combined (replicate 1 + replicate 2) read count or normalized read count of less than combined_min_value (default is 30 reads, set combined_min_value = -Inf to disable)

combined_max_value

exclude blocks with a combined (replicate 1 + replicate 2) read count or normalized read count of more than combined_max_value (disabled by default, set combined_max_value = Inf to disable)

min_value

exclude blocks with a read count or normalized read count of less than min_value in one replicate (disabled by default, set min_value = -Inf to disable)

max_value

exclude blocks with a read count or normalized read count of more than max_value in one replicate (disabled by default, set max_value = Inf to disable)

max_factor

numeric; controls the replacement values for Inf and -Inf. Inf values are replaced by max(x) * max_factor and -Inf values are replaced by min(x) / max_factor.

jitter_factor

numeric; controls the magnitude of the noise that is added to x. This is done to break ties in x. Set jitter_factor = NULL for no jitter.

mu

a starting value for the mean of the reproducible component.

sigma

a starting value for the standard deviation of the reproducible component.

rho

a starting value for the correlation coefficient of the reproducible component.

p

a starting value for the proportion of reproducible component.

eps

stopping criterion; iterations stop when the increment of the log-likelihood is less than eps * log-likelihood (defaults to 0.001)

max_iteration

integer; maximum number of iterations for IDR estimation (defaults to 30)

local_idr

see est.IDR
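
The filtering, max_factor and jitter_factor arguments can be pictured with a short base R sketch. It is a minimal illustration under stated assumptions, not the package's implementation: the data are made up, replace_infinite is a hypothetical helper written for this example, and the use of base R's jitter() for tie breaking is assumed here rather than confirmed by this page.

```r
# Hypothetical matched blocks; values are made up for illustration.
blocks <- data.frame(
  value     = c(5, 40, 12, 300),
  rep_value = c(3, 35, 30, 280)
)

# Thresholds set to the defaults shown in the Usage section.
combined_min_value <- 30
combined_max_value <- Inf
min_value <- -Inf
max_value <- Inf

# Keep blocks whose combined and per-replicate counts lie within the limits.
combined <- blocks$value + blocks$rep_value
keep <- combined >= combined_min_value & combined <= combined_max_value &
  blocks$value >= min_value & blocks$rep_value >= min_value &
  blocks$value <= max_value & blocks$rep_value <= max_value
filtered <- blocks[keep, ]

# Hypothetical helper: replace Inf by max(x) * max_factor and
# -Inf by min(x) / max_factor, using the finite values of x.
replace_infinite <- function(x, max_factor = 1.5) {
  finite <- x[is.finite(x)]
  x[is.infinite(x) & x > 0] <- max(finite) * max_factor
  x[is.infinite(x) & x < 0] <- min(finite) / max_factor
  x
}
replaced <- replace_infinite(c(2, Inf, 8, -Inf), max_factor = 1.5)

# Break ties by adding a small amount of noise (assumption: base jitter()).
set.seed(1)
tied <- c(10, 10, 10, 20)
jittered <- jitter(tied, factor = 1e-04)

print(filtered)
print(replaced)
print(jittered)
```

With the default thresholds only the first block is dropped, because its combined count of 8 is below 30. The jitter is tiny relative to the data, so it changes the order of tied values only.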

Value

Data frame with the following columns:

column 1: "interaction" character; genomic location of interaction block (e.g., "chr1:204940000-204940000")
column 2: "value" numeric; p-value, FDR, or heuristic used to rank the interactions
column 3: "rep_value" numeric; value of corresponding replicate interaction
column 4: "rank" integer; rank of the interaction, established by value column, ascending order
column 5: "rep_rank" integer; rank of corresponding replicate interaction
column 6: "idr" numeric; IDR of the block and the corresponding block in the other replicate
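
A typical next step is to keep only blocks below an IDR cutoff. The sketch below builds a hypothetical data frame with the columns listed above (all values are made up) and filters it with base R; the cutoff of 0.05 is an arbitrary choice for illustration, not a recommendation from the package.

```r
# Hypothetical result with the documented columns; values are made up.
idr_results_df <- data.frame(
  interaction = c("chr1:0-1000000", "chr1:1000000-2000000",
                  "chr1:2000000-3000000"),
  value = c(120, 15, 60),
  rep_value = c(140, 10, 75),
  rank = c(1L, 3L, 2L),
  rep_rank = c(1L, 3L, 2L),
  idr = c(0.01, 0.40, 0.03),
  stringsAsFactors = FALSE
)

# Arbitrary cutoff chosen for this example.
idr_threshold <- 0.05

# Keep reproducible blocks and sort them from most to least reproducible.
reproducible <- idr_results_df[idr_results_df$idr < idr_threshold, ]
reproducible <- reproducible[order(reproducible$idr), ]

print(reproducible)
```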

References

Q. Li, J. B. Brown, H. Huang and P. J. Bickel. (2011) Measuring reproducibility of high-throughput experiments. Annals of Applied Statistics, Vol. 5, No. 3, 1752-1779.

Examples

idr_results_df <- estimate_idr2d_hic(idr2d:::hic$rep1_df,
                                     idr2d:::hic$rep2_df)
summary(idr_results_df)


kkrismer/idr2d documentation built on Feb. 7, 2024, 2:23 p.m.