estimate_idr: Estimates IDR for Genomic Peaks or Genomic Interactions

estimate_idr    R Documentation

Description

Estimates the irreproducible discovery rate (IDR) for genomic peaks or genomic interactions observed in two replicates.

Usage

estimate_idr(
  rep1_df,
  rep2_df,
  analysis_type = "IDR2D",
  value_transformation = c("identity", "additive_inverse", "multiplicative_inverse",
    "log", "log_additive_inverse"),
  ambiguity_resolution_method = c("overlap", "midpoint", "value"),
  remove_nonstandard_chromosomes = TRUE,
  max_factor = 1.5,
  jitter_factor = 1e-04,
  max_gap = -1L,
  mu = 0.1,
  sigma = 1,
  rho = 0.2,
  p = 0.5,
  eps = 0.001,
  max_iteration = 30,
  local_idr = TRUE
)

Arguments

rep1_df

data frame of observations (i.e., genomic peaks or genomic interactions) of replicate 1. If analysis_type is IDR1D, the columns of rep1_df are described in establish_bijection1d, otherwise in establish_bijection2d

rep2_df

data frame of observations (i.e., genomic peaks or genomic interactions) of replicate 2. Same columns as rep1_df.

analysis_type

"IDR2D" for genomic interaction data sets, "IDR1D" for genomic peak data sets

value_transformation

transformation applied to the values in x so that, when sorted in descending order, the more significant interactions appear at the top of the list. If the values in x are p-values, "log_additive_inverse" is recommended. The following transformations are supported:

"identity"                  no transformation is performed on x
"additive_inverse"          x. = -x
"multiplicative_inverse"    x. = 1 / x
"log"                       x. = log(x). Note: zeros are replaced by .Machine$double.xmin
"log_additive_inverse"      x. = -log(x), recommended if x are p-values. Note: zeros are replaced by .Machine$double.xmin

The right choice depends on the ordering of the value column: either "ascending" (more significant interactions have lower values in the value column) or "descending" (more significant interactions have higher values in the value column).
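The supported transformations can be sketched in a few lines of base R. This is an illustration only: transform_values is a hypothetical helper written for this example and is not a function exported by idr2d, and the package's internal implementation may differ.

```r
# Hypothetical helper (not part of idr2d): applies one of the documented
# transformations to a numeric vector x.
transform_values <- function(x, value_transformation = "identity") {
  # Zeros are replaced before taking logarithms, as the notes above describe
  if (value_transformation %in% c("log", "log_additive_inverse")) {
    x[x == 0] <- .Machine$double.xmin
  }
  switch(value_transformation,
    identity = x,
    additive_inverse = -x,
    multiplicative_inverse = 1 / x,
    log = log(x),
    log_additive_inverse = -log(x),
    stop("unknown value_transformation: ", value_transformation)
  )
}

# Made-up p-values, including an exact zero
p_values <- c(0.5, 1e-8, 0.01, 0)
transformed <- transform_values(p_values, "log_additive_inverse")

# Sorting in descending order now puts the smallest p-value first
order(transformed, decreasing = TRUE)
```

With p-values, "log_additive_inverse" reverses the ordering, so the most significant observation ends up with the largest transformed value.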

ambiguity_resolution_method

defines how ambiguous assignments (when one interaction or peak in replicate 1 overlaps with multiple interactions or peaks in replicate 2 or vice versa) are resolved. For available methods, see establish_overlap1d or establish_overlap2d, respectively.

remove_nonstandard_chromosomes

removes peaks and interactions containing genomic locations on non-standard chromosomes using keepStandardChromosomes (default is TRUE)

max_factor

numeric; controls the replacement values for Inf and -Inf. Inf is replaced by max(x) * max_factor and -Inf is replaced by min(x) / max_factor.
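A minimal sketch of this replacement rule follows. It assumes that max(x) and min(x) are taken over the finite values only, which this page does not state explicitly; replace_infinite is a hypothetical helper, not a function of idr2d.

```r
# Hypothetical helper (not part of idr2d): replaces infinite values using
# the rule described for max_factor.
replace_infinite <- function(x, max_factor = 1.5) {
  # Assumption: maximum and minimum are computed from finite values only
  finite_x <- x[is.finite(x)]
  x[is.infinite(x) & x > 0] <- max(finite_x) * max_factor
  x[is.infinite(x) & x < 0] <- min(finite_x) / max_factor
  x
}

# Made-up values: Inf becomes 10 * 1.5, -Inf becomes 2 / 1.5
replaced <- replace_infinite(c(2, 10, Inf, -Inf, 4))
replaced
```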

jitter_factor

numeric; controls the magnitude of the noise that is added to x. This is done to break ties in x. Set jitter_factor = NULL for no jitter.
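The effect of tie-breaking noise can be illustrated with base R's jitter(). Whether idr2d uses this exact function internally is an assumption made for this sketch; the values below are made up.

```r
set.seed(1)  # only to make the illustration reproducible

x <- c(3, 3, 3, 7, 7)      # made-up values with ties
jitter_factor <- 1e-04

# Assumption: noise is added with base R's jitter(); NULL disables it
x_jittered <- if (is.null(jitter_factor)) {
  x
} else {
  jitter(x, factor = jitter_factor)
}

anyDuplicated(x) > 0            # ties are present before jittering
anyDuplicated(x_jittered) > 0   # ties are broken afterwards
max(abs(x_jittered - x))        # the perturbation stays very small
```

The noise is small relative to the spacing of the values, so the ranking of distinct values is preserved while tied values receive distinct ranks.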

max_gap

integer; maximum gap in nucleotides allowed between two anchors for them to be considered as overlapping (defaults to -1, i.e., overlapping anchors)
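One way to read max_gap is sketched below for closed, 1-based intervals. anchors_overlap is a hypothetical helper for illustration and is not how idr2d necessarily computes overlaps; the coordinates are made up.

```r
# Hypothetical helper (not part of idr2d): tests whether two anchors
# overlap, allowing up to max_gap nucleotides between them.
anchors_overlap <- function(start1, end1, start2, end2, max_gap = -1L) {
  # Number of nucleotides strictly between the two intervals;
  # negative when the intervals share at least one position
  gap <- pmax(start1, start2) - pmin(end1, end2) - 1L
  gap <= max_gap
}

anchors_overlap(100L, 200L, 150L, 250L)                 # shared positions
anchors_overlap(100L, 200L, 201L, 300L)                 # adjacent, gap of 0
anchors_overlap(100L, 200L, 251L, 300L, max_gap = 50L)  # gap of 50
```

Under this reading, the default of -1 accepts only anchors that share at least one nucleotide, while max_gap = 0 would also accept directly adjacent anchors.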

mu

a starting value for the mean of the reproducible component.

sigma

a starting value for the standard deviation of the reproducible component.

rho

a starting value for the correlation coefficient of the reproducible component.

p

a starting value for the proportion of reproducible component.

eps

numeric; stopping criterion. Iterations stop when the increase in log-likelihood is less than eps * log-likelihood (defaults to 0.001).
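The stopping rule can be sketched as a relative-change test on successive log-likelihood values. Using the absolute value of the new log-likelihood as the scale is an assumption of this sketch (log-likelihoods are often negative); has_converged and the trace below are hypothetical and not taken from idr2d.

```r
# Hypothetical helper (not part of idr2d): TRUE when the increase in
# log-likelihood falls below eps times the log-likelihood.
has_converged <- function(loglik_old, loglik_new, eps = 0.001) {
  # Assumption: abs() of the new value serves as the scale
  (loglik_new - loglik_old) < eps * abs(loglik_new)
}

# Made-up log-likelihood trace over five iterations
loglik_trace <- c(-1500, -1200, -1100, -1090, -1089.5)

converged <- has_converged(head(loglik_trace, -1), tail(loglik_trace, -1))
converged

# First iteration at which the criterion is met
which(converged)[1] + 1
```

In practice, iteration also ends once max_iteration is reached, whichever happens first.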

max_iteration

integer; maximum number of iterations for IDR estimation (defaults to 30)

local_idr

see est.IDR

Value

See estimate_idr1d or estimate_idr2d, respectively.

References

Q. Li, J. B. Brown, H. Huang and P. J. Bickel. (2011) Measuring reproducibility of high-throughput experiments. Annals of Applied Statistics, Vol. 5, No. 3, 1752-1779.

Examples

idr_results <- estimate_idr(idr2d:::chiapet$rep1_df,
                            idr2d:::chiapet$rep2_df,
                            analysis_type = "IDR2D",
                            value_transformation = "log_additive_inverse")
summary(idr_results)


kkrismer/idr2d documentation built on Feb. 7, 2024, 2:23 p.m.