estimate_idr2d | R Documentation |
This method estimates Irreproducible Discovery Rates (IDR) between two replicates of experiments identifying genomic interactions, such as Hi-C, ChIA-PET, and HiChIP.
estimate_idr2d(
  rep1_df,
  rep2_df,
  value_transformation = c("identity", "additive_inverse", "multiplicative_inverse",
    "log", "log_additive_inverse"),
  ambiguity_resolution_method = c("overlap", "midpoint", "value"),
  remove_nonstandard_chromosomes = TRUE,
  max_factor = 1.5,
  jitter_factor = 1e-04,
  max_gap = -1L,
  mu = 0.1,
  sigma = 1,
  rho = 0.2,
  p = 0.5,
  eps = 0.001,
  max_iteration = 30,
  local_idr = TRUE
)
rep1_df |
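data frame of observations (i.e., genomic interactions) of replicate 1, with at least the following columns (the position of the columns matters; column names are irrelevant): chromosome, start, and end of anchor A (columns 1 to 3), chromosome, start, and end of anchor B (columns 4 to 6), and the value used to rank the interactions (column 7).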
rep2_df |
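data frame of observations (i.e., genomic interactions) of replicate 2, with the following columns (the position of the columns matters; column names are irrelevant): chromosome, start, and end of anchor A (columns 1 to 3), chromosome, start, and end of anchor B (columns 4 to 6), and the value used to rank the interactions (column 7).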
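As an illustration of the expected input layout, the following sketch builds a small replicate data frame in base R with seven positional columns (anchor A chromosome, start, and end; anchor B chromosome, start, and end; value). All coordinates and values are invented, and the column names are chosen only for readability, since columns are matched by position.

```r
# Toy replicate with the seven positional columns;
# every coordinate and value below is made up for illustration.
rep1_df <- data.frame(
  chr_a   = c("chr1", "chr1", "chr3"),
  start_a = c(1000L, 52000L, 7500L),
  end_a   = c(2000L, 53000L, 8500L),
  chr_b   = c("chr1", "chr1", "chr3"),
  start_b = c(90000L, 150000L, 64000L),
  end_b   = c(91000L, 151000L, 65000L),
  value   = c(1e-8, 3e-4, 0.02),  # e.g., p-values
  stringsAsFactors = FALSE
)

# Inspect column order and types before passing the data frame on.
str(rep1_df)
```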
value_transformation |
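transformation applied to the values in the value column; one of "identity", "additive_inverse", "multiplicative_inverse", "log", or "log_additive_inverse"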
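The option names suggest simple element-wise transformations. The sketch below shows one plausible reading of them in base R. The helper `transform_value` is hypothetical and not part of idr2d, the formulas are inferred from the option names alone, and any special handling of zeros or infinite values by the package is not reproduced here.

```r
# Hypothetical helper; formulas are inferred from the option names only.
transform_value <- function(x, method = c("identity", "additive_inverse",
                                          "multiplicative_inverse", "log",
                                          "log_additive_inverse")) {
  method <- match.arg(method)
  switch(method,
    identity               = x,
    additive_inverse       = -x,
    multiplicative_inverse = 1 / x,
    log                    = log(x),
    log_additive_inverse   = -log(x)
  )
}

p_values <- c(1e-8, 3e-4, 0.02)
transformed <- transform_value(p_values, "log_additive_inverse")

# After the transformation, smaller p-values correspond to larger values.
order(transformed, decreasing = TRUE)
```

With p-values as input, a negative log turns the most significant interactions into the largest values, which is why the example call in this documentation uses "log_additive_inverse".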
ambiguity_resolution_method |
defines how ambiguous assignments (when one interaction in replicate 1 overlaps with multiple interactions in replicate 2 or vice versa) are resolved. Available methods: "overlap", "midpoint", and "value".
remove_nonstandard_chromosomes |
logical; removes interactions containing genomic locations on non-standard chromosomes (defaults to TRUE)
max_factor |
numeric; controls the replacement values for
jitter_factor |
numeric; controls the magnitude of the noise that is added to
max_gap |
integer; maximum gap in nucleotides allowed between two anchors for them to be considered as overlapping (defaults to -1, i.e., overlapping anchors)
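To make the gap semantics concrete, here is a minimal base R sketch of an overlap test for two anchors. The helper `anchors_overlap` is hypothetical and not part of idr2d. It assumes 1-based, closed intervals on the same chromosome and defines the gap as the number of nucleotides strictly between the two anchors, so that anchors sharing at least one position have a gap of -1 or less.

```r
# Hypothetical helper, not part of idr2d. Assumes 1-based, closed intervals
# on the same chromosome. The gap counts the nucleotides strictly between
# the two anchors and is negative when the anchors share positions.
anchors_overlap <- function(start_1, end_1, start_2, end_2, max_gap = -1L) {
  gap <- pmax(start_1, start_2) - pmin(end_1, end_2) - 1L
  gap <= max_gap
}

anchors_overlap(100L, 200L, 150L, 250L)                 # anchors share positions
anchors_overlap(100L, 200L, 201L, 300L)                 # adjacent anchors, gap of 0
anchors_overlap(100L, 200L, 201L, 300L, max_gap = 0L)   # adjacency now allowed
anchors_overlap(100L, 200L, 251L, 300L, max_gap = 50L)  # gap of 50 nucleotides
```

Under these assumptions, the default of -1 accepts only anchors that truly overlap, while larger values also accept nearby anchors.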
mu |
a starting value for the mean of the reproducible component.
sigma |
a starting value for the standard deviation of the reproducible component.
rho |
a starting value for the correlation coefficient of the reproducible component.
p |
a starting value for the proportion of the reproducible component.
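The four starting values describe a two-component mixture in the spirit of Li et al. (2011): a reproducible component with mean mu, standard deviation sigma, and correlation rho that makes up a proportion p of the observations, and an irreproducible component. The base R sketch below simulates such data. The helper `simulate_mixture` is hypothetical, and modelling the irreproducible component as independent standard normals is an assumption of this sketch.

```r
# Hypothetical simulation helper; not part of idr2d.
simulate_mixture <- function(n, mu = 0.1, sigma = 1, rho = 0.2, p = 0.5) {
  reproducible <- runif(n) < p
  e1 <- rnorm(n)
  e2 <- rnorm(n)
  # Irreproducible component (assumption): independent standard normals.
  z1 <- e1
  z2 <- e2
  # Reproducible component: mean mu, sd sigma, correlation rho.
  k <- which(reproducible)
  z1[k] <- mu + sigma * e1[k]
  z2[k] <- mu + sigma * (rho * e1[k] + sqrt(1 - rho^2) * e2[k])
  data.frame(z1 = z1, z2 = z2, reproducible = reproducible)
}

set.seed(1)
sim <- simulate_mixture(10000, mu = 2.5, sigma = 1, rho = 0.8, p = 0.3)

# Empirical proportion and correlation of the reproducible component.
mean(sim$reproducible)
with(sim[sim$reproducible, ], cor(z1, z2))
```

Simulated data of this kind can help build intuition for how the starting values relate to the shape of the data; the estimation itself then refines them iteratively.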
eps |
stopping criterion; iterations stop when the increment of the log-likelihood is less than eps * log-likelihood (defaults to 0.001)
max_iteration |
integer; maximum number of iterations for IDR estimation (defaults to 30)
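The interplay of eps and max_iteration can be sketched as a relative-improvement check with an iteration cap. The helper `run_until_converged` is hypothetical and does not reproduce the package's internal code. Taking absolute values is an assumption made here so that negative log-likelihoods behave sensibly, and the log-likelihood trace is invented.

```r
# Hypothetical helper illustrating the stopping rule; not the package's code.
# abs() is an assumption so that negative log-likelihoods behave sensibly.
run_until_converged <- function(loglik_trace, eps = 0.001, max_iteration = 30) {
  n_steps <- min(length(loglik_trace), max_iteration)
  if (n_steps < 2) return(n_steps)
  for (i in 2:n_steps) {
    increment <- loglik_trace[i] - loglik_trace[i - 1]
    if (abs(increment) < eps * abs(loglik_trace[i])) {
      return(i)  # converged at iteration i
    }
  }
  n_steps  # iteration cap (or end of trace) reached
}

# Made-up log-likelihood values from successive iterations.
trace <- c(-1500, -1320, -1290, -1285, -1284.5, -1284.4)
run_until_converged(trace)
run_until_converged(trace, max_iteration = 3)
```

A smaller eps demands a flatter log-likelihood before stopping, while max_iteration bounds the run time when that flatness is never reached.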
local_idr |
see |
List with three components (rep1_df, rep2_df, and analysis_type) containing the interactions from the input data frames rep1_df and rep2_df with the following additional columns:
column 1: | chr_a | character; genomic location of anchor A - chromosome (e.g., "chr3") |
column 2: | start_a | integer; genomic location of anchor A - start coordinate |
column 3: | end_a | integer; genomic location of anchor A - end coordinate |
column 4: | chr_b | character; genomic location of anchor B - chromosome (e.g., "chr3") |
column 5: | start_b | integer; genomic location of anchor B - start coordinate |
column 6: | end_b | integer; genomic location of anchor B - end coordinate |
column 7: | value | numeric; p-value, FDR, or heuristic used to rank the interactions |
column 8: | rep_value | numeric; value of the corresponding replicate interaction. If no corresponding interaction was found, rep_value is set to NA. |
column 9: | rank | integer; rank of the interaction, established by the value column, ascending order |
column 10: | rep_rank | integer; rank of the corresponding replicate interaction. If no corresponding interaction was found, rep_rank is set to NA. |
column 11: | idx | integer; interaction index, primary key |
column 12: | rep_idx | integer; specifies the index of the corresponding interaction in the other replicate (foreign key). If no corresponding interaction was found, rep_idx is set to NA. |
column 13: | idr | IDR of the interaction and the corresponding interaction in the other replicate. If no corresponding interaction was found, idr is set to NA. |
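The idx and rep_idx columns act as primary and foreign keys, so matched interactions can be looked up across replicates. The sketch below uses two toy data frames as stand-ins for the rep1_df and rep2_df components of the result. All values are invented, only a few of the columns are shown, and the IDR cutoff of 0.05 is an arbitrary choice for illustration.

```r
# Toy stand-ins for the rep1_df and rep2_df components of the result;
# all values are invented and only a few columns are shown.
rep1 <- data.frame(idx = 1:4,
                   value = c(1e-8, 3e-4, 0.02, 0.04),
                   rep_idx = c(2L, 1L, NA, 3L),
                   idr = c(0.003, 0.04, NA, 0.61))
rep2 <- data.frame(idx = 1:3,
                   value = c(5e-4, 2e-7, 0.03),
                   rep_idx = c(2L, 1L, 4L),
                   idr = c(0.04, 0.003, 0.61))

# Keep interactions that were matched and fall below an IDR cutoff
# (0.05 is an arbitrary choice for this sketch).
reproducible <- rep1[!is.na(rep1$idr) & rep1$idr < 0.05, ]

# Follow the foreign key into the other replicate.
partner <- rep2[match(reproducible$rep_idx, rep2$idx), ]
data.frame(idx = reproducible$idx,
           value = reproducible$value,
           partner_idx = partner$idx,
           partner_value = partner$value)
```

Filtering on `!is.na(idr)` first avoids NA rows, which would otherwise appear for interactions without a counterpart in the other replicate.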
Q. Li, J. B. Brown, H. Huang and P. J. Bickel. (2011) Measuring reproducibility of high-throughput experiments. Annals of Applied Statistics, Vol. 5, No. 3, 1752-1779.
library(idr2d)

idr_results <- estimate_idr2d(idr2d:::chiapet$rep1_df,
                              idr2d:::chiapet$rep2_df,
                              value_transformation = "log_additive_inverse")
summary(idr_results)