estimate_idr1d | R Documentation |
This method estimates Irreproducible Discovery Rates (IDR) for peaks in replicated ChIP-seq experiments.
estimate_idr1d(
  rep1_df,
  rep2_df,
  value_transformation = c("identity", "additive_inverse", "multiplicative_inverse",
    "log", "log_additive_inverse"),
  ambiguity_resolution_method = c("overlap", "midpoint", "value"),
  remove_nonstandard_chromosomes = TRUE,
  max_factor = 1.5,
  jitter_factor = 1e-04,
  max_gap = -1L,
  mu = 0.1,
  sigma = 1,
  rho = 0.2,
  p = 0.5,
  eps = 0.001,
  max_iteration = 30,
  local_idr = TRUE
)
rep1_df |
data frame of observations (i.e., genomic peaks) of replicate 1, with at least the following columns (column positions matter, column names are irrelevant): chr, start, end, and value, in that order, as in columns 1 to 4 of the output table below.
rep2_df |
data frame of observations (i.e., genomic peaks) of replicate 2, with at least the following columns (column positions matter, column names are irrelevant): chr, start, end, and value, in that order, as in columns 1 to 4 of the output table below.
value_transformation |
the values in either
ambiguity_resolution_method |
defines how ambiguous assignments (when one interaction in replicate 1 overlaps with multiple interactions in replicate 2 or vice versa) are resolved. Available methods:
remove_nonstandard_chromosomes |
removes peaks containing genomic locations on non-standard chromosomes using
max_factor |
numeric; controls the replacement values for
jitter_factor |
numeric; controls the magnitude of the noise that is added to
max_gap |
integer; maximum gap in nucleotides allowed between two anchors for them to be considered as overlapping (defaults to -1, i.e., overlapping anchors)
mu |
numeric; starting value for the mean of the reproducible component.
sigma |
numeric; starting value for the standard deviation of the reproducible component.
rho |
numeric; starting value for the correlation coefficient of the reproducible component.
p |
numeric; starting value for the proportion of the reproducible component.
eps |
numeric; stopping criterion. Iterations stop when the increase in log-likelihood is less than eps times the log-likelihood (defaults to 0.001)
max_iteration |
integer; maximum number of iterations for IDR estimation (defaults to 30)
local_idr |
see |
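The following base R sketch illustrates one plausible reading of three of the arguments above: value_transformation, max_gap, and eps. The helper names (transform_value, overlaps_within_gap, has_converged) are hypothetical and are not part of the package. The transformations are inferred from the option names only, the gap rule assumes closed integer coordinates, and the use of absolute values in the stopping rule is an assumption. The package's handling of zeros and infinite values (see max_factor) and of ties (see jitter_factor) is not modelled here.

```r
# Hypothetical helpers for illustration only; NOT part of the idr2d API.

# Assumed meaning of the value_transformation options, inferred from their names.
transform_value <- function(x,
                            method = c("identity", "additive_inverse",
                                       "multiplicative_inverse", "log",
                                       "log_additive_inverse")) {
  method <- match.arg(method)
  switch(method,
    identity = x,
    additive_inverse = -x,
    multiplicative_inverse = 1 / x,
    log = log(x),
    log_additive_inverse = -log(x)
  )
}

# Assumed overlap rule: the gap is the number of bases between two intervals
# with closed coordinates. Intervals sharing at least one base have a gap of
# -1 or less; directly adjacent intervals have a gap of 0.
overlaps_within_gap <- function(start1, end1, start2, end2, max_gap = -1L) {
  gap <- pmax(start1, start2) - pmin(end1, end2) - 1L
  gap <= max_gap
}

# Assumed relative stopping rule: stop once the change in log-likelihood is
# smaller than eps times the magnitude of the current log-likelihood.
has_converged <- function(loglik_old, loglik_new, eps = 0.001) {
  abs(loglik_new - loglik_old) < eps * abs(loglik_new)
}

# Smaller p-values map to larger transformed values.
transform_value(c(0.01, 0.5), "log_additive_inverse")

# Adjacent peaks do not overlap by default, but do with max_gap = 0.
overlaps_within_gap(100L, 200L, 201L, 300L)
overlaps_within_gap(100L, 200L, 201L, 300L, max_gap = 0L)

# A change of 0.5 on a log-likelihood near -1000 is below the default eps.
has_converged(-1000, -999.5)
```

Under this reading, p-values would typically be passed with "log_additive_inverse" so that more significant peaks receive larger values, but the original text on the recommended transformation is truncated, so treat this as an assumption.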
List with three components (rep1_df, rep2_df, and analysis_type) containing the interactions from input data frames rep1_df and rep2_df with the following additional columns:
column 1: | chr | character; genomic location of peak - chromosome (e.g., "chr3") |
column 2: | start | integer; genomic location of peak - start coordinate |
column 3: | end | integer; genomic location of peak - end coordinate |
column 4: | value | numeric; p-value, FDR, or heuristic used to rank the peaks |
column 5: | rep_value | numeric; value of corresponding replicate peak. If no corresponding peak was found, rep_value is set to NA. |
column 6: | rank | integer; rank of the peak, established by value column, ascending order |
column 7: | rep_rank | integer; rank of corresponding replicate peak. If no corresponding peak was found, rep_rank is set to NA. |
column 8: | idx | integer; peak index, primary key |
column 9: | rep_idx | integer; specifies the index of the corresponding peak in the other replicate (foreign key). If no corresponding peak was found, rep_idx is set to NA. |
column 10: | idr | IDR of the peak and the corresponding peak in the other replicate. If no corresponding peak was found, idr is set to NA. |
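As a minimal sketch of how the returned columns might be used, the base R snippet below separates peaks with and without a corresponding replicate peak and keeps those below an IDR cutoff. The data frame is a toy stand-in with invented values that carries only a subset of the columns described above, and the 0.05 threshold is a hypothetical choice, not a package default.

```r
# Toy stand-in for the rep1_df component of the result; all values are
# invented for illustration and only a subset of the columns is included.
rep1_out <- data.frame(
  chr = c("chr1", "chr1", "chr2", "chr3"),
  start = c(100L, 500L, 250L, 900L),
  end = c(200L, 650L, 400L, 1000L),
  idx = 1:4,
  rep_idx = c(2L, NA, 1L, 3L),  # NA: no corresponding peak in replicate 2
  idr = c(0.003, NA, 0.02, 0.61),
  stringsAsFactors = FALSE
)

idr_threshold <- 0.05  # hypothetical cutoff chosen for this example

# Peaks without a corresponding replicate peak have NA in rep_idx and idr,
# so they are separated first to keep NA out of the comparison.
matched <- rep1_out[!is.na(rep1_out$rep_idx), ]
unmatched <- rep1_out[is.na(rep1_out$rep_idx), ]

# Peaks whose IDR falls below the cutoff.
reproducible <- matched[matched$idr < idr_threshold, ]

nrow(reproducible)
nrow(unmatched)
```

Filtering on rep_idx before comparing idr avoids rows of NA that R would otherwise return when a logical index contains missing values.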
Q. Li, J. B. Brown, H. Huang and P. J. Bickel. (2011) Measuring reproducibility of high-throughput experiments. Annals of Applied Statistics, Vol. 5, No. 3, 1752-1779.
idr_results <- estimate_idr1d(idr2d:::chipseq$rep1_df,
                              idr2d:::chipseq$rep2_df,
                              value_transformation = "log")
summary(idr_results)