View source: R/prior_fitting.R
Use informative annotations to bias prior estimation towards alleles that show similar annotations in the provided annotation space.
fit_cond_prior(
  mpra_data,
  annotations,
  n_cores = 1,
  plot_rep_cutoff = TRUE,
  rep_cutoff = 0.15,
  min_neighbors = 100,
  kernel_fold_increase = 1.4142,
  verbose = TRUE
)
mpra_data | a data frame of mpra data
annotations | a data frame of annotations for the same variants in mpra_data
n_cores | number of cores to parallelize across
plot_rep_cutoff | logical indicating whether to plot the representation cutoff used
rep_cutoff | fraction indicating the depth-adjusted DNA count quantile to use as the cutoff
min_neighbors | The minimum number of neighbors in annotation space that must contribute to prior estimation
kernel_fold_increase | The amount to iteratively increase kernel width by when estimating conditional priors. Smaller values (closer to 1) will yield more refined priors but take longer.
verbose | logical indicating whether to print messages
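To make the rep_cutoff argument concrete, here is a minimal sketch of how a quantile cutoff on depth-adjusted DNA counts could be computed. It is an illustration only, not the package's implementation: the toy count matrix, the counts-per-million depth adjustment, and the names dna_counts, mean_adjusted, and well_represented are all assumptions made for this example.

```r
# Hypothetical toy data: rows are variants, columns are DNA samples.
set.seed(42)
dna_counts <- matrix(
  rpois(50 * 3, lambda = 40),
  nrow = 50,
  dimnames = list(paste0("variant_", 1:50), paste0("dna_", 1:3))
)

rep_cutoff <- 0.15

# Assumed depth adjustment: scale each sample to counts per million.
depth <- colSums(dna_counts)
adjusted <- sweep(dna_counts, 2, depth, "/") * 1e6

# Mean depth-adjusted DNA count for each variant.
mean_adjusted <- rowMeans(adjusted)

# The cutoff is the rep_cutoff quantile of those means.
cutoff_value <- quantile(mean_adjusted, probs = rep_cutoff)

# Flag variants at or above the cutoff as well represented.
well_represented <- mean_adjusted >= cutoff_value
```

With rep_cutoff = 0.15, roughly the lowest 15% of variants by depth-adjusted DNA count fall below the cutoff.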
The empirical prior returned by this function is "conditional" in the sense that the prior estimation weights are conditional on the annotations.
The DNA prior is still estimated marginally because the annotations should not be able to provide any information on the DNA inputs (which are presumably only affected by the preparation of the oligonucleotide library at the vendor).
The RNA prior is estimated from the RNA observations of other variants in the assay that are nearby in annotation space. A multivariate t distribution centered on the variant in question is used to weight all other variants in the assay. It is initialized with a very small width, and if there are fewer than min_neighbors that provide substantial input to the prior, the width is iteratively increased by a factor of kernel_fold_increase until that condition is satisfied. This prevents the prior estimation for variants in sparse regions of annotation space from being influenced too heavily by their nearest neighbors.
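The kernel-widening loop described above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the package's actual code: the function name widen_kernel, the isotropic kernel scale, the starting width init_width, the degrees of freedom df, and the weight_floor threshold used to decide what counts as "substantial input" are all hypothetical choices.

```r
# Hypothetical sketch of iterative kernel widening for one variant.
widen_kernel <- function(annotations, target_row,
                         min_neighbors = 100,
                         kernel_fold_increase = 1.4142,
                         init_width = 1e-3,    # assumed starting width
                         df = 3,               # assumed t degrees of freedom
                         weight_floor = 0.01) { # assumed "substantial" threshold
  ann <- as.matrix(annotations)
  others <- ann[-target_row, , drop = FALSE]
  center <- ann[target_row, ]
  p <- ncol(ann)

  # Cannot require more neighbors than there are other variants.
  min_neighbors <- min(min_neighbors, nrow(others))

  # Squared Euclidean distance from every other variant to the target.
  sq_dist <- rowSums(sweep(others, 2, center)^2)

  width <- init_width
  repeat {
    # Unnormalized multivariate t kernel, equal to 1 at the center.
    w <- (1 + sq_dist / (df * width^2))^(-(df + p) / 2)
    n_contributing <- sum(w >= weight_floor)
    if (n_contributing >= min_neighbors) break
    # Too few contributing neighbors: widen the kernel and try again.
    width <- width * kernel_fold_increase
  }

  list(width = width, weights = w / sum(w), n_contributing = n_contributing)
}

# Toy usage with simulated two-dimensional annotations.
set.seed(1)
toy_annotations <- matrix(rnorm(200 * 2), ncol = 2)
fit <- widen_kernel(toy_annotations, target_row = 1, min_neighbors = 20)
```

In this sketch, a variant in a dense region stops widening early and keeps a narrow kernel, while a variant in a sparse region ends with a wider kernel so that at least min_neighbors other variants carry non-negligible weight.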
A list of two data frames. The first contains the DNA prior and the second contains the by-variant RNA priors.
cond_prior = fit_cond_prior(mpra_data = umpra_example,
                            annotations = u_deepsea,
                            n_cores = 1,
                            rep_cutoff = .15,
                            plot_rep_cutoff = TRUE,
                            min_neighbors = 5)