sim_reg: Similarity regression


View source: R/inference.R

Description

Performs Bayesian ‘similarity regression’ of a given logical response vector y against a list of ontological term sets x. It returns an object of class sim_reg_output. Of particular interest are the probability of an association, which can be calculated with prob_association, and the characteristic ontological profile phi, which can be visualised using the functions plot_term_marginals and term_marginals. The results can be summarised with summary.

Usage

sim_reg(
  ontology,
  x,
  y,
  information_content = get_term_info_content(ontology, x),
  sim_params = list(ontology = ontology, information_content = information_content),
  using_terms = get_terms(sim_params),
  term_weights = rep(0, length(using_terms)),
  prior = discrete_gamma(using_terms),
  min_BF = -Inf,
  max_select = 2000L,
  max_phi_count = 200L,
  two_way = TRUE,
  selection_fn = fg_step_tab(N = length(y)),
  lik_method = NULL,
  lik_method_args = list(),
  gamma0_ml = bg_rate,
  min_ratio = 1e-04,
  ...
)

Arguments

ontology

ontology_index object.

x

list of character vectors of ontological terms.

y

logical response vector.

information_content

Numeric vector of information contents of terms named by term ID. Defaults to information content based on frequencies of annotation in x.
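As a rough illustration of what "information content based on frequencies of annotation" can mean, the sketch below computes it in base R on a toy ontology. It assumes the common definition of information content as the negative log of the proportion of term sets annotated with a term or any of its descendants; the toy term IDs and the helper ancestors_of are hypothetical, and the result is not guaranteed to match get_term_info_content exactly.

```r
# Toy ontology: each term maps to its direct parents (hypothetical IDs).
parents <- list(root = character(0), A = "root", B = "root", A1 = "A")

# All ancestors of a term, including the term itself.
ancestors_of <- function(term) {
  out <- term
  for (p in parents[[term]]) out <- union(out, ancestors_of(p))
  out
}

# Toy annotation sets, playing the role of the x argument.
x_toy <- list(c("A1"), c("B"), c("A", "B"), c("A1", "B"))

# Propagate each set up the ontology so that ancestors count as annotated.
x_closed <- lapply(x_toy, function(terms) {
  unique(unlist(lapply(terms, ancestors_of)))
})

# Proportion of sets containing each term, then IC = -log(proportion).
term_freq <- table(unlist(x_closed)) / length(x_toy)
ic_toy <- setNames(-log(as.numeric(term_freq)), names(term_freq))
```

Under this definition the root term, which is present in every propagated set, has an information content of zero, and rarer, more specific terms score higher.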

sim_params

List of arguments to pass to get_asym_sim_grid.

using_terms

Character vector of term IDs giving the complete set of terms to include in the phi parameter space.

term_weights

Numeric vector of prior weights for individual terms.

prior

Function for computing the unweighted prior probability of a phi value.

min_BF

Bayes factor threshold below which to terminate computation, enabling faster execution time at the expense of accuracy and precision.

max_select

Upper bound for number of phi values to sample.

max_phi_count

Upper bound for number of phi values to include in final likelihood sum.

two_way

Boolean value determining whether to calculate semantic similarity ‘in both directions’ (i.e. compute s_x and s_phi or just s_phi).

selection_fn

Function for selecting values of phi with high posterior mass.

lik_method

Function for calculating marginal likelihood conditional on values of phi.

lik_method_args

List of additional arguments to pass to lik_method.

gamma0_ml

Function for computing marginal likelihood of data under baseline model gamma=0.

min_ratio

Lower bound on ratio below which to discard phi values.

...

Additional arguments to pass to selection_fn.

Examples

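## Not run: 
library(ontologyIndex)  # provides hpo, get_ancestors and minimal_set
library(SimReg)
set.seed(0)
data(hpo)
# Terms that characterise the simulated disease
disease_terms <- c("HP:0005537", "HP:0000729", "HP:0001873")
# Sampling pool: the disease terms, 50 random terms, and all their ancestors
all_terms <- get_ancestors(hpo, c(disease_terms, sample(hpo$id, size = 50)))
# 96 controls followed by 3 cases
y <- c(rep(FALSE, 96), rep(TRUE, 3))
# Controls get 3 random terms; cases get 1 random term plus each
# disease term with probability 0.8
x <- lapply(y, function(.y) minimal_set(
  hpo,
  if (!.y) {
    sample(all_terms, size = 3)
  } else {
    c(sample(all_terms, size = 1), disease_terms[runif(n = 3) < 0.8])
  }
))
sim_reg_out <- sim_reg(ontology = hpo, x = x, y = y)

## End(Not run)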
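The fitted object can then be inspected with the functions named in the Description. The following is a minimal sketch that assumes sim_reg_out from the example above is in the workspace and that each function accepts the sim_reg_output object as its first argument with defaults for everything else; the guard skips the calls when SimReg or the fitted object is unavailable.

```r
# Only run when SimReg is installed and the example above has been executed.
can_inspect <- requireNamespace("SimReg", quietly = TRUE) && exists("sim_reg_out")

if (can_inspect) {
  # Posterior probability of an association between y and x
  print(SimReg::prob_association(sim_reg_out))

  # Marginal probabilities of terms being included in phi
  print(SimReg::term_marginals(sim_reg_out))

  # Printed overview of the inference
  print(summary(sim_reg_out))
}
```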

SimReg documentation built on Feb. 15, 2021, 5:10 p.m.