build_null_test_statistic: Simulate pairs to generate values of the test statistic under...


View source: R/sim_functions.R

Description

Generate samples of the test statistic under the null distribution. The function uses the average rates of clonal exclusivity across trees and, for each patient, the histogram over all pairs of the fraction (number of trees in which the pair is clonally exclusive) / (number of trees).

Usage

build_null_test_statistic(avg_rates_m, list_of_clon_excl_frac_trees_all_pats,
  num_pat_pair, num_pairs_sim, beta_distortion = 1000)

Arguments

avg_rates_m

The average rates of clonal exclusivity to be sampled from.

list_of_clon_excl_frac_trees_all_pats

A list of two lists. The first contains one entry per patient: a vector giving, for each pair in that patient, the number of trees in which the pair was mutated. The second also contains one entry per patient: a vector giving, for each pair, the number of trees in which the pair was clonally exclusive. The patient ordering in both lists has to be the same as in avg_rates_m.

num_pat_pair

The number of patients the simulated pairs are mutated in.

num_pairs_sim

The number of simulated gene/pathway pairs to be generated.

beta_distortion

The value M = alpha + beta of the beta distribution with which the average rates are distorted. The larger M is, the more sharply the distribution is peaked around the actual rate; the smaller M is, the more the rates are distorted. Default: 1000.
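The effect of M can be illustrated with a small base-R sketch. It assumes the beta distribution is parameterised so that its mean equals the original rate (alpha = rate * M, beta = (1 - rate) * M); this parameterisation and the helper distort_rate are assumptions made for illustration, not taken from the package source.

```r
# Sketch only: assumes alpha = rate * M and beta = (1 - rate) * M,
# so that the beta distribution has mean `rate` and alpha + beta = M.
set.seed(1)
rate <- 0.4  # hypothetical average rate of clonal exclusivity

# Hypothetical helper: draw n distorted rates for a given M
distort_rate <- function(rate, M, n = 10000) {
  rbeta(n, shape1 = rate * M, shape2 = (1 - rate) * M)
}

draws_small_M <- distort_rate(rate, M = 10)
draws_large_M <- distort_rate(rate, M = 1000)

# Spread of the distorted rates around the original rate
sd_small_M <- sd(draws_small_M)
sd_large_M <- sd(draws_large_M)
c(M_10 = sd_small_M, M_1000 = sd_large_M)
```

Under this assumption, the draws for M = 1000 stay much closer to 0.4 than those for M = 10, which matches the description of the parameter.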

Details

This function simulates gene pairs for the likelihood ratio test to generate values of the test statistic under the null. It draws the average rates of clonal exclusivity from the ones provided by the user, so these rates have to be computed for each patient beforehand. The number of patients in which the simulated pairs are mutated can be specified with num_pat_pair. The patients in which a simulated pair is mutated are selected at random, with probability proportional to the number of pairs in each patient. This function can be used to build the ecdf of the test statistic under the null hypothesis (see Examples).
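The patient selection step can be sketched in base R as follows. The per-patient vectors are hypothetical, and sampling without replacement is an assumption; the package may implement this step differently.

```r
# Sketch only: hypothetical data for three patients. Each vector holds, per pair,
# the number of trees in which that pair was mutated.
set.seed(1)
mutated_per_patient <- list(pat1 = c(5, 4, 5),
                            pat2 = c(5, 4),
                            pat3 = c(3, 3, 2, 4, 5, 5))

# Number of pairs observed in each patient
num_pairs_per_patient <- lengths(mutated_per_patient)

# Select the patients for one simulated pair, with probability proportional
# to the number of pairs in each patient (assumed: without replacement)
num_pat_pair <- 2
selected <- sample(names(mutated_per_patient),
                   size = num_pat_pair,
                   replace = FALSE,
                   prob = num_pairs_per_patient)
selected
```

The weights passed to prob do not need to sum to one, because sample() normalises them internally.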

Value

The return value is a tibble with the columns 'test_statistic' and 'mle_delta', plus four groups of num_pat_pair columns each: the rates that were drawn for each of the patients, the number of times the pair was mutated across trees, the number of times the pair was clonally exclusive across trees, and the rates after distortion by the beta distribution. The 'test_statistic' is the test statistic of the likelihood ratio test. The 'mle_delta' is the maximum likelihood estimate of delta, the elevation of the clonal exclusivity rate in the alternative model of the likelihood ratio test.

Author(s)

Ariane L. Moore

Examples

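library(GeneAccord)

# Average rates of clonal exclusivity for two patients
avg_rates_m <- c(0.4, 0.3)
# First list: per patient, how often each pair was mutated across trees.
# Second list: per patient, in how many trees each pair was clonally exclusive.
list_of_clon_excl_frac_trees_all_pats <- list(list(c(5, 4, 5), c(5, 4)),
                                              list(c(4, 4, 3), c(3, 2)))
# Simulate 100 pairs, each mutated in 2 patients
sim_pairs <- build_null_test_statistic(avg_rates_m,
                                       list_of_clon_excl_frac_trees_all_pats,
                                       num_pat_pair = 2, num_pairs_sim = 100,
                                       beta_distortion = 100)
# Empirical CDF of the simulated test statistic under the null
ecdf_test_stat <-
  ecdf(as.numeric(as.character(sim_pairs$test_statistic)))
plot(ecdf_test_stat,
     main = "ECDF of the test statistic when num_pat_pair=2")
# Assume the observed test statistic is t = 6.0 and compute an upper-tailed
# p-value from the ECDF of the test statistic T under the null:
# p_value = P(T > t | H_0 true) = 1 - ecdf(t)
p_value <- 1 - ecdf_test_stat(6.0)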
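
The p-value computed from the ECDF is the fraction of simulated null values that are strictly greater than the observed statistic. The following base-R sketch checks this identity on stand-in values; the chi-squared draws are an assumption made purely for illustration and are not output of build_null_test_statistic.

```r
# Sketch only: stand-in null values, NOT produced by GeneAccord
set.seed(1)
null_stats <- rchisq(1000, df = 1)
t_obs <- 6.0  # hypothetical observed test statistic

# Upper-tailed p-value via the ECDF, as in the example above
null_ecdf <- ecdf(null_stats)
p_ecdf <- 1 - null_ecdf(t_obs)

# Same quantity computed directly as a fraction of simulated values
p_frac <- mean(null_stats > t_obs)

all.equal(p_ecdf, p_frac)
```

Because the p-value is a fraction of num_pairs_sim simulated values, its smallest possible non-zero value is 1/num_pairs_sim, so the number of simulated pairs limits how small a p-value can be resolved.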

GeneAccord documentation built on Nov. 8, 2020, 8:04 p.m.