runBSure: runBSure

View source: R/core_functions.R

Description

Runs the BSure algorithm, which samples, for each gene, from the posterior distribution of the gene essentiality and of the standard deviation parameter. It also provides credible intervals for the parameters, convergence checks, and, for each gene separately, information on whether a more expensive sampling algorithm was required to achieve convergence. The output is saved as an .rda file and then used by the postprocessing functions.

Usage

runBSure(
  lfc,
  save_file_name,
  n_cores = 1,
  min_tail_ESS = 500,
  vector_of_genes = NULL,
  plot_folder_name = NULL
)

Arguments

lfc

M x (R+2) matrix or data frame of gRNA log-fold changes, where M is the number of gRNAs and R the number of replicates. The first column contains the names of the gRNAs; the second column contains the gene names or identifiers corresponding to the gRNAs in the first column. The remaining R columns hold the log-fold changes, one column per replicate.
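As a rough illustration of this layout, the sketch below builds a toy input with M = 4 gRNAs and R = 3 replicates. The column names, gRNA names, gene identifiers and values are all invented for the example; only the column order follows the description of lfc.

```r
# Toy log-fold-change table: 4 gRNAs (rows), 3 replicates.
# All names and values are invented for illustration.
set.seed(1)
n_replicates <- 3
toy_lfc <- data.frame(
  gRNA = c("g1", "g2", "g3", "g4"),                  # column 1: gRNA names
  gene = c("GENE_A", "GENE_A", "GENE_B", "GENE_B"),  # column 2: gene identifiers
  rep1 = rnorm(4),                                   # columns 3..(R+2): replicates
  rep2 = rnorm(4),
  rep3 = rnorm(4),
  stringsAsFactors = FALSE
)

# Basic layout checks: M x (R + 2) with numeric replicate columns.
stopifnot(ncol(toy_lfc) == n_replicates + 2)
stopifnot(all(vapply(toy_lfc[, -(1:2)], is.numeric, logical(1))))
```

Checks of this kind can catch a misplaced identifier column before a long sampling run is started.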

save_file_name

name of the file (without .rda extension) to save the output

min_tail_ESS

minimum cutoff for the tail effective sample size, used to decide whether to choose a more expensive sampling algorithm. The default in the function signature is 500, which is also the recommended minimum. Lowering the cutoff speeds up the algorithm, but it should not be set below 500.

vector_of_genes

optional vector of genes, used to apply BSure to only a subset of genes. Default is NULL, in which case the algorithm is applied to all genes.

plot_folder_name

If this parameter is not left at its default of NULL, a folder named plot_folder_name is created, and plots illustrating the posterior distribution of the essentiality score for each gene are stored in it. It is recommended that this option is only used with a limited number of genes, as specified by vector_of_genes.
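A minimal sketch of how vector_of_genes and plot_folder_name might be combined is given below. The toy data, the gene identifiers, and the file and folder names are placeholders invented for the example. The runBSure call is left commented out because it needs the BSure package and real screen data, so the snippet only illustrates preparing the subset and the argument pattern.

```r
# Toy input in the layout described for lfc; all names and values are invented.
toy_lfc <- data.frame(
  gRNA = paste0("g", 1:6),
  gene = rep(c("GENE_A", "GENE_B", "GENE_C"), each = 2),
  rep1 = c(-2.1, -1.8, 0.1, -0.2, 0.3, 0.0),
  rep2 = c(-1.9, -2.2, 0.0, 0.2, -0.1, 0.1),
  stringsAsFactors = FALSE
)

# Hypothetical genes of interest; one of them is deliberately absent.
requested <- c("GENE_A", "GENE_C", "GENE_X")

# Keep only identifiers that occur in the second column of the input.
genes_present <- intersect(requested, unique(toy_lfc[[2]]))
genes_missing <- setdiff(requested, genes_present)
if (length(genes_missing) > 0) {
  message("Not found in input: ", paste(genes_missing, collapse = ", "))
}

# Placeholder call (not run here): restrict sampling and plotting to the subset.
# runBSure(toy_lfc,
#          save_file_name = "subset_run",
#          vector_of_genes = genes_present,
#          plot_folder_name = "subset_plots")
```

Filtering the requested identifiers against the input first keeps the per-gene plots limited to genes that are actually present in the data.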

Value

The output is saved as an .rda file. It contains a large amount of information that the postprocessing functions use for plotting, assessment of screen quality, etc. For each gene it contains:

ess_bulk

bulk effective sample size: the number of independent samples equivalent to the total of the (dependent) samples drawn using no-U-turn sampling

rhat

rhat convergence criterion (Brooks and Gelman, 1998); we recommend that this should be below 1.1

ess_tail

tail effective sample size: like the bulk effective sample size, but giving information on how well the algorithm was able to sample from the tails of the posterior distribution

mean

means of the posterior distributions of gene essentiality, standard deviation parameter and log-likelihood

quant05

0.05 quantiles of the posterior distributions

quant025

0.025 quantiles of the posterior distributions

quant95

0.95 quantiles of the posterior distributions

quant975

0.975 quantiles of the posterior distributions

gene_names

names of the genes, in the order corresponding to the output

expensive_sampling

binary variable for each gene, indicating whether a more expensive sampling algorithm was needed

very_expensive_sampling

binary variable for each gene, indicating whether the most expensive sampling algorithm was needed

probability_essential_II

probability of the posterior distribution of the essentiality being shifted less than 1/3 compared to a typical core-essential gene

probability_essential_I

probability of a gene not being nonessential
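The following sketch shows one way such a file could be inspected for genes that miss the rhat recommendation. To stay runnable without BSure it first writes a stand-in .rda file; the object name mock_output, its flat list layout and all values are assumptions made for the example, and the object actually saved by runBSure may be named and structured differently (for instance with one rhat value per parameter).

```r
# Write a stand-in .rda file so the example runs without BSure.
# The object name `mock_output` and its layout are assumptions.
mock_output <- list(
  gene_names = c("GENE_A", "GENE_B", "GENE_C"),
  rhat       = c(1.01, 1.20, 1.05),
  ess_tail   = c(900, 300, 650)
)
rda_path <- file.path(tempdir(), "mock_bsure_output.rda")
save(mock_output, file = rda_path)

# Load into a separate environment to see which objects the file holds
# without overwriting anything in the workspace.
loaded <- new.env()
object_names <- load(rda_path, envir = loaded)
out <- get(object_names[1], envir = loaded)

# Flag genes whose rhat is not below the recommended 1.1.
not_converged <- out$gene_names[out$rhat >= 1.1]
not_converged
```

Loading into a fresh environment is a general R idiom for .rda files whose object names are not known in advance.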

Author(s)

Magdalena Strauss

References

Brooks S, Gelman A. General methods for monitoring convergence of iterative simulations. J Comput Graph Stat. 1998;7(4):434–455. doi:10.1080/10618600.1998.10474787

Examples

data(HT29_lfc_small)
runBSure(lfc_small,save_file_name = "temp",n_cores = 2,min_tail_ESS = 500)

magStra/BSure documentation built on April 27, 2021, 3:30 a.m.