bayesian_mpra_analyze: Bayesian analysis of MPRA data


Description

Given MPRA data and a set of predictors, perform a Bayesian analysis of variants using an empirical prior.

Usage

bayesian_mpra_analyze(mpra_data, predictors, use_marg_prior = FALSE, out_dir,
  mpra_model_object, save_nonfunctional = FALSE,
  normalization_method = "quantile_normalization", num_cores = 1)

Arguments

mpra_data

a data frame of MPRA data in the format described under Details

predictors

a matching data frame of annotations

use_marg_prior

logical indicating whether or not to disregard the functional predictors and use a marginal prior estimated from the entire assay

out_dir

the directory to write the outputs to. The path must end with a forward slash.
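
Because out_dir is expected to end with a forward slash, it can help to normalize the path before passing it in. The helper below is a minimal sketch; ensure_trailing_slash and the example path are hypothetical names and not part of the package.

```r
# Hypothetical helper (not part of bayesianMPRA): append a trailing
# forward slash to a directory path only if it does not already have one.
ensure_trailing_slash <- function(path) {
  if (grepl("/$", path)) path else paste0(path, "/")
}

# Example path is made up for illustration.
out_dir <- ensure_trailing_slash("results/mpra_run")
out_dir
```

The resulting out_dir can then be supplied as the out_dir argument.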

mpra_model_object

a stan model object compiled with rstan::stan_model(model_code = mpra_model_string). mpra_model_string is a data object bundled with the package.

save_nonfunctional

logical indicating whether to save the sampler results of non-functional variants.

normalization_method

character string indicating which method to use for aggregating information across samples. Must be either "quantile_normalization" or "depth_normalization".

num_cores

integer indicating how many cores to use for parallelization. Currently the analysis takes ~15s per variant on a first-gen i7 CPU, so setting this as high as possible is recommended as long as you have plenty of RAM.
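
As a rough planning aid, the per-variant timing quoted above can be turned into a wall-clock estimate. This is only a sketch: estimate_runtime_hours is a hypothetical helper, the 15-second figure will differ on other hardware, and ideal linear scaling across cores is an assumption rather than a measured property of the package.

```r
# Hypothetical helper: rough wall-clock estimate in hours, assuming
# ~15 s per variant and ideal linear scaling across cores (assumptions).
estimate_runtime_hours <- function(n_variants, num_cores, secs_per_variant = 15) {
  stopifnot(n_variants >= 0, num_cores >= 1)
  n_variants * secs_per_variant / num_cores / 3600
}

# detectCores() can return NA on some platforms, so fall back to 1.
# Leaving one core free is a common convention, not a package requirement.
available <- parallel::detectCores()
num_cores <- if (is.na(available)) 1L else max(1L, available - 1L)

estimate_runtime_hours(n_variants = 2400, num_cores = 1)  # 10 hours
estimate_runtime_hours(n_variants = 2400, num_cores = 4)  # 2.5 hours
```

Memory use per worker is not covered by this estimate, so the RAM caveat above still applies when raising num_cores.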

Details

mpra_data must meet the following format conditions:

  1. one row per barcode

  2. one column of variant IDs (e.g. rs IDs)

  3. one column of alleles called 'allele'. These must be character strings of either "ref" or "mut"

  4. one additional column for every transfection/physical sample

  5. column names of plasmid library samples must contain "DNA" (e.g. "DNA_1", "DNA_2", ...)

  6. column names of samples from transcription products must contain "RNA" (e.g. "RNA_1", "RNA_2", ...)
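
The conditions above can be illustrated with a small toy table. This is a sketch under stated assumptions: the ID column name snp_id, the variant IDs, and all counts are made up, and check_mpra_format is a hypothetical helper that only checks conditions 3, 5, and 6, not the package's own validation.

```r
# Toy barcode-level table: 2 variants x 2 alleles x 2 barcodes each.
# The column name "snp_id", the IDs, and all counts are invented.
mpra_data <- data.frame(
  snp_id = rep(c("rs0001", "rs0002"), each = 4),
  allele = rep(rep(c("ref", "mut"), each = 2), times = 2),
  DNA_1  = c(120, 95, 101, 88, 143, 77, 92, 110),
  DNA_2  = c(115, 90, 108, 91, 139, 80, 97, 104),
  RNA_1  = c(310, 250, 140, 122, 405, 198, 260, 301),
  RNA_2  = c(298, 241, 151, 117, 391, 210, 249, 315),
  stringsAsFactors = FALSE
)

# Hypothetical checker: returns a character vector of problems found
# (empty if the allele and sample-column conditions are met).
check_mpra_format <- function(df) {
  problems <- character(0)
  if (!"allele" %in% names(df)) {
    problems <- c(problems, "missing 'allele' column")
  } else if (!is.character(df$allele) ||
             !all(df$allele %in% c("ref", "mut"))) {
    problems <- c(problems, "'allele' must be character: 'ref' or 'mut'")
  }
  if (!any(grepl("DNA", names(df)))) {
    problems <- c(problems, "no column name contains 'DNA'")
  }
  if (!any(grepl("RNA", names(df)))) {
    problems <- c(problems, "no column name contains 'RNA'")
  }
  problems
}

check_mpra_format(mpra_data)
```

Setting stringsAsFactors = FALSE keeps allele as character strings on older R versions, where data.frame() would otherwise convert it to a factor.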

save_nonfunctional defaults to FALSE because saving the sampler results for non-functional variants can consume a large amount of storage space.


andrewGhazi/bayesianMPRA documentation built on May 28, 2019, 4:56 p.m.