| identify_outliers | R Documentation | 
This function runs the data modeling and statistical test for the hypothesis that a transcript includes an outlier biological replicate.
Lifecycle: maturing
identify_outliers(
  .data,
  formula = ~1,
  .sample,
  .transcript,
  .abundance,
  .significance,
  .do_check,
  .scaling_factor = NULL,
  percent_false_positive_genes = 1,
  how_many_negative_controls = 500,
  approximate_posterior_inference = TRUE,
  approximate_posterior_analysis = TRUE,
  draws_after_tail = 10,
  save_generated_quantities = FALSE,
  additional_parameters_to_save = c(),
  cores = detect_cores(),
  pass_fit = FALSE,
  do_check_only_on_detrimental = length(parse_formula(formula)) > 0,
  tol_rel_obj = 0.01,
  just_discovery = FALSE,
  seed = sample(seq_len(length.out = 999999), size = 1),
  adj_prob_theshold_2 = NULL
)
.data | 
 A tibble including a transcript name column, a sample name column, a read counts column, covariate columns, a P-value column, and a significance column  | 
formula | 
 A formula. The same formula used to perform the differential transcript abundance analysis  | 
.sample | 
 A column name as symbol. The sample identifier  | 
.transcript | 
 A column name as symbol. The transcript identifier  | 
.abundance | 
 A column name as symbol. The transcript abundance (read count)  | 
.significance | 
 A column name as symbol. A column with the P-value or another significance measure (the P-value is preferred over the false discovery rate)  | 
.do_check | 
 A column name as symbol. A column with a boolean indicating whether a transcript was identified as differentially abundant  | 
.scaling_factor | 
 A scaling factor to use instead of the one calculated from the input data (TMM method). This is useful, for example, for pseudobulk single-cell data, where the scaling might depend on the sample sequencing depth across all cells rather than on a particular cell type.  | 
percent_false_positive_genes | 
 A real between 0 and 100. The target percentage of false positives among the transcripts called as containing outliers. For example, percent_false_positive_genes = 1 means that about 1 percent of the transcripts called as containing outliers actually contain none.  | 
how_many_negative_controls | 
 An integer. How many of the least significant transcripts should be used to infer the mean-overdispersion trend.  | 
approximate_posterior_inference | 
 A boolean. Whether the inference of the joint posterior distribution should be approximated with variational Bayes. This reduces execution time.  | 
approximate_posterior_analysis | 
 A boolean. Whether the calculation of the credible intervals should be done semi-analytically rather than by pure sampling from the posterior. This reduces execution time and memory use.  | 
draws_after_tail | 
 An integer. How many draws should, on average, fall beyond the tail, so as to inform the credible interval.  | 
save_generated_quantities | 
 A boolean. Used for development and testing purposes  | 
additional_parameters_to_save | 
 A character vector. Used for development and testing purposes  | 
cores | 
 An integer. How many cores to use for parallel calculations.  | 
pass_fit | 
 A boolean. Used for development and testing purposes  | 
do_check_only_on_detrimental | 
 A boolean. Whether to test only for detrimental outliers (same direction as the fold change). This tests fewer transcript/sample pairs and therefore raises the probability threshold.  | 
tol_rel_obj | 
 A real. Used for development and testing purposes  | 
just_discovery | 
 A boolean. Used for development and testing purposes  | 
seed | 
 An integer. Used for development and testing purposes  | 
adj_prob_theshold_2 | 
 A boolean. Used for development and testing purposes  | 
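The meaning of percent_false_positive_genes can be illustrated with a small calculation. This is only a sketch of the arithmetic implied by the argument description, not the package's internal computation; the number of called transcripts (n_called) is a hypothetical value chosen for illustration.

```r
# Target false-positive percentage, as passed to identify_outliers()
percent_false_positive_genes <- 1

# Hypothetical number of transcripts called as containing outliers
n_called <- 200

# Expected number of those calls that actually contain no outlier
expected_false_calls <- n_called * percent_false_positive_genes / 100
print(expected_false_calls)
```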
A nested tibble tbl with transcript-wise information: sample_wise_data, plot, ppc samples failed, tot deleterious_outliers
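As a sketch of how the returned table might be inspected, the snippet below filters for transcripts with at least one deleterious outlier. It does not call ppcseq: mock_result is a hypothetical stand-in with made-up values, and the identifier column name (symbol) is an assumption. Only the column names "ppc samples failed" and "tot deleterious_outliers" follow the return value described above.

```r
# Hypothetical stand-in for the output of identify_outliers();
# values are invented, and the nested columns are omitted for brevity.
mock_result <- data.frame(
  symbol = c("SLC16A12", "CYP1A1", "ART3"),
  `ppc samples failed` = c(1L, 0L, 2L),
  `tot deleterious_outliers` = c(1L, 0L, 0L),
  check.names = FALSE,       # keep the column names with spaces as they are
  stringsAsFactors = FALSE
)

# Keep only the transcripts with at least one deleterious outlier
has_outlier <- mock_result[["tot deleterious_outliers"]] > 0
with_outliers <- mock_result[has_outlier, ]
print(with_outliers$symbol)
```

The same bracket-based subsetting works on a tibble, so the filter could be applied to a real result in the same way, assuming the column names match.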
library(dplyr)
data("counts", package = "ppcseq")

# Run only on Linux, as in the original example
if (Sys.info()[["sysname"]] == "Linux") {
  result =
    counts %>%
    # Flag the transcripts to be checked for outliers
    dplyr::mutate(is_significant = symbol %in% c("SLC16A12", "CYP1A1", "ART3")) %>%
    ppcseq::identify_outliers(
      formula = ~ Label,
      .sample = sample,
      .transcript = symbol,
      .abundance = value,
      .significance = PValue,
      .do_check = is_significant,
      percent_false_positive_genes = 1,
      tol_rel_obj = 0.01,
      approximate_posterior_inference = TRUE,
      approximate_posterior_analysis = TRUE,
      how_many_negative_controls = 50,
      cores = 1
    )
}
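The example hard-codes which transcripts to check. In practice the .do_check flag could be derived from a significance threshold instead. The following is a minimal sketch under stated assumptions: de_table is a hypothetical differential-abundance table, its column names (symbol, PValue, FDR) are chosen for illustration, and the 0.05 cutoff is arbitrary.

```r
# Hypothetical differential-abundance table, one row per transcript
de_table <- data.frame(
  symbol = c("gene_a", "gene_b", "gene_c", "gene_d"),
  PValue = c(0.0001, 0.2, 0.003, 0.8),
  FDR    = c(0.0004, 0.4, 0.006, 0.8),
  stringsAsFactors = FALSE
)

# Logical flag that could be passed as .do_check;
# the 0.05 cutoff is an assumption for this sketch.
de_table$is_significant <- de_table$FDR < 0.05

# PValue stays in the table so it can be passed as .significance
print(de_table[de_table$is_significant, c("symbol", "PValue")])
```

Because identify_outliers() takes one row per transcript/sample pair, such a flag would need to be joined onto the long-format count table by transcript before the call.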