rna.workflow: RNA sequencing workflow

rna.workflow R Documentation

RNA sequencing workflow

Description

This function performs a complete RNA sequencing workflow, including imputation of missing values, normalization, principal component analysis, differential expression analysis, and pathway analysis. The function also provides several options for plotting, exporting plots, and creating a report.

Usage

rna.workflow(
  se,
  imp_fun = c("zero", "man", "bpca", "knn", "QRILC", "MLE", "MinDet", "MinProb", "min",
    "zero", "mixed", "nbavg", "SampMin"),
  q = 0.01,
  knn.rowmax = 0.5,
  type = c("all", "control", "manual"),
  design = "~ condition",
  size.factors = NULL,
  altHypothesis = c("greaterAbs", "lessAbs", "greater", "less"),
  control = NULL,
  contrast = NULL,
  controlGenes = NULL,
  pAdjustMethod = c("IHW", "BH"),
  alpha = 0.05,
  alpha.independent = 0.1,
  alpha_pathways = 0.1,
  lfcShrink = TRUE,
  shrink.method = c("apeglm", "ashr", "normal"),
  lfc = 2,
  heatmap.show_all = TRUE,
  heatmap.kmeans = FALSE,
  k = 6,
  heatmap.col_limit = NA,
  heatmap.show_row_names = TRUE,
  heatmap.row_font_size = 6,
  volcano.add_names = FALSE,
  volcano.label_size = 2.5,
  volcano.adjusted = TRUE,
  plot = FALSE,
  export = FALSE,
  report = TRUE,
  report.dir = NULL,
  pathway_enrichment = FALSE,
  pathway_kegg = FALSE,
  kegg_organism = NULL,
  custom_pathways = NULL,
  quiet = FALSE
)

Arguments

se

A SummarizedExperiment object, generated with read_prot().

imp_fun

(Character string) Function used for data imputation. One of "zero", "man", "bpca", "knn", "QRILC", "MLE", "MinDet", "MinProb", "min", "mixed", "nbavg", or "SampMin". See rna.impute for details.

q

(Numeric) q value (quantile) used when imputing missing values with imp_fun = "MinProb".

knn.rowmax

(Numeric) The maximum fraction of missing data allowed in any row for imp_fun = "knn". Default: 0.5 (i.e., 50%).

type

(Character string) Type of differential analysis to perform. One of "all" (contrast each condition with every other condition), "control" (contrast each condition with a defined control condition), or "manual" (contrast only the conditions specified in contrast).
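To make the difference between type = "all" and type = "control" concrete, the following base R sketch enumerates the comparisons each option implies for three conditions. The condition names and the ordering within each "A_vs_B" label are assumptions for illustration only; the labels produced by rna.workflow itself may be ordered differently.

```r
# Hypothetical condition names; replace with the conditions in your own data.
conditions <- c("ctrl", "drugA", "drugB")

# type = "all": every unordered pair of conditions is compared once.
pairs <- combn(conditions, 2)
all_contrasts <- paste(pairs[2, ], pairs[1, ], sep = "_vs_")

# type = "control": every non-control condition is compared with the control.
control <- "ctrl"
control_contrasts <- paste(setdiff(conditions, control), control, sep = "_vs_")

print(all_contrasts)
print(control_contrasts)
```

With three conditions, "all" implies three comparisons and "control" implies two, so the number of tests grows much faster under "all" as conditions are added.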

design

Formula for the design matrix.

size.factors

Optional: Manually define size factors for normalization.
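As background for what a size factor is, here is a minimal base R sketch of the median-of-ratios estimate commonly used for RNA-seq count normalization. Whether rna.workflow computes its defaults exactly this way, and the exact format it expects for size.factors, is not stated here, so treat this as an illustration rather than the package's implementation. The count matrix is made up.

```r
# Toy count matrix (genes x samples); sample s2 was sequenced twice as deeply as s1.
counts <- matrix(
  c(10,  20,
    100, 200,
    50,  100,
    8,   16),
  ncol = 2, byrow = TRUE,
  dimnames = list(paste0("gene", 1:4), c("s1", "s2"))
)

# Log geometric mean of each gene across samples (the pseudo-reference sample).
log_geo_means <- rowMeans(log(counts))
usable <- is.finite(log_geo_means)  # drops genes with a zero count in any sample

# Size factor per sample: median ratio of its counts to the pseudo-reference.
size_factors <- apply(counts, 2, function(cnts) {
  exp(median((log(cnts) - log_geo_means)[usable & cnts > 0]))
})

# Dividing each column by its size factor puts samples on a common scale.
normalized <- sweep(counts, 2, size_factors, "/")
print(size_factors)
```

In this toy case the two size factors differ by a factor of two, and the normalized columns become identical.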

altHypothesis

Specifies which genes you are interested in finding. One of "greaterAbs", "lessAbs", "greater", or "less". The test provides p values for the null hypothesis, which is the complement of the set defined by altHypothesis. For further details, see results (DESeq2 package).

control

Control condition; required if type = "control".

contrast

(String or vector of strings) Defined test(s) for differential analysis in the form "A_vs_B"; required if type = "manual".
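The "A_vs_B" naming convention can be checked before running the workflow. The sketch below, in base R, splits contrast strings into their two condition names and verifies that each has exactly two parts. The contrast names are hypothetical, and how rna.workflow parses the strings internally is not documented here.

```r
# Hypothetical contrasts in the documented "A_vs_B" form.
contrast <- c("treated_vs_control", "mutant_vs_control")

# Split each string at the literal separator "_vs_".
parts <- strsplit(contrast, "_vs_", fixed = TRUE)

# Every contrast must name exactly two conditions.
stopifnot(all(lengths(parts) == 2))

first_condition  <- vapply(parts, `[`, character(1), 1)
second_condition <- vapply(parts, `[`, character(1), 2)

print(data.frame(contrast, first_condition, second_condition))
```

A check like this catches typos such as "treated-vs-control" early; comparing the extracted names against the condition labels in the sample annotation is a natural next step.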

controlGenes

Genes to use for size factor estimation (e.g., housekeeping or spike-in genes).

pAdjustMethod

Method for adjusting p values. Available options are "IHW" (Independent Hypothesis Weighting) and "BH" (Benjamini-Hochberg).

alpha

Significance threshold for adjusted p values.
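As an illustration of how an adjusted p value is compared with alpha, the following base R sketch applies the Benjamini-Hochberg correction with stats::p.adjust to a made-up vector of raw p values. It only demonstrates the "BH" option; the "IHW" option relies on a separate weighting procedure that is not shown.

```r
# Made-up raw p values for five genes.
p <- c(0.001, 0.008, 0.039, 0.041, 0.2)
alpha <- 0.05

# Benjamini-Hochberg adjustment controls the false discovery rate.
padj <- p.adjust(p, method = "BH")

# A gene is called significant if its adjusted p value is below alpha.
significant <- padj < alpha

print(data.frame(p, padj, significant))
```

Here four raw p values are below 0.05, but only two remain below alpha after adjustment.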

alpha.independent

Adjusted p value threshold for independent filtering, or NULL. If the adjusted p-value cutoff (FDR) will be a value other than 0.1, alpha.independent should be set to that value.

alpha_pathways

Significance threshold for pathway analysis.

lfcShrink

Use shrinkage to calculate log2 fold change values.

shrink.method

Method for shrinkage. Available options are "apeglm", "ashr", "normal". See lfcShrink for details.

lfc

Relevance threshold for absolute log2(fold change) values. Used to filter unshrunken lfc values or in shrinkage method "apeglm" or "normal".

heatmap.show_all

Shall all samples be displayed in the heatmap or only the samples contained in the defined "contrast"? (only applicable for type = "manual")

heatmap.kmeans

Shall the genes be clustered in the heat map?

k

Number of gene clusters in the heat map if heatmap.kmeans = TRUE.

heatmap.col_limit

Define the breaks in the heat map legends.

heatmap.show_row_names

Show gene names in the heat map?

heatmap.row_font_size

Font size of gene names if heatmap.show_row_names = TRUE.

volcano.add_names

Display names next to symbols in volcano plot.

volcano.label_size

Size of labels in volcano plot.

volcano.adjusted

Display adjusted p-values on y axis of volcano plot?

plot

Shall plots be returned in the Plots pane?

export

Shall plots be exported as PDF and PNG files?

report

Shall a report (HTML and PDF) be created?

report.dir

Folder name for the created report (if report = TRUE).

pathway_enrichment

Perform pathway over-representation analysis for each tested contrast.

pathway_kegg

Perform pathway over-representation analysis with gene sets in the KEGG database.

kegg_organism

Name of the organism in the KEGG database (if pathway_kegg = TRUE).

custom_pathways

Data frame providing custom pathway annotations.

quiet

Suppress messages and warnings.

Value

The function returns a SummarizedExperiment object with added columns for log2 fold change, p-values and adjusted p-values for each comparison. It also includes a column for significant genes for each comparison and a column for significant genes overall. Additionally, the function generates various plots and a report (if specified).
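Examples

The following is a minimal sketch of a call that compares every condition with a control, using only arguments listed under Usage. The condition name "WT", the lfc value, and the existence of a prepared SummarizedExperiment named se are assumptions for illustration. The call is guarded so that the snippet does nothing when VisomX or se is unavailable.

```r
# Hypothetical settings; "WT" is an assumed control condition in your data.
wf_args <- list(
  imp_fun = "zero",
  type = "control",
  control = "WT",
  pAdjustMethod = "BH",
  alpha = 0.05,
  lfc = 1,
  lfcShrink = TRUE,
  shrink.method = "apeglm",
  plot = FALSE,
  export = FALSE,
  report = FALSE
)

# Run only if VisomX is installed and an object named `se` has been prepared.
if (requireNamespace("VisomX", quietly = TRUE) && exists("se")) {
  res <- do.call(VisomX::rna.workflow, c(list(se = se), wf_args))
}
```

Setting report = FALSE, plot = FALSE, and export = FALSE keeps a first run fast and free of file output; these can be switched on once the contrasts look right.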


NicWir/VisomX documentation built on Dec. 8, 2024, 1:27 a.m.