auto_rep: auto_rep
In oriolarques/GEGVIC: A workflow to analyse Gene Expression, Genetic Variations and Immune cell Composition of tumour samples using Next Generation Sequencing data.

auto_rep

R Documentation

auto_rep

Description

Executes all GEGVIC functions to generate an HTML report with the results.

Usage

auto_rep(
  ge_module = TRUE,
  gv_module = TRUE,
  ic_module = TRUE,
  out_dir = NULL,
  counts,
  genes_id,
  metadata,
  response,
  design,
  colors = c("black", "orange"),
  ref_level,
  shrink = "apeglm",
  biomart,
  fold_change = 2,
  p.adj = 0.05,
  gmt,
  gsea_pvalue = 0.2,
  gsva_gmt = "hallmark",
  method = "gsva",
  kcdf = "Gaussian",
  row.names = TRUE,
  col.names = TRUE,
  muts,
  top_genes = 10,
  specific_genes = NULL,
  compare = "wilcox.test",
  p_label = "p.format",
  gbuild,
  mut_sigs,
  tri.counts.method = "default",
  indications,
  cibersort = NULL,
  tumor = TRUE,
  rmgenes = NULL,
  scale_mrna = TRUE,
  expected_cell_types = NULL,
  points = TRUE
)

Arguments

`ge_module`	Logical value to determine whether this module is executed or not.
`gv_module`	Logical value to determine whether this module is executed or not.
`ic_module`	Logical value to determine whether this module is executed or not.
`out_dir`	Path to determine the output directory. By default it is set to the working directory.
`counts`	Data frame that contains gene expression data as raw counts.
`genes_id`	Name of the column that contains gene identifiers. Should be one of the following:'entrez_gene_id', 'ensemblgene_id' or 'hgnc_symbol'.
`metadata`	Data frame that contains supporting variables to the data.
`response`	QUOTED Name of the variable indicating the groups to analyse.
`design`	Variables in the design formula in the form of: 'Var1 + Var2 + ... Var_n'.
`colors`	Character vector indicating the colors of the different groups to compare. Default values are two: black and orange.
`ref_level`	Character vector where the first element is the column name where the reference level is located and a second element indicating the name of level to be used as a reference when calculating differential gene expression.
`shrink`	Name of the shrinkage method to apply: "apeglm", "ashr", "normal" or "none". Use none to skip shrinkage. Default value is "apeglm".
`biomart`	Data frame containing a biomaRt query with the following attributes: ensembl_gene_id, hgnc_symbol, entrezgene_id, transcript_length, refseq_mrna. In the case of mus musculus data, external_gene_name must be obtained and then change the column name for hgnc_symbol. Uploaded biomaRt queries in GEGVIC: 'ensembl_biomartGRCh37', ensembl_biomartGRCh38_p13' and 'ensembl_biomartGRCm38_p6', 'ensembl_biomartGRCm39'.
`fold_change`	An integer to define the fold change value to consider that a gene is differentially expressed.
`p.adj`	Numeric value to define the maximum adjusted p-value to consider that a gene is differentially expressed.
`gmt`	A data frame containg the gene sets to analyse using GSEA. This object should be obtained with the read.gmt function from the clusterProfiler package.
`gsea_pvalue`	Numeric value to define the adjusted pvalue cutoff during GSEA. Set to 0.2 by default.
`gsva_gmt`	Path to the gmt file that contain the gene sets of interest. By default the parameter is set to 'hallmark' which provides all HALLMARK gene sets from MSigDB (version 7.5.1).
`method`	Name of the method to perform Gene set variation analysis. The options are: 'gsva', 'ssgea' or 'zscore'. Default value is 'gsva'.
`kcdf`	Character string denoting the kernel to use during the non-parametric estimation of the cumulative distribution function of expression levels across samples when method="gsva". By default, "Gaussian" since GEGVIC transforms raw counts using the vst transformation. Other options are 'Poisson' or 'none'.
`row.names`	Logical value to determine if row-names are shown in the heatmap.
`col.names`	Logical value to determine if column-names are shown in the heatmap.
`muts`	Data frame containing genetic variations. Necessary columns must have the following names: - Hugo_Symbol: Gene symbol from HGNC. - Chromosome: Affected chromosome. - Start_Position: Mutation start coordinate. - End_Position: Mutation end coordinate. - Reference_Allele: The plus strand reference allele at this position. Includes the deleted sequence for a deletion or "-" for an insertion. - Tumor_Seq_Allele2: Tumor sequencing discovery allele. - Variant_Classification: Translational effect of variant allele. Can be one of the following: Frame_Shift_Del, Frame_Shift_Ins, In_Frame_Del, In_Frame_Ins, Missense_Mutation, Nonsense_Mutation, Silent, Splice_Site, Translation_Start_Site, Nonstop_Mutation, RNA, Targeted_Region. - Variant_Type: Type of mutation. Can be: 'SNP' (Single nucleotide polymorphism), 'DNP' (Double nucleotide polymorphism), 'INS' (Insertion), 'DEL' (Deletion). - Tumor_Sample_Barcode: Sample name.
`top_genes`	Number of genes to be analysed in the mutational summary.
`specific_genes`	Genes that will be plotted in the oncoplot.
`compare`	A character string indicating which method to be used for comparing means. Options are 't.test' and 'wilcox.test' for two groups or 'anova' and 'kruskal.test' for more groups. Default value is NULL.
`p_label`	Character string specifying label type. Allowed values include 'p.signif' (shows the significance levels), 'p.format' (shows the formatted p-value).
`gbuild`	Version of the genome to work with. It can be one of the following: - ‘BSgenome.Hsapiens.UCSC.hg19’ - ‘BSgenome.Hsapiens.UCSC.hg38’ - ‘BSgenome.Mmusculus.UCSC.mm10’ - ‘BSgenome.Mmusculus.UCSC.mm39’
`mut_sigs`	Mutational signature matrices containing the frequencies of all nucleotide changes per signature need to be indicated. GEGVIC contains the matrices from COSMIC for single and double base substitutions. To choose one, the user has to indicate ’COSMIC_vXX_YYBS_GRChZZ’ in the mut_sigs argument. The XX is the version, that can be v2 or v3.2. YY indicates if mutations are single (S) or double (D) base substitutions, while the ZZ is for the genome assembly, either GRCh37 or GRCh38 for human data and mm9 or mm10 for mouse data.
`tri.counts.method`	Normalization method. Needs to be set to either: - 'default' – no further normalization. - 'exome' – normalized by number of times each trinucleotide context is observed in the exome. - 'genome' – normalized by number of times each trinucleotide context is observed in the genome. - 'exome2genome' – multiplied by a ratio of that trinucleotide's occurence in the genome to the trinucleotide's occurence in the exome. - 'genome2exome' – multiplied by a ratio of that trinucleotide's occurence in the exome to the trinucleotide's occurence in the genome. - data frame containing user defined scaling factor – count data for each trinucleotide context is multiplied by the corresponding value given in the data frame.
`indications`	Character vector of cancer type codes for each sample in the tpm matrix.This is used by TIMER method. Indications supported can be checked using immunedeconv::timer_available_cancers. Default value is NULL.
`cibersort`	Path to the CIBERSORT.R and LM22.txt files. Default value is NULL.
`tumor`	Logical value to define if samples are tumors. If so EPIC and quanTIseq use a signature matrix/procedure optimized for tumor samples. Default value is TRUE.
`rmgenes`	A character vector of gene symbols. Exclude these genes from the analysis. Use this to exclude e.g. noisy genes.
`scale_mrna`	Logical. If FALSE, disable correction for mRNA content of different cell types. This is supported by methods that compute an absolute score (EPIC and quanTIseq). Default value is TRUE.
`expected_cell_types`	Limit the analysis to the cell types given in this list. If the cell types present in the sample are known a priori, setting this can improve results for xCell (see https://github.com/grst/immunedeconv/issues/1).
`points`	Logical value to decide if points are added to the plot.

Value

Returns an HTML report.

Examples

auto_rep(ge_module = TRUE,
         gv_module = TRUE,
         ic_module = TRUE,
         out_dir = NULL,
         counts = sample_counts,
         genes_id = 'ensembl_gene_id',
         metadata = sample_metadata,
         response = 'MSI_status',
         design = 'MSI_status',
         colors = c('orange', 'black'),
         ref_level = c('MSI_status', 'MSS'),
         shrink = 'apeglm',
         biomart = ensembl_biomart_GRCh38_p13,
         fold_change = 2,
         p.adj = 0.05,
         gmt = 'inst/extdata/c2.cp.reactome.v7.5.1.symbols.gmt',
         gsea_pvalue = 0.2,
         gsva_gmt = 'hallmark',
         method = 'gsva',
         kcdf = 'Gaussian',
         row.names = TRUE,
         col.names = TRUE,
         muts = sample_mutations,
         top_genes = 10,
         specific_genes = NULL,
         compare = 'wilcox.test',
         p_label = 'p.format',
         gbuild = 'BSgenome.Hsapiens.UCSC.hg38',
         mut_sigs = 'COSMIC_v2_SBS_GRCh38',
         tri.counts.method = 'default',
         indications = rep('COAD', ncol(sample_counts[-1])),
         cibersort = NULL,
         tumor = TRUE,
         rmgenes = NULL,
         scale_mrna = TRUE,
         expected_cell_types = NULL,
         points = TRUE)

oriolarques/GEGVIC documentation built on Oct. 30, 2024, 10:44 p.m.

oriolarques/GEGVIC index

README.md

rdrr.io home R language documentation Run R code online

CRAN packages Bioconductor packages R-Forge packages GitHub packages

Note that we can't provide technical support on individual packages. You should contact the package authors for that.

oriolarques/GEGVIC
A workflow to analyse Gene Expression, Genetic Variations and Immune cell Composition of tumour samples using Next Generation Sequencing data.

auto_rep: auto_rep
In oriolarques/GEGVIC: A workflow to analyse Gene Expression, Genetic Variations and Immune cell Composition of tumour samples using Next Generation Sequencing data.

auto_rep

Description

Usage

Arguments

Value

Examples

Related to auto_rep in oriolarques/GEGVIC...

R Package Documentation

Browse R Packages

We want your feedback!

oriolarques/GEGVIC A workflow to analyse Gene Expression, Genetic Variations and Immune cell Composition of tumour samples using Next Generation Sequencing data.

auto_rep: auto_rep In oriolarques/GEGVIC: A workflow to analyse Gene Expression, Genetic Variations and Immune cell Composition of tumour samples using Next Generation Sequencing data.

auto_rep

Description

Usage

Arguments

Value

Examples

Related to auto_rep in oriolarques/GEGVIC...

R Package Documentation

Browse R Packages

We want your feedback!

oriolarques/GEGVIC
A workflow to analyse Gene Expression, Genetic Variations and Immune cell Composition of tumour samples using Next Generation Sequencing data.

auto_rep: auto_rep
In oriolarques/GEGVIC: A workflow to analyse Gene Expression, Genetic Variations and Immune cell Composition of tumour samples using Next Generation Sequencing data.