write_sig_genes: Output text files with lists of significant genes

Description

This function outputs text files with the results from a differential expression analysis. It provides options to output based on FDR and logFC thresholds, positive and negative logFC, and ranked lists.

Usage

write_sig_genes(
  topGenes, file_prefix,
  method=c("ranked_list", "combined", "directional"),
  adj_p_cut=0.01, fc_cut=log2(1.5), fc_adj_factor=1,
  p_col="P.Value", adj_p_col="adj.P.Val", fc_col="logFC",
  threshold_col=NULL,
  input_type, output_type=input_type, use_annotables=TRUE)

Arguments

topGenes

a data frame, typically the output of a call to topTable. Must contain genes, log2 fold-change, and adjusted p-values.

file_prefix

name of the destination for files. Details of each output will be appended to this prefix.

method

character, specifying the type of gene lists to output. "ranked_list" outputs a list of all genes in ranked order by p-value (from smallest to largest). "combined" outputs a list of all significant genes meeting the threshold. "directional" outputs lists of significant genes meeting the threshold that are up- and down-regulated. Partial matches are allowed.

adj_p_cut

numeric, the cutoff for adjusted p-value. Genes with adjusted p-values greater than or equal to this value are not included in the result. Defaults to 0.01.

fc_cut

numeric, the absolute value cutoff for log2 fold change. Genes with absolute value log2-FC less than or equal to this value are not included in the result. Defaults to log2(1.5). To include all genes, set to 0. To ignore logFC, set to NULL.

fc_adj_factor

numeric, the adjustment factor used for log2-fold-change values with a numeric predictor. This is included so that the output file names can include the fold change prior to scaling. Defaults to 1, which is the appropriate value for categorical comparisons.

p_col

name or number of the column in topGenes on which to sort the ranked list. This is generally the raw p-values, because adjusted p-values are often identical across a range of raw p-values. Defaults to "P.Value", which corresponds to the output from topTable. To include all genes regardless of adjusted p-value, set adj_p_cut to a value greater than 1.

adj_p_col

name or number of the column in topGenes containing the adjusted p-values to compare to adj_p_cut. Defaults to "adj.P.Val", which corresponds to the output from topTable.

fc_col

name or number of the column in topGenes containing the fold-change values to compare to fc_cut. Defaults to "logFC", which corresponds to the output from topTable.

threshold_col

name or number of the column in topGenes containing logical values indicating which genes meet thresholds. This is an alternate way to determine significance of genes. If specified, adj_p_cut and fc_cut are ignored.

input_type

the input gene identifier class. Must match a variable type in annotables or biomaRt. Defaults to "symbol" with annotables or "hgnc_symbol" with biomaRt.

output_type

the output gene identifier class. Must match a variable type in annotables or biomaRt. Defaults to input_type.

use_annotables

logical, whether to use the annotables package to convert output_type, if necessary. Only used if output_type is specified. If annotables is not installed, the function defaults to using biomaRt.
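
The threshold arguments above can be illustrated with a minimal base R sketch. It does not call write_sig_genes; the toy data frame and the helper name meets_thresholds are hypothetical, and the comparisons simply follow the argument descriptions (adjusted p-value strictly below adj_p_cut, absolute log2 fold change strictly above fc_cut, and fc_cut=NULL ignoring fold change). The package's internal implementation may differ.

```r
# Toy stand-in for topTable output (values are made up for illustration).
topGenes <- data.frame(
  gene      = c("A", "B", "C", "D"),
  logFC     = c(1.2, -0.9, 0.3, -2.0),
  P.Value   = c(0.0001, 0.0004, 0.0010, 0.0200),
  adj.P.Val = c(0.001, 0.002, 0.004, 0.050),
  stringsAsFactors = FALSE
)

# Hypothetical helper mirroring the documented cutoffs; not part of limmaTools.
meets_thresholds <- function(x, adj_p_cut = 0.01, fc_cut = log2(1.5),
                             adj_p_col = "adj.P.Val", fc_col = "logFC") {
  # Genes with adjusted p-values >= adj_p_cut are excluded.
  keep <- x[[adj_p_col]] < adj_p_cut
  # Genes with |logFC| <= fc_cut are excluded; NULL skips the logFC filter.
  if (!is.null(fc_cut)) {
    keep <- keep & abs(x[[fc_col]]) > fc_cut
  }
  keep
}

# Default cutoffs: gene C passes on adjusted p-value but fails on fold change.
sig_genes <- topGenes$gene[meets_thresholds(topGenes)]

# Ignoring logFC keeps every gene that passes the adjusted p-value cutoff.
sig_genes_p_only <- topGenes$gene[meets_thresholds(topGenes, fc_cut = NULL)]
```

With these toy values, sig_genes holds genes A and B, while sig_genes_p_only also includes gene C.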

Details

This function writes out lists of genes to text files. By default, it outputs a list of all genes ranked by p-value, as well as lists of genes that are significant based on FDR and logFC thresholds (all, up-regulated, and down-regulated).
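
As a rough illustration of the three kinds of lists, the following base R sketch builds a ranked list, a combined list, and directional lists from a toy data frame and writes each as one gene per line. It is a sketch under stated assumptions, not the function's implementation: the data values are made up, and the file name suffixes and one-gene-per-line layout are hypothetical placeholders, since the text appended to file_prefix and the exact file format are not described here.

```r
# Toy stand-in for topTable output, deliberately not sorted by p-value.
topGenes <- data.frame(
  gene      = c("D", "A", "C", "B"),
  logFC     = c(-2.0, 1.2, 0.3, -0.9),
  P.Value   = c(0.0200, 0.0001, 0.0010, 0.0004),
  adj.P.Val = c(0.050, 0.001, 0.004, 0.002),
  stringsAsFactors = FALSE
)

# "ranked_list": all genes ordered by raw p-value, smallest first.
ranked <- topGenes$gene[order(topGenes$P.Value)]

# Significance at the documented default cutoffs.
sig <- topGenes$adj.P.Val < 0.01 & abs(topGenes$logFC) > log2(1.5)

# "combined": all significant genes, regardless of direction.
combined <- topGenes$gene[sig]

# "directional": significant genes split by the sign of logFC.
up   <- topGenes$gene[sig & topGenes$logFC > 0]
down <- topGenes$gene[sig & topGenes$logFC < 0]

# Write one gene per line. The suffixes are hypothetical placeholders,
# not the names that write_sig_genes appends to file_prefix.
file_prefix <- file.path(tempdir(), "example")
out_files <- c(
  ranked   = paste0(file_prefix, "_ranked.txt"),
  combined = paste0(file_prefix, "_combined.txt"),
  up       = paste0(file_prefix, "_up.txt"),
  down     = paste0(file_prefix, "_down.txt")
)
writeLines(ranked,   out_files[["ranked"]])
writeLines(combined, out_files[["combined"]])
writeLines(up,       out_files[["up"]])
writeLines(down,     out_files[["down"]])
```

Sorting on the raw p-value column rather than the adjusted one avoids ties in the ranked list when several genes share the same adjusted p-value.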


BenaroyaResearch/limmaTools documentation built on Dec. 17, 2021, 10:49 a.m.