gen_results: Generate results


View source: R/gen_results.R

Description

Generates EWCE results for multiple gene lists in parallel by calling ewce_para. The analysis can be stopped and resumed later: the function checks the results output directory for gene lists that have already finished and removes them from the input. It also excludes gene lists with fewer than 4 unique genes, which cause errors in EWCE analysis.
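The filtering and resume behaviour described above can be sketched in base R. This is an illustration only, not the package's implementation: the helper pending_lists is hypothetical, and it assumes each finished gene list is saved as "<list name>.rds" in the results directory, which this page does not confirm.

```r
# Hypothetical helper: which gene lists still need to be analysed?
# Assumption: finished lists are stored as "<list name>.rds" in results_dir.
pending_lists <- function(gene_data, list_names, results_dir,
                          list_name_column = "Phenotype",
                          gene_column = "Gene",
                          min_genes = 4) {
  # Count the unique genes in each requested list.
  n_genes <- vapply(list_names, function(nm) {
    genes <- gene_data[[gene_column]][gene_data[[list_name_column]] == nm]
    length(unique(genes))
  }, integer(1))

  # Keep only lists that are large enough for the analysis.
  keep <- list_names[n_genes >= min_genes]

  # Drop lists whose (assumed) result file already exists.
  finished <- tools::file_path_sans_ext(
    list.files(results_dir, pattern = "\\.rds$")
  )
  setdiff(keep, finished)
}
```

Calling pending_lists before each run would return only the lists that are both large enough and not yet present in the results directory, so an interrupted run picks up where it stopped.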

Usage

gen_results(
  ctd,
  gene_data,
  list_names,
  background_genes,
  list_name_column = "Phenotype",
  gene_column = "Gene",
  results_dir = "results",
  overwrite_past_analysis = FALSE,
  reps = 10,
  annotLevel = 1,
  genelistSpecies = "human",
  sctSpecies = "human",
  cores = 1,
  MergeResults = FALSE
)

Arguments

ctd

The Cell type data file for EWCE analysis (see EWCE docs)

gene_data

The dataframe containing gene list names and associated genes (see docs for get_gene_list for more info).

list_names

The names of the gene lists to analyse (e.g. "Abnormality of nervous system" may be the name of a phenotype-associated gene list)

background_genes

A character vector of background genes (see EWCE docs)

list_name_column

The name of the column in gene_data that contains the gene list names (e.g. the column may be called "Phenotype" when dealing with phenotype-associated gene lists)

gene_column

The name of the column containing genes in the gene_data dataframe. Typically this column is called "Gene"

results_dir

The directory in which to save results (e.g. "results")

overwrite_past_analysis

Whether to overwrite previous results in the results directory (bool)

reps

The number of bootstrap reps for EWCE (see EWCE docs) (int)

annotLevel

The level of cell specificity to select from the CTD (see EWCE docs) (int)

genelistSpecies

The species ("human"/"mouse") of the gene lists (string)

sctSpecies

The species ("human"/"mouse") of the CTD data

cores

The number of cores to run in parallel (int)

MergeResults

Whether to save the merged results as a single data.frame in an .rds file. Note: the function will return the merged data frame even if this is FALSE (bool)

Details

The gene_data should be a data frame that contains a column of gene list names (e.g. the column may be called "Phenotype"), and a column of genes (e.g. "Gene"). For example:

Phenotype          Gene
"Abnormal heart"   gene X
"Abnormal heart"   gene Y
"Poor vision"      gene Z
"Poor vision"      gene Y
etc...

For more information, see the docs for get_gene_list (?get_gene_list).
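As a minimal sketch of this layout, the toy data frame below mirrors the example table. The gene names are the placeholders from the table rather than real gene symbols, and heart_genes is a hypothetical variable name used only for illustration.

```r
# Toy gene_data in the long format described above (placeholder gene names).
gene_data <- data.frame(
  Phenotype = c("Abnormal heart", "Abnormal heart", "Poor vision", "Poor vision"),
  Gene = c("gene X", "gene Y", "gene Z", "gene Y"),
  stringsAsFactors = FALSE
)

# Gene list names and background genes, derived the same way as in Examples.
list_names <- unique(gene_data$Phenotype)
background_genes <- unique(gene_data$Gene)

# Genes belonging to a single list.
heart_genes <- unique(gene_data$Gene[gene_data$Phenotype == "Abnormal heart"])
```

Each row pairs one gene with one list name, so a gene that belongs to several lists (here "gene Y") simply appears on several rows.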

If MergeResults == TRUE, the function will return a data frame of all results. No multiple testing correction is applied to these results, so it is recommended to apply one afterwards, for example: all_results$q <- stats::p.adjust(all_results$p, method = "BH")
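The correction step can be sketched on a toy table. The p-values below are made up for illustration and are not real EWCE output; only the p column name comes from this page, and the list_name column and the 0.05 threshold are assumptions.

```r
# Toy stand-in for the merged results (illustrative values, not real output).
all_results <- data.frame(
  list_name = c("Abnormal heart", "Poor vision", "Seizure", "Short stature"),
  p = c(0.01, 0.04, 0.03, 0.50),
  stringsAsFactors = FALSE
)

# Benjamini-Hochberg correction across every test in the merged table.
all_results$q <- stats::p.adjust(all_results$p, method = "BH")

# Keep the rows that pass an (assumed) 5% false discovery rate threshold.
significant <- all_results[all_results$q < 0.05, ]
```

Correcting once over the merged table, rather than separately per gene list, accounts for every test that was run.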

Value

If MergeResults is TRUE, all results are returned as a data frame. If FALSE, nothing is returned, but the individual results are still saved in the results directory.

Examples

gene_data <- HPOExplorer::load_phenotype_to_genes("phenotype_to_genes.txt")
ctd <- load_example_CTD()
list_names <- unique(gene_data$Phenotype)[1:20]
background_genes <- unique(gene_data$Gene)
list_name_column <- "Phenotype"
gene_column <- "Gene"
results_dir <- "results"
overwrite_past_analysis <- FALSE
MergeResults <- TRUE
reps <- 10
annotLevel <- 1
genelistSpecies <- "human"
sctSpecies <- "human"
cores <- 1

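# Pass arguments by name so the call does not depend on parameter order.
all_results <- MultiEWCE::gen_results(
  ctd = ctd, gene_data = gene_data, list_names = list_names,
  background_genes = background_genes,
  list_name_column = list_name_column, gene_column = gene_column,
  results_dir = results_dir,
  overwrite_past_analysis = overwrite_past_analysis,
  reps = reps, annotLevel = annotLevel,
  genelistSpecies = genelistSpecies, sctSpecies = sctSpecies,
  cores = cores, MergeResults = MergeResults
)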

ovrhuman/MultiEWCE documentation built on Dec. 22, 2021, 5:21 a.m.