README.md

gsLRT

Gene set enrichment from likelihood ratio statistics

Running gsLRT on small scale genomics data

To run gsLRT on your own genomics data, open the R console or an RStudio window and follow the steps below. To make the code included here compatible with your data with only minor modifications you will need the following:

With these objects in hand, here is an example gsLRT run. In this example, we are specifically testing for whether an interaction between the phenotype Z and sex is significantly associated with any gene sets.

First, read in the data:

require(magrittr)
require(plyr)
require(tidyverse)

## Read R objects
G <- readRDS("results/objects/G.rds")
Z <- readRDS("results/objects/Z.rds")
X <- readRDS("results/objects/X.rds")
gene_sets <- readRDS("results/objects/gene_sets.rds")

Next, run the permutation-based version of gsLRT with

ES_df <- gsLRT::gsLRT(Z, X, G, interaction_term = "sex", gene_sets = gene_sets, n_perm = 2000)

This gives you a data.frame of gene sets and their permutation p-values. While the permutation-based test is recommended, depending on the size of your data, it may be prohibitively slow. In that case, a less conservative chi-squared test may be used:

Lambdas <- plyr::aaply(1:ncol(G), 1, function(j){

  gsLRT::lr_stats(y = G[, j], X = X, z = Z, interaction_term = "sex",
                  stat = "bartlett")

}, .progress = "text") %>%
  t() %>%
  magrittr::set_colnames(colnames(G))
ES <- gsLRT::enrich_score(Lambdas, gene_sets, trim = T, 
                           min_size = 5, stat = "chisq")
ES_df <- gsLRT::chisq_test(ES, gene_sets)

## Plot a figure
gsLRT::plot_results(ES_df, Lambdas, gene_sets)


j-g-b/gsLRT documentation built on March 20, 2021, 8:14 p.m.