MIGSAmGSZ: MIGSAmGSZ
In jcrodriguez1989/MIGSA: Massive and Integrative Gene Set Analysis


MIGSAmGSZ is an optimized version of mGSZ. It runs much faster than the original mGSZ and can run on multiple cores. It can also analyze RNA-seq data by using the voom function. mGSZ: gene set analysis based on the Gene Set Z-scoring function and asymptotic p-values.

MIGSAmGSZ(x, y, l, ...)

## S4 method for signature 'matrix,list,vector'
MIGSAmGSZ(
  x,
  y,
  l,
  use.voom = FALSE,
  rankFunction = NA,
  min.sz = 5,
  pv = 0,
  w1 = 0.2,
  w2 = 0.5,
  vc = 10,
  p = 200
)

`x`	gene expression data matrix (rows as genes and columns as samples).
`y`	gene set data (list).
`l`	vector of response values (example: c("Cond1", "Cond1", "Cond2", "Cond2", "Cond2")).
`...`	not in use.
`use.voom`	logical indicating whether to use voom (for RNA-seq data we recommend use.voom=TRUE).
`rankFunction`	internal use.
`min.sz`	minimum size of gene sets (number of genes in a gene set) to be included in the analysis.
`pv`	estimate of the variance associated with each observation.
`w1`	weight 1, parameter used to calculate the prior variance obtained with class size `vc`. This especially penalizes small classes and small subsets. Values between 0.1 and 0.5 are expected to be reasonable.
`w2`	weight 2, parameter used to calculate the prior variance obtained with the same class size as that of the analyzed class. This penalizes small subsets from the gene list. Values between 0.3 and 0.5 are expected to be reasonable.
`vc`	size of the reference class used with `w1`.
`p`	number of permutations for p-value calculation.
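
To make the `min.sz` argument concrete, the sketch below filters a list of gene sets by size. It assumes that a gene set's size is counted after restricting it to the genes present as row names of `x`; the package's internal filtering may differ. `filterGeneSets` is a hypothetical helper written for this illustration and is not part of MIGSA.

```r
# Hypothetical helper (not part of MIGSA): keep only the gene sets that still
# have at least `min.sz` genes once restricted to the genes present in `x`.
# Assumption: size is counted after intersecting with rownames(x).
filterGeneSets <- function(x, y, min.sz = 5) {
  # Restrict each gene set to genes that appear as row names of x.
  present <- lapply(y, function(gs) intersect(gs, rownames(x)))
  # Effective size of each gene set after the restriction.
  sizes <- vapply(present, length, integer(1))
  present[sizes >= min.sz]
}

# Toy data: 20 genes (g1 ... g20), 4 samples.
x <- matrix(0, nrow = 20, ncol = 4,
  dimnames = list(paste0("g", 1:20), NULL))
y <- list(
  big = paste0("g", 1:8),      # 8 genes, all measured
  small = paste0("g", 1:3),    # only 3 genes
  partial = paste0("g", 17:24) # 8 genes, but only 4 are measured
)
kept <- filterGeneSets(x, y, min.sz = 5)
names(kept)
```

With the default `min.sz = 5`, only the `big` set survives in this toy input: `small` has 3 genes and `partial` has just 4 genes that appear in `x`.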

A data.frame with gene set p-values and additional information.

nGenes <- 1000
# 1000 genes
nSamples <- 30
# 30 subjects
geneNames <- paste("g", 1:nGenes, sep = "")
# with names g1 ... g1000
## Create random gene expression data matrix.
set.seed(8818)
exprData <- matrix(rnorm(nGenes * nSamples), ncol = nSamples)
rownames(exprData) <- geneNames
## There will be 40 differentially expressed genes.
nDeGenes <- nGenes / 25
## Generate the offsets to add to the differentially expressed genes.
deOffsets <- matrix(2 * abs(rnorm(nDeGenes * nSamples / 2)), ncol = nSamples / 2)
## Randomly select which are the DE genes.
deIndexes <- sample(1:nGenes, nDeGenes, replace = FALSE)
exprData[deIndexes, 1:(nSamples / 2)] <-
  exprData[deIndexes, 1:(nSamples / 2)] + deOffsets
## 15 subjects with condition C1 and 15 with C2.
conditions <- rep(c("C1", "C2"), c(nSamples / 2, nSamples / 2))

nGSets <- 200
# 200 gene sets
## Randomly create 200 gene sets of 10 genes each.
gSets <- lapply(1:nGSets, function(i) sample(geneNames, size = 10))
names(gSets) <- paste("set", as.character(1:nGSets), sep = "")
## Not run: 
mGSZres <- MIGSAmGSZ(exprData, gSets, conditions)

## End(Not run)
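
For RNA-seq data, the argument list recommends `use.voom = TRUE`. The following is a minimal sketch of such a call on simulated counts. The negative binomial parameters (`mu = 100`, `size = 5`) and the object names `countData` and `voomRes` are assumptions chosen only for illustration. The call itself is attempted only in an interactive session with MIGSA installed, since it may take a while.

```r
set.seed(8818)
nGenes <- 1000
nSamples <- 30
geneNames <- paste0("g", seq_len(nGenes))

## Simulated read counts (negative binomial); illustrative values only.
countData <- matrix(
  rnbinom(nGenes * nSamples, mu = 100, size = 5),
  ncol = nSamples,
  dimnames = list(geneNames, NULL)
)
## 15 subjects with condition C1 and 15 with C2.
conditions <- rep(c("C1", "C2"), each = nSamples / 2)

## 200 random gene sets of 10 genes each.
gSets <- lapply(seq_len(200), function(i) sample(geneNames, size = 10))
names(gSets) <- paste0("set", seq_len(200))

## Only attempted interactively and when MIGSA is available.
if (interactive() && requireNamespace("MIGSA", quietly = TRUE)) {
  voomRes <- MIGSA::MIGSAmGSZ(countData, gSets, conditions, use.voom = TRUE)
  head(voomRes)
}
```

Because the counts here contain no simulated differential expression, the resulting p-values would not be expected to highlight any particular gene set.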