This is a general purpose gene set analysis method that downplays the importance of genes that appear often across the sets of genes analyzed. The package also provides a benchmark for gene set analysis, in terms of sensitivity and ranking, using 24 public datasets.
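The down-weighting idea can be illustrated with a small base R sketch. The gene and set identifiers below are hypothetical, and the weight formula is only an illustration of the principle (weights from 1 for the most frequent gene to 2 for the least frequent); it should not be read as the package's internal implementation. See the cited paper for the exact definition.

```r
# Toy gene sets with hypothetical gene identifiers
gslist <- list(
  setA = c("g1", "g2", "g3"),
  setB = c("g1", "g4"),
  setC = c("g1", "g2", "g5")
)

# Count in how many gene sets each gene appears
genes <- unique(unlist(gslist))
gf <- sapply(genes, function(g) {
  sum(vapply(gslist, function(s) g %in% s, logical(1)))
})

# Illustrative weight (an assumption, not taken from the package source):
# genes found in many sets get a weight near 1, set-specific genes near 2
w <- 1 + sqrt((max(gf) - gf) / (max(gf) - min(gf)))
print(round(w, 2))
```

Here g1 belongs to all three sets and receives the lowest weight, while g3, g4 and g5 each belong to a single set and receive the highest.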
esetm |
A matrix containing log transformed and normalized gene expression data. Rows correspond to genes and columns to samples. |
group |
A character vector with the class labels of the samples. It can only contain "c" for control samples or "d" for disease samples. |
paired |
A logical value to indicate if the samples in the two groups are paired. |
block |
A character vector indicating the block ids of the samples classified by the group variable, if |
gslist |
Either the value "KEGGRESTpathway" or a list with the gene sets. If set to "KEGGRESTpathway", then gene sets will be made of all KEGG pathways for the |
annotation |
A valid chip annotation package if the rownames of |
organism |
A three letter string giving the name of the organism supported by the "KEGGREST" package. |
gs.names |
Character vector with the names of the gene sets. If specified, must have the same length as gslist. |
NI |
Number of iterations to determine the gene set score significance p-values. |
plots |
If set to TRUE then the distribution of the PADOG scores with and without weighting the genes in raw and standardized form are shown using boxplots.
A pdf file will be created in the current directory having the name provided in the |
targetgs |
The identifier of a target gene set for which the scores will be highlighted in the plots produced if |
Nmin |
The minimum size of gene sets to be included in the analysis. |
verbose |
If set to TRUE, displays the number of iterations elapsed. |
parallel |
If set to TRUE, the |
dseed |
Optional initial seed for random number generator (integer). |
ncr |
The number of CPU cores used when |
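As a rough illustration of the input shapes described above, the following base R sketch builds a toy expression matrix, a group vector, and a custom gene set list, then checks them against the argument descriptions. All data and identifiers are simulated and hypothetical, and the checks mirror the documentation text only; they are not the package's own validation code.

```r
set.seed(1)
n_genes <- 50
n_samples <- 8

# esetm: genes in rows, samples in columns (simulated values)
esetm <- matrix(
  rnorm(n_genes * n_samples),
  nrow = n_genes,
  dimnames = list(paste0("gene", seq_len(n_genes)),
                  paste0("s", seq_len(n_samples)))
)

# group: "c" for control samples, "d" for disease samples
group <- rep(c("c", "d"), each = n_samples / 2)

# gslist: a custom list of gene sets, with matching gs.names
gslist <- list(
  gs1 = paste0("gene", 1:10),
  gs2 = paste0("gene", 5:20),
  gs3 = paste0("gene", 15:40)
)
gs.names <- c("Toy set 1", "Toy set 2", "Toy set 3")

# Checks derived from the argument descriptions
stopifnot(
  is.matrix(esetm),
  ncol(esetm) == length(group),
  all(group %in% c("c", "d")),
  length(gs.names) == length(gslist)
)

# Gene sets with at least Nmin genes present in the expression matrix
Nmin <- 3
keep <- vapply(gslist, function(s) sum(s %in% rownames(esetm)) >= Nmin,
               logical(1))
print(keep)
```

Objects prepared this way correspond to the esetm, group, gslist and gs.names arguments; with a custom gslist, the row names of esetm need to use the same identifiers as the gene sets.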
See cited documents for more details.
A data frame containing the ranked pathways and various statistics:
Name is the name of the gene set;
ID is the gene set identifier;
Size is the number of genes in the gene set;
meanAbsT0 is the mean of absolute t-scores;
padog0 is the mean of weighted absolute t-scores;
PmeanAbsT is the significance of the meanAbsT0 score;
Ppadog is the significance of the padog0 score.
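A result with these columns can be explored with ordinary data frame operations. The sketch below uses a small mock data frame whose values are made up purely for illustration (they are not output from padog), and the 0.05 cutoff is an arbitrary choice, not a package default.

```r
# Mock result with the documented columns; all values are invented
res <- data.frame(
  Name      = c("Toy pathway A", "Toy pathway B", "Toy pathway C"),
  ID        = c("00001", "00002", "00003"),
  Size      = c(20, 35, 12),
  meanAbsT0 = c(1.10, 1.45, 0.80),
  padog0    = c(1.60, 2.10, 0.90),
  PmeanAbsT = c(0.20, 0.01, 0.50),
  Ppadog    = c(0.04, 0.02, 0.60),
  stringsAsFactors = FALSE
)

# Rank gene sets by Ppadog, breaking ties by a larger padog0 score
res <- res[order(res$Ppadog, -res$padog0), ]

# Gene sets below an arbitrary significance cutoff
sig <- res[res$Ppadog < 0.05, c("ID", "Name", "Ppadog")]
print(sig)

# Rank of a gene set of interest (hypothetical target identifier)
target_rank <- match("00001", res$ID)
print(target_rank)
```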
Adi Laurentiu Tarca <atarca@med.wayne.edu>
Adi L. Tarca, Sorin Draghici, Gaurav Bhatti, Roberto Romero, Down-weighting overlapping genes improves gene set analysis, BMC Bioinformatics, 2012, submitted.
# Run padog on GSE9348, a colorectal cancer dataset from the 24-dataset benchmark.
# NI=25 is used below to keep the example fast; use NI=1000 for accurate results.
library(Biobase)  # provides experimentData(), exprs(), pData(), notes()
library(PADOG)
set="GSE9348"
data(list=set,package="KEGGdzPathwaysGEO")
x=get(set)
#Extract from the dataset the required info
exp=experimentData(x);
dataset= exp@name
dat.m=exprs(x)
ano=pData(x)
design= notes(exp)$design
annotation= paste(x@annotation,".db",sep="")
targetGeneSets= notes(exp)$targetGeneSets
myr=padog(
esetm=dat.m,
group=ano$Group,
paired=design=="Paired",
block=ano$Block,
targetgs=targetGeneSets,
annotation=annotation,
gslist="KEGGRESTpathway",
organism="hsa",
verbose=TRUE,
Nmin=3,
NI=25,
plots=FALSE,
dseed=1)
myr2=padog(
esetm=dat.m,
group=ano$Group,
paired=design=="Paired",
block=ano$Block,
targetgs=targetGeneSets,
annotation=annotation,
gslist="KEGGRESTpathway",
organism="hsa",
verbose=TRUE,
Nmin=3,
NI=25,
plots=FALSE,
dseed=1,
parallel=TRUE,
ncr=2)
myr[1:20,]
all.equal(myr, myr2)