agdex: Agreement of Differential Expression Analysis

Description

This function performs agreement of differential expression (AGDEX) analysis across a pair of two-group experiments. AGDEX measures the similarity between the results of two experiments that each measure differential expression across two groups, and determines the statistical significance of that similarity. A metric of agreement is defined to measure the similarity, and its significance is determined by permuting the group labels. See the methodology paper [1] (Pounds et al. 2011) for details.

Usage

agdex(dex.setA, dex.setB, map.data, min.nperms = 100, max.nperms = 10000)

Arguments

dex.setA

A list object with 4 components that defines a two-group comparison "A", for example "human tumor-human control". The components are express.set, comp.def, comp.variable, and gset.collection (optional). The express.set component is a Bioconductor ExpressionSet object with a matrix of expression data in exprs and the phenotype data in pData. The comp.variable component gives the name or numeric index of the column of pData that holds the group labels. The comp.def component is a string of the form "tumor-control" that defines a comparison of expression between samples labeled "tumor" and samples labeled "control". The optional gset.collection component is an object of class GeneSetCollection. See Details.

dex.setB

A list object that defines the other two-group comparison. It has the same structure as dex.setA.

map.data

A list object with 3 components that defines how probe-sets from dex.setA are matched with probe-sets from dex.setB. The probe.map component is a data.frame in which each row defines one match between a probe-set in comparison "A" and a probe-set in comparison "B". The components map.Aprobe.col and map.Bprobe.col give the names or numeric indices of the columns of probe.map that contain the probe-set identifiers for dex.setA and dex.setB, respectively.

min.nperms

Minimum number of permutations for adaptive permutation testing of gene-set level results; the default is 100. Adaptive permutation testing permutes the data until min.nperms permuted statistics exceed the observed statistic in absolute value, or until max.nperms permutations have been performed. Adaptive permutation testing greatly reduces the computational effort of permutation analysis in many genomics applications. See [2] (Pounds et al. 2011) for details.

max.nperms

Maximum number of permutations for adaptive permutation testing of gene-set level results, and the fixed total number of permutations for classical permutation testing of probe-set level results and of genome-wide agreement of differential expression; the default is 10000.
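The stopping rule described for min.nperms and max.nperms can be sketched in base R as follows. This is a minimal illustration for a single two-group difference in means, not the package's implementation: adaptive_perm_test is a hypothetical helper, the data are simulated, and counting permuted statistics that are at least as large (rather than strictly larger) in absolute value is an assumption.

```r
# Illustrative adaptive permutation test for a two-group difference in means.
# adaptive_perm_test is a hypothetical helper, not part of the AGDEX package.
adaptive_perm_test <- function(x, labels, min.nperms = 100, max.nperms = 10000) {
  groups <- unique(labels)
  stopifnot(length(groups) == 2)
  diff_means <- function(lab) {
    mean(x[lab == groups[1]]) - mean(x[lab == groups[2]])
  }
  observed <- diff_means(labels)
  n.exceed <- 0
  n.perms <- 0
  # Stop once min.nperms permuted statistics are at least as extreme as the
  # observed one, or once max.nperms permutations have been performed.
  while (n.exceed < min.nperms && n.perms < max.nperms) {
    n.perms <- n.perms + 1
    permuted <- diff_means(sample(labels))  # shuffle the group labels
    if (abs(permuted) >= abs(observed)) n.exceed <- n.exceed + 1
  }
  list(statistic = observed, p.value = n.exceed / n.perms, nperms = n.perms)
}

# Simulated data: 10 "control" and 10 "tumor" values (made up for illustration).
set.seed(1)
x <- c(rnorm(10, mean = 0), rnorm(10, mean = 0.2))
labels <- rep(c("control", "tumor"), each = 10)
res.perm <- adaptive_perm_test(x, labels, min.nperms = 20, max.nperms = 2000)
```

When the observed statistic is unremarkable, the loop ends after relatively few permutations; only statistics that are rarely exceeded consume the full max.nperms budget, which is where the computational saving comes from.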

Details

The express.set object belongs to the ExpressionSet class and includes two components. exprs: a matrix of gene expression data with rows for probe-sets and columns for subjects. pData: a data frame in which each row represents a sample, with two columns giving the sample ID and the sample group label.

The gset.collection component contains a GeneSetCollection object as defined in the Bioconductor package GSEABase. The gset.collection object must use the same probe-set identifiers as the expression matrix in express.set.
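To show how comp.def, comp.variable, and the two parts of express.set relate, here is a small base R sketch. It uses a plain matrix and data frame as stand-ins for exprs and pData (a real express.set is an ExpressionSet), and all object names and values are made up. Computing the difference as first group minus second group is an assumption about the sign convention.

```r
# Mock stand-ins for exprs and pData; a real express.set is an ExpressionSet.
set.seed(42)
expr.mat <- matrix(rnorm(3 * 6, mean = 8), nrow = 3,
                   dimnames = list(c("probe1", "probe2", "probe3"),
                                   paste0("s", 1:6)))
pheno <- data.frame(id = colnames(expr.mat),
                    grp = rep(c("tumor", "control"), each = 3),
                    stringsAsFactors = FALSE)

comp.def <- "tumor-control"   # comparison definition string
comp.variable <- "grp"        # column of pheno holding the group labels

# Split the comparison definition into its two group labels.
grp.labels <- strsplit(comp.def, "-", fixed = TRUE)[[1]]
in.first  <- pheno[[comp.variable]] == grp.labels[1]
in.second <- pheno[[comp.variable]] == grp.labels[2]

# Difference of mean (log-)expression per probe-set: first group minus second
# (the sign convention is an assumption made for this sketch).
dex.stat <- rowMeans(expr.mat[, in.first, drop = FALSE]) -
  rowMeans(expr.mat[, in.second, drop = FALSE])
```

Because the string is split on "-", group labels that themselves contain a hyphen would not parse cleanly in this sketch.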

Value

A list object with the following components:

dex.compA

A string that echoes the comp.def component of dex.setA, which defines two-group comparison "A", for example "human tumor-human control".

dex.compB

A string that echoes the comp.def component of dex.setB, which defines two-group comparison "B", for example "mouse tumor-mouse control".

gwide.agdex.res

a data.frame with the agreement statistics, p-values, and number of permutations for genome-wide agreement of differential expression analysis.

gset.res

a data.frame with results of gene-set differential expression analysis for each comparison and gene-set agreement of differential expression analysis results.

meta.dex.res

a data.frame with results for probe-sets matched across comparisons "A" and "B". The data.frame includes the differential expression statistic and p-value from each comparison and the meta-analysis z-statistic and p-value for differential expression.

dex.resA

a data.frame with differential expression analysis results for individual probe-sets for two-group comparison "A". The data.frame includes the probe-set identifier, difference of mean log-expression statistic, and the p-value.

dex.resB

a data.frame with the same structure as dex.resA that gives the results for two-group comparison "B".

dex.asgnA

a data.frame that echoes the group label assignments for comparison "A"

dex.asgnB

a data.frame that echoes the group label assignments for comparison "B"

gset.listA

a data.frame with gene-set lists for comparison "A". Each row indicates an assignment of a probe-set identifier to a gene-set.

gset.listB

a data.frame with gene-set lists for comparison "B".

gset.list.agdex

a data.frame that assigns probe-set pairs (probe-sets from comparisons A and B that query the same gene) to gene-sets for gene-set agreement of differential expression analysis.
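As a rough illustration of how matched probe-set results might be combined into an agreement statistic, the base R sketch below aligns two vectors of differential expression statistics through a map.data-style list and computes two simple agreement measures. Everything here is hypothetical: the identifiers and values are invented, and the cosine and sign-concordance statistics are assumptions chosen for illustration. The statistics actually reported in gwide.agdex.res are defined in reference [1].

```r
# Hypothetical per-probe difference statistics for comparisons A and B.
dex.A <- c(h1 = 1.2, h2 = -0.8, h3 = 0.3, h4 = -1.5)
dex.B <- c(m1 = 0.9, m2 = -0.4, m3 = -0.2, m4 = -1.1)

# A map.data-style list; component names follow the Arguments section.
map.data.mock <- list(
  probe.map = data.frame(human = c("h1", "h2", "h3", "h4"),
                         mouse = c("m1", "m2", "m3", "m4"),
                         stringsAsFactors = FALSE),
  map.Aprobe.col = "human",
  map.Bprobe.col = "mouse"
)

# Align the two statistic vectors row by row through the probe map.
pm <- map.data.mock$probe.map
a <- dex.A[pm[[map.data.mock$map.Aprobe.col]]]
b <- dex.B[pm[[map.data.mock$map.Bprobe.col]]]

# Assumed agreement measure 1: cosine of the angle between the two vectors.
cos.stat <- sum(a * b) / sqrt(sum(a^2) * sum(b^2))

# Assumed agreement measure 2: proportion of sign-concordant pairs minus
# proportion of sign-discordant pairs.
dop.stat <- mean(sign(a) == sign(b)) - mean(sign(a) != sign(b))
```

Both measures lie between -1 and 1, with values near 1 indicating that matched probe-sets change in the same direction in both comparisons. Significance would then come from recomputing the statistic under permuted group labels.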

Author(s)

Stan Pounds <stanley.pounds@stjude.org>; Cuilan Lani Gao <cuilan.gao@stjude.org>

References

1. S. Pounds, C. Gao, R. Johnson, K. Wright, H. Poppleton, D. Finkelstein, S. Leary and R. Gilbertson (2011). A procedure to statistically evaluate agreement of differential expression for cross-species genomics. Bioinformatics, doi: 10.1093/bioinformatics/btr362.

2. S. Pounds, X. Cao, C. Cheng, J. Yang, D. Campana, W.E. Evans, C-H. Pui and M.V. Relling (2011). Integrated Analysis of Pharmacokinetic, Clinical, and SNP Microarray Data using Projection onto the Most Interesting Statistical Evidence with Adaptive Permutation Testing. International Journal of Data Mining and Bioinformatics, 5:143-157.

See Also

ExpressionSet class: ExpressionSet.

GeneSetCollection class: GeneSetCollection.

human.data; mouse.data; map.data; gset.data; read.agdex.result; write.agdex.result; agdex.scatterplot; get.gset.result.details; write.agdex.gset.details; read.agdex.gset.details

Examples

# load the package and the example data
library(AGDEX)
data(human.data)
data(mouse.data)
data(map.data)
data(gset.data)

# make dex.set object for human data
dex.set.human <- make.dex.set.object(human.data,
                                     comp.var=2,
                                     comp.def="human.tumor.typeD-other.human.tumors",
                                     gset.collection=gset.data)
# make dex.set object for mouse data
dex.set.mouse <- make.dex.set.object(mouse.data,
                                     comp.var=2,
                                     comp.def="mouse.tumor-mouse.control",
                                     gset.collection=NULL)

# call agdex routine (very small permutation counts, far below the defaults)
res <- agdex(dex.set.human, dex.set.mouse, map.data, min.nperms=5, max.nperms=10)

# see visualization result of the whole genome
agdex.scatterplot(res, gset.id=NULL)

# see visualization result of a specific gene-set
agdex.scatterplot(res, gset.id="DNA_CATABOLIC_PROCESS")

# get the gene-set result of a specific gene-set
gset.detail <- get.gset.result.details(res, gset.ids="DNA_CATABOLIC_PROCESS", alpha=0.01)

Example output

Loading required package: Biobase
Loading required package: BiocGenerics
Loading required package: parallel

Attaching package: 'BiocGenerics'

The following objects are masked from 'package:parallel':

    clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
    clusterExport, clusterMap, parApply, parCapply, parLapply,
    parLapplyLB, parRapply, parSapply, parSapplyLB

The following objects are masked from 'package:stats':

    IQR, mad, sd, var, xtabs

The following objects are masked from 'package:base':

    anyDuplicated, append, as.data.frame, basename, cbind, colnames,
    dirname, do.call, duplicated, eval, evalq, Filter, Find, get, grep,
    grepl, intersect, is.unsorted, lapply, Map, mapply, match, mget,
    order, paste, pmax, pmax.int, pmin, pmin.int, Position, rank,
    rbind, Reduce, rownames, sapply, setdiff, sort, table, tapply,
    union, unique, unsplit, which.max, which.min

Welcome to Bioconductor

    Vignettes contain introductory material; view with
    'browseVignettes()'. To cite Bioconductor, see
    'citation("Biobase")', and for packages 'citation("pkgname")'.

Loading required package: GSEABase
Loading required package: annotate
Loading required package: AnnotationDbi
Loading required package: stats4
Loading required package: IRanges
Loading required package: S4Vectors

Attaching package: 'S4Vectors'

The following object is masked from 'package:base':

    expand.grid

Loading required package: XML
Loading required package: graph

Attaching package: 'graph'

The following object is masked from 'package:XML':

    addNode

Preparing differential expression data sets (dex.setA and dex.setB): Mon Feb 22 21:47:41 2021 
Preparing data that maps probe set IDs across experiments: Mon Feb 22 21:47:41 2021 
Computing statistics for observed data: Mon Feb 22 21:47:41 2021 
Mapping gene-sets across experiments: Mon Feb 22 21:47:41 2021 
Computing time for observed statistics (in seconds): 
0.003 0 0.003 0 0 
Permuting experiment A data 10 times: Mon Feb 22 21:47:41 2021 
Computing time for permutation analysis of data set A (in seconds): 
0.015 0 0.015 0 0 
Permuting experiment B data 10 times: Mon Feb 22 21:47:41 2021 
Computing time for permutation analysis of data set B (in seconds): 
0.013 0 0.014 0 0 
Packaging Result Object: Mon Feb 22 21:47:41 2021 
Done: Mon Feb 22 21:47:41 2021 

AGDEX documentation built on Nov. 8, 2020, 8:32 p.m.