aba_enrich: Test genes for expression enrichment in human brain regions

Description

Tests for enrichment of user-defined candidate genes in the set of expressed protein-coding genes in different human brain regions. The function combines the expression data for the candidate gene set (averaged across donors) with the structural information of the brain, encoded as an ontology; both are provided by the Allen Brain Atlas project [1-4]. The statistical analysis is performed using the ontology enrichment software FUNC [5].

Usage

    aba_enrich(genes, dataset = 'adult', test = 'hyper', 
        cutoff_quantiles = seq(0.1, 0.9, 0.1), n_randsets = 1000, gene_len = FALSE,
        circ_chrom = FALSE, ref_genome = 'grch37', gene_coords = NULL, silent = FALSE)

Arguments

genes

A dataframe with gene-identifiers (Entrez-ID, Ensembl-ID or gene-symbol) in the first column and test-dependent additional columns:
If test='hyper' (default) a second column with 1 for candidate genes and 0 for background genes. If no background genes are defined, all remaining protein-coding genes are used as background.
If test='wilcoxon' a second column with the score that is associated with each gene.
If test='binomial' two additional columns with two gene-associated integers.
If test='contingency' four additional columns with four gene-associated integers.
For test='hyper' the first column can also describe chromosomal regions ('chr:start-stop', e.g. '9:0-39200000').
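
The input layouts listed above can be sketched in base R as follows. This is a minimal illustration: the gene symbols are taken from the example on this page, while the column names and all scores and counts are made-up placeholders (only the column order matters according to the description above).

```r
# Gene symbols from the example on this page; every number below is a
# made-up placeholder, not real data.
gene_ids = c('PENK', 'COCH', 'PDYN', 'CA12', 'SYNDIG1L', 'MME')

# test='hyper': 1 marks candidate genes, 0 marks background genes
genes_hyper = data.frame(gene_ids, is_candidate = c(1, 1, 1, 0, 0, 0))

# test='wilcoxon': one score per gene
genes_wilcox = data.frame(gene_ids, score = c(2.1, 0.4, 1.7, 0.9, 3.2, 0.1))

# test='binomial': two integer counts per gene
genes_binom = data.frame(gene_ids,
    count_a = c(5L, 0L, 3L, 2L, 7L, 1L),
    count_b = c(1L, 4L, 3L, 6L, 0L, 2L))

# test='contingency': four integer counts per gene
genes_conti = data.frame(gene_ids,
    n1 = c(3L, 1L, 4L, 1L, 5L, 9L),
    n2 = c(2L, 6L, 5L, 3L, 5L, 8L),
    n3 = c(9L, 7L, 9L, 3L, 2L, 3L),
    n4 = c(8L, 4L, 6L, 2L, 6L, 4L))

# Each data frame would then be passed with the matching 'test' argument,
# e.g. (requires the ABAEnrichment package):
# res = aba_enrich(genes_wilcox, test='wilcoxon', n_randsets=100)
```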

dataset

'adult' for the microarray dataset of adult human brains; '5_stages' for RNA-seq expression data for different stages of the developing human brain, grouped into 5 developmental stages; 'dev_effect' for a developmental effect score. For details see browseVignettes("ABAData").

test

'hyper' (default) for the hypergeometric test, 'wilcoxon' for the Wilcoxon rank test, 'binomial' for the binomial test and 'contingency' for the 2x2 contingency table test (Fisher's exact test or chi-square test).

cutoff_quantiles

numeric vector of expression quantiles, each in [0,1]. The FUNC enrichment analysis is performed once per quantile, each time using the set of genes that count as expressed at that expression cutoff.
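
To make the idea of quantile-based expression cutoffs concrete, here is a small base-R sketch. The gene names and expression values are hypothetical, and the rule that a gene counts as expressed when its value reaches the cutoff is an assumption made for illustration; the package vignette is the authority on the exact definition.

```r
# Toy expression values for one brain region (hypothetical genes and numbers)
expr = c(GENE_A = 1.2, GENE_B = 3.4, GENE_C = 5.6, GENE_D = 7.8, GENE_E = 9.0)

cutoff_quantiles = c(0.5, 0.9)

# Expression value that corresponds to each requested quantile
cutoffs = quantile(expr, probs = cutoff_quantiles)

# Assumption: a gene counts as expressed when its value reaches the cutoff
expressed = lapply(cutoffs, function(cut) names(expr)[expr >= cut])
print(expressed)
```

Higher quantiles yield smaller, more restrictive sets of expressed genes, which is why results are reported per cutoff.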

n_randsets

integer defining the number of random sets created to compute the family-wise error rate (FWER).
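
As a rough illustration of how random sets can yield an FWER, the sketch below uses a generic minimum-p-value approach. It is an assumption about the general technique and not a reproduction of FUNC's implementation; the observed p-values are made up and the random-set p-values are uniform placeholders rather than real test results.

```r
set.seed(1)
n_randsets = 100
n_regions = 5

# Hypothetical p-values: one per region for the observed candidate set, and
# one row per random set (uniform placeholders instead of real test results)
p_observed = c(0.0004, 0.02, 0.30, 0.65, 0.90)
p_random = matrix(runif(n_randsets * n_regions), nrow = n_randsets)

# Smallest p-value over all regions within each random set
min_p_random = apply(p_random, 1, min)

# FWER per region: share of random sets that contain a p-value at least as
# small as the observed one in any region
fwer = sapply(p_observed, function(p) mean(min_p_random <= p))
print(fwer)
```

With this kind of estimate the resolution of the FWER is 1/n_randsets, so a larger n_randsets gives finer-grained values at the cost of computation time.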

gene_len

logical. If TRUE and test='hyper', the probability that a background gene is chosen as a candidate gene in a random set depends on the gene length.

circ_chrom

logical. When genes defines chromosomal regions, circ_chrom=TRUE uses background regions from the same chromosome and allows randomly chosen blocks to overlap multiple background regions. Only if test='hyper'.
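
A region-based input for test='hyper' might be assembled as in the following sketch. Only the first region string comes from this page; the other regions and the column names are hypothetical. The parsing step is just a sanity check on the 'chr:start-stop' format and is not part of the package.

```r
# Candidate (1) and background (0) regions in 'chr:start-stop' format.
# Only '9:0-39200000' appears on this page; the rest are hypothetical.
regions = data.frame(
    region = c('9:0-39200000', '9:39200000-78400000',
        '2:1000000-5000000', '2:5000000-9000000'),
    is_candidate = c(1, 0, 1, 0),
    stringsAsFactors = FALSE)

# Split each region string into chromosome, start and stop
parts = do.call(rbind, strsplit(regions$region, '[:-]'))
coords = data.frame(chrom = parts[, 1],
    start = as.numeric(parts[, 2]),
    stop = as.numeric(parts[, 3]),
    stringsAsFactors = FALSE)

# Sanity check: every region has a positive length
stopifnot(all(coords$start < coords$stop))

# Each candidate region has a background region on the same chromosome,
# which is what circ_chrom=TRUE draws from (requires ABAEnrichment):
# res = aba_enrich(regions, circ_chrom=TRUE, n_randsets=100)
```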

ref_genome

'grch37' (default) or 'grch38'. Defines the reference genome used when genomic regions are provided as input or when gene_len=TRUE.

gene_coords

optional data.frame() for custom gene coordinates, with four columns: gene-symbols (character), chromosome (character), start (integer), end (integer). When genomic regions are provided as input or when gene_len=TRUE, these custom gene coordinates are used instead of the integrated ones.
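
A custom coordinate table with the four described columns could look like the sketch below. The column names are assumptions (the description only fixes the column order and types), and the start and end positions are placeholders for illustration, not real annotations.

```r
# Four columns in the documented order: gene symbol, chromosome, start, end.
# Start and end positions are placeholders, not real gene coordinates.
gene_coords = data.frame(
    gene = c('PENK', 'COCH', 'PDYN'),
    chrom = c('8', '14', '20'),
    start = c(1000L, 5000L, 9000L),
    end = c(2000L, 7500L, 12000L),
    stringsAsFactors = FALSE)

# Gene lengths implied by these coordinates
gene_length = gene_coords$end - gene_coords$start
print(gene_length)

# The table would be supplied together with gene_len=TRUE or region input
# (requires ABAEnrichment; 'genes' is an input data frame as described above):
# res = aba_enrich(genes, gene_len=TRUE, gene_coords=gene_coords)
```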

silent

logical. If TRUE all output to the screen except for warnings and errors is suppressed.

Details

For details please refer to browseVignettes("ABAEnrichment").

Value

A list with components

results

a dataframe with the FWERs from the enrichment analyses per brain region and age category, ordered by 'n_significant', 'min_FWER', 'mean_FWER', 'age_category' and 'structure_id'. 'min_FWER' denotes the minimum FWER for expression enrichment of the candidate genes in this brain region across all expression cutoffs. 'n_significant' reports the number of cutoffs at which the FWER was below 0.05. 'FWERs' is a semicolon-separated string with the individual FWERs for all cutoffs. 'equivalent_structures' is a semicolon-separated string that lists structures with identical expression data, which occur because not every region has independent expression measurements.
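
The relationship between the 'FWERs' string and the summary columns can be sketched on a toy data frame. The structure IDs and FWER values below are made up, and treating 'mean_FWER' as the arithmetic mean over all cutoffs is an assumption based on its name.

```r
# Toy stand-in for the 'results' component (IDs and values are made up)
results = data.frame(
    structure_id = c('Allen:4010', 'Allen:4020'),
    FWERs = c('0.01;0.03;0.20', '0.40;0.06;0.90'),
    stringsAsFactors = FALSE)

# Split the semicolon-separated strings into numeric vectors, one per region
fwer_list = lapply(strsplit(results$FWERs, ';'), as.numeric)

# Recompute the summary columns from the per-cutoff FWERs
results$min_FWER = sapply(fwer_list, min)
results$mean_FWER = sapply(fwer_list, mean)  # assumed definition
results$n_significant = sapply(fwer_list, function(x) sum(x < 0.05))
print(results)
```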

genes

a dataframe of the input genes, excluding those genes for which no expression data is available and which therefore were not included in the enrichment analysis.

cutoffs

a dataframe with the expression values that correspond to the requested cutoff quantiles.

Author(s)

Steffi Grote

References

[1] Hawrylycz, M.J. et al. (2012) An anatomically comprehensive atlas of the adult human brain transcriptome, Nature 489: 391-399. doi: 10.1038/nature11405
[2] Miller, J.A. et al. (2014) Transcriptional landscape of the prenatal human brain, Nature 508: 199-206. doi: 10.1038/nature13185
[3] Allen Institute for Brain Science. Allen Human Brain Atlas. Available from: http://human.brain-map.org/
[4] Allen Institute for Brain Science. BrainSpan Atlas of the Developing Human Brain. Available from: http://brainspan.org/
[5] Pruefer, K. et al. (2007) FUNC: A package for detecting significant associations between gene sets and ontological annotations, BMC Bioinformatics 8: 41. doi: 10.1186/1471-2105-8-41

See Also

browseVignettes("ABAEnrichment")
browseVignettes("ABAData")
get_expression
plot_expression
get_name
get_id
get_sampled_substructures
get_superstructures
get_annotated_genes

Examples

#### Note that arguments 'cutoff_quantiles' and 'n_randsets' are reduced 
#### to lower computational time in the examples. 
#### Using the default values is recommended.

#### Perform an enrichment analysis for the developing brain
#### with defined background genes
#### and with random sets dependent on gene length
gene_ids = c('PENK', 'COCH', 'PDYN', 'CA12', 'SYNDIG1L', 'MME', 
    'ANO3', 'KCNJ6', 'ELAVL4', 'BEAN1', 'PVALB', 'EPN3', 'PAX2', 'FAB12')
is_candidate = rep(c(1,0),each=7)
genes = data.frame(gene_ids, is_candidate)
res = aba_enrich(genes, dataset='5_stages', cutoff_quantiles=c(0.5,0.9), 
    n_randsets=100, gene_len=TRUE)
## see results for the brain regions with highest enrichment
## for children (age_category 3)
fwers = res[[1]]
head(fwers[fwers$age_category==3,])
## see the input genes dataframe (only genes with expression data available) 
res[2]
## see the expression values that correspond to the requested cutoff quantiles
res[3]


# For more examples please refer to the package vignette.

sgrote/ABAEnrichment documentation built on July 15, 2019, 9:38 p.m.