computeNwOverrep: Perform overrepresentation analysis on the input network


Usage

computeNwOverrep(edgelist, nodelist, annotation, internalid, size)

Arguments

edgelist

a data frame of edges containing at least a source column (1st column) and a target column (2nd column).

nodelist

a data frame of nodes containing at least two columns of node attributes. The 1st column is the id or neo4j id; the 2nd column is the id or grinn id. The 2nd column is used for Mesh annotation.

annotation

a string specifying the annotation type, e.g. pathway (default) or mesh. Pathway annotation requires the database. Mesh annotation doesn't require the database, but it is available for PubChem compounds only.

internalid

a logical value indicating whether the network nodes are neo4j ids. If TRUE (default), the nodes are expected to be neo4j ids; if FALSE, they may be any ids. See Details, and see convertId for how to convert ids. It has no effect on Mesh annotation.

size

a numeric vector specifying the minimum number of members in each annotation term to be used in the analysis. Default is 3.
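As an illustration of the input shapes described above, the following minimal sketch builds an edge list and a node list by hand. The column names (source, target, id, gid) and the tiny three-compound network are assumptions for illustration only; the argument descriptions rely on column position, not on names. The call to computeNwOverrep is left commented out because it needs the package to be installed.

```r
# Hypothetical network of three PubChem compounds (CIDs taken from the
# Examples section); the edges themselves are made up for illustration.
cids <- c("1110", "10413", "196")

# Node list: 1st column = id, 2nd column = id used for Mesh annotation.
# Here both columns hold the same PubChem CIDs (an assumption).
nodelist <- data.frame(id = cids, gid = cids, stringsAsFactors = FALSE)

# Edge list: 1st column = source, 2nd column = target.
edgelist <- data.frame(
  source = c("1110", "1110"),
  target = c("10413", "196"),
  stringsAsFactors = FALSE
)

# Basic sanity checks before passing the data frames on.
stopifnot(ncol(edgelist) >= 2, ncol(nodelist) >= 2)
stopifnot(all(c(edgelist[[1]], edgelist[[2]]) %in% nodelist[[1]]))

# Not run: requires the package.
# result <- computeNwOverrep(edgelist, nodelist, annotation = "mesh",
#                            internalid = FALSE, size = 3)
```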

Details

The database uses two id systems. The neo4j id is a numeric, internal id automatically generated by the database system. The grinn id (gid) is the id system of the Grinn database, which uses the main ids of standard resources, i.e. ENSEMBL for genes (e.g. ENSG00000139618), UniProt for proteins (e.g. P0C9J6), PubChem CID for compounds (e.g. 5793), and KEGG for pathways (e.g. hsa00010).

Value

A list of data frames: nodes, edges, overrepresentation and pairs. The pairs data frame contains annotation pairs. The overrepresentation data frame contains the following components:

rank = rank, sorted by adjusted p-value

id = annotation id or annotation neo4j id

gid = annotation id or annotation grinn id

nodename = annotation name

nodelabel = annotation type

nodexref = cross references

p_combine = combined raw p-values, if there is more than one node type

p_combine_adj = adjusted combined p-values, if there is more than one node type

p = raw p-values

p_adj = adjusted p-values

no_of_entities = number of input entities in each annotation term

annotation_size = total number of entities in each annotation term from the database

background_size = total number of annotated entities in the database

member = list of entity members of the annotation term

Returns a list of empty data frames if an error occurs or nothing is found.
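To show how the count columns relate to the p-value columns, here is a minimal sketch of a hypergeometric overrepresentation test using base R. It is not taken from the package source: the counts are invented, input_size is a hypothetical name for the number of annotated input entities, and the "BH" adjustment method is an assumption, since this page does not state which method is used.

```r
# Hypothetical counts for three annotation terms (not real data).
terms <- data.frame(
  no_of_entities  = c(5, 3, 4),      # input entities found in each term
  annotation_size = c(40, 120, 15)   # all entities annotated with each term
)
background_size <- 2000  # assumed total number of annotated entities
input_size      <- 25    # assumed number of annotated input entities

# Upper-tail hypergeometric probability P(X >= no_of_entities):
# drawing input_size entities from a background that contains
# annotation_size "successes" and background_size - annotation_size others.
terms$p <- phyper(terms$no_of_entities - 1,
                  terms$annotation_size,
                  background_size - terms$annotation_size,
                  input_size,
                  lower.tail = FALSE)

# Multiple-testing adjustment; "BH" is an assumed choice.
terms$p_adj <- p.adjust(terms$p, method = "BH")

# Rank terms by adjusted p-value, smallest first.
terms <- terms[order(terms$p_adj), ]
terms$rank <- seq_len(nrow(terms))
```

A term's raw p-value gets smaller as no_of_entities grows relative to what annotation_size and background_size would predict by chance.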

Author(s)

Kwanjeera W kwanich@ucdavis.edu

References

Johnson NL., Kotz S., and Kemp AW. (1992) Univariate Discrete Distributions, Second Edition. New York: Wiley.

Fisher R. (1932) Statistical methods for research workers. Oliver and Boyd, Edinburgh.

See Also

phyper, p.adjust, pchisq
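The Fisher (1932) reference together with pchisq suggests that p_combine is obtained with Fisher's method for combining p-values. That is an inference from this page, not something the package source was checked against. A minimal sketch of the method, with fisher_combine as a hypothetical helper name and invented p-values:

```r
# Fisher's method: combine independent p-values into a single p-value.
# Under the null hypothesis, -2 * sum(log(p)) follows a chi-square
# distribution with 2 * length(p) degrees of freedom.
fisher_combine <- function(p) {
  stat <- -2 * sum(log(p))
  pchisq(stat, df = 2 * length(p), lower.tail = FALSE)
}

# Hypothetical raw p-values for one annotation term, one per node type.
p_by_type <- c(gene = 0.01, compound = 0.20)
p_combine <- fisher_combine(p_by_type)
```

With a single p-value the method returns that p-value unchanged, which fits the Value section's note that the combined columns apply only when there is more than one node type.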

Examples

# compute similarity network for given pubchem compounds
#simnw <- computeSimilarity(c(1110,10413,196,51,311,43,764,790))
# perform Mesh overrepresentation analysis on the similarity network
#result <- computeNwOverrep(simnw$edges, simnw$nodes, annotation="mesh", internalid = FALSE)

kwanjeeraw/metabox documentation built on May 20, 2019, 7:07 p.m.