findCommonAncestors: Find common ancestors

View source: R/findCommonAncestors.R

findCommonAncestors R Documentation

Find common ancestors

Description

Given a set of ontology terms, find their latest common ancestors based on the term hierarchy.

Usage

findCommonAncestors(..., g, remove.self = TRUE, descriptions = NULL)

Arguments

...

One or more (possibly named) character vectors containing ontology terms.

g

A graph object containing the hierarchy of all ontology terms.

remove.self

Logical scalar indicating whether to ignore ancestors that cover only a single term, namely themselves.

descriptions

Named character vector containing plain-English descriptions for each term. Names should be the term identifier while the values are the descriptions.
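As a minimal sketch of the expected shape of descriptions, the vector below maps term identifiers to labels. The three Cell Ontology identifiers are chosen purely for illustration; any named character vector with the same layout should work.

```r
# Illustrative only: names are term identifiers, values are plain-English labels.
descriptions <- c(
    "CL:0000000" = "cell",
    "CL:0000084" = "T cell",
    "CL:0000236" = "B cell"
)

# Look up the label for one identifier.
descriptions[["CL:0000084"]]
```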

Details

This function identifies all terms in g that are the latest common ancestor (LCA) of any subset of terms in .... An LCA is a term that has no children with the exact same set of descendent terms in ..., i.e., it is the most specific term for that set of observed descendents. Knowing the LCA is useful for deciding how terms should be rolled up to broader definitions in downstream applications, usually when the exact terms in ... are too specific for practical use.
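The LCA definition above can be illustrated on a toy hierarchy. The following is a sketch under stated assumptions, not the function's actual implementation: the term names are hypothetical, edges point from parent to child, and the helper observed_below is introduced here only for illustration.

```r
library(igraph)

# Toy hierarchy with edges pointing from parent to child (hypothetical term names).
edges <- c("root", "A",  "root", "B",  "A", "A1",  "A", "A2",  "B", "B1")
g <- make_graph(edges)

# Terms we pretend were supplied in '...'.
observed <- c("A1", "A2", "B1")

# Hypothetical helper: observed terms located at or below a given term.
observed_below <- function(term) {
    reachable <- names(subcomponent(g, term, mode = "out"))
    sort(intersect(reachable, observed))
}
desc <- lapply(setNames(nm = names(V(g))), observed_below)
desc <- desc[lengths(desc) > 0]

# A term is an LCA if none of its children covers the identical set of observed terms.
is_lca <- vapply(names(desc), function(term) {
    kids <- names(neighbors(g, term, mode = "out"))
    !any(vapply(kids, function(k) identical(desc[[k]], desc[[term]]), logical(1)))
}, logical(1))
lca <- names(desc)[is_lca]

# Analogue of remove.self = TRUE: drop terms that cover only one observed term.
lca_multi <- lca[lengths(desc[lca]) > 1]
lca_multi
```

Here B is not an LCA because its child B1 covers the same observed set, whereas root and A each cover a set that none of their children matches.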

The descendents DataFrame in each row of the output describes the descendents for each LCA, stratified by their presence or absence in each entry of .... This is particularly useful for seeing how different sets of terms would be aggregated into broader terms, e.g., when harmonizing annotation from different datasets or studies. Note that any names for ... will be reflected in the columns of the DataFrame for each LCA.
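To make the stratification concrete, the sketch below builds a comparable table by hand for a single hypothetical LCA. It assumes one logical column per named entry of ... and uses a base data.frame in place of a DataFrame; the exact layout returned by the function may differ.

```r
# Two hypothetical term sets, named as they would be in '...'.
sets <- list(A = c("T1", "T2"), B = c("T2", "T3"))

# Suppose a single LCA covers every one of these terms.
covered <- sort(unique(unlist(sets)))

# One row per descendent term, one logical column per input set.
membership <- data.frame(
    lapply(sets, function(s) covered %in% s),
    row.names = covered
)
membership
```

Rows that are TRUE in only one column mark terms specific to one set, which is the information needed when deciding whether two annotations can be merged under the same broader term.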

Value

A DataFrame where each row corresponds to a common ancestor term. It contains the column number, the number of descendent terms across all vectors in ..., and the column descendents, a List of DataFrames containing the identities of the descendents. It may also contain the column description, holding the description for each term.

Author(s)

Aaron Lun

Examples

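library(ontoProc)
library(igraph)

# Load the Cell Ontology.
co <- getOnto("cellOnto")

# TODO: wrap in utility function.
# Build a directed graph with one edge from each parent to its child term.
parents <- co$parents
self <- rep(names(parents), lengths(parents))
g <- make_graph(rbind(unlist(parents), self))

# Selecting random terms:
LCA <- ontoProc:::findCommonAncestors(A=sample(names(V(g)), 20),
   B=sample(names(V(g)), 20), g=g)

# Inspect the first common ancestor and its descendents.
LCA[1,]
LCA[1,"descendents"][[1]]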


vjcitn/ontoProc documentation built on Oct. 12, 2024, 4:35 p.m.