Most of the services provided by the Phenoscape Knowledgebase (KB) API return data as JSON, plain text (usually tab-delimited), or NeXML. This package facilitates interfacing with the Phenoscape Knowledgebase for searching ontology terms, retrieving term information, and querying data matrices.

Installation

The development version of RPhenoscape is available on GitHub. It has not yet been released to CRAN. To install RPhenoscape from GitHub, use the install_github() function in the remotes package (which can be installed from CRAN using install.packages()). See the package README for details.
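The installation steps described above might look like the following minimal sketch. The repository path "phenoscape/rphenoscape" is an assumption and should be checked against the package README before use.

```r
# Assumed GitHub repository path; confirm it against the package README
repo <- "phenoscape/rphenoscape"

# Only install when the package is not already available
if (!requireNamespace("rphenoscape", quietly = TRUE)) {
  # The remotes package provides install_github(); get it from CRAN if missing
  if (!requireNamespace("remotes", quietly = TRUE)) {
    install.packages("remotes")
  }
  remotes::install_github(repo)
}
```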

Once installed, the package can be loaded ("attached") as any other R package:

library(rphenoscape)

Character Matrix via OntoTrace

Use OntoTrace to obtain a character matrix of inferred and author-asserted presence/absence associations for a taxonomic clade and anatomical region of interest.

The ontotrace endpoint in the Phenoscape KB API returns the presence/absence character matrix in NeXML format. The first step is to get the NeXML object using the get_ontotrace_data function.

nex <- get_ontotrace_data(taxon = c("Ictalurus", "Ameiurus"), entity = "fin spine")

The result is an object of class nexml, which is defined in the RNeXML package. The object mirrors the structure of a NeXML file and can be inspected accordingly. Note that although NeXML files in general can contain multiple OTUs blocks and multiple characters blocks, NeXML files generated by the KB's OntoTrace API contain only one of each. This means we can, for example, inspect the number of taxa and characters as follows:

# number of taxa in the first (and only) OTUs block
length(nex@otus[[1]]@otu)
# number of characters in the first (and only) characters block
length(nex@characters[[1]]@format@char)

More details on the nexml object can be found in the RNeXML documentation.
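As an alternative to accessing slots directly, RNeXML also provides accessor functions for nexml objects. The sketch below uses the example file that ships with RNeXML (an assumption that RNeXML is installed and includes this file), not data from the Phenoscape KB, so it runs without network access. The same calls should apply to an object returned by get_ontotrace_data.

```r
library(RNeXML)

# Example NeXML file bundled with RNeXML; a stand-in for KB output
f <- system.file("examples", "comp_analysis.xml", package = "RNeXML")
nex_example <- nexml_read(f)

# Taxa (OTUs) contained in the file
taxa <- get_taxa(nex_example)

# Character data, with one row per taxon and one column per character
chars <- get_characters(nex_example)

NROW(taxa)   # number of taxa
ncol(chars)  # number of characters
```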

Next, extract the information of interest from the NeXML object.
To get the character matrix:

(m <- get_char_matrix(nex))

The character matrix can be integrated with other data, such as metadata, which includes taxon identifiers, character identifiers, etc. To get the metadata:

(meta <- get_char_matrix_meta(nex))
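One way to integrate the matrix with its metadata is a join on a shared identifier column. The sketch below uses small mock objects so that it runs offline. It assumes that the matrix carries taxa and otu columns and that the metadata is a list containing an id_taxa data frame with label, href, and otu columns. These structures and all values are illustrative assumptions, so check them against the objects actually returned.

```r
# Mock character matrix (assumed layout: identifier columns, then characters)
m_demo <- data.frame(
  taxa = c("Taxon A", "Taxon B"),
  otu = c("otu_1", "otu_2"),
  `fin spine` = c("1", "0"),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Mock metadata (assumed layout; the href values are placeholders)
meta_demo <- list(
  id_taxa = data.frame(
    label = c("Taxon A", "Taxon B"),
    href = c("http://example.org/taxon/A", "http://example.org/taxon/B"),
    otu = c("otu_1", "otu_2"),
    stringsAsFactors = FALSE
  )
)

# Attach each taxon's identifier (href) to its row of the matrix
m_with_iri <- merge(m_demo, meta_demo$id_taxa[, c("otu", "href")], by = "otu")
m_with_iri
```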

Character Matrices for Studies

Obtain the character matrices from published studies for a taxonomic clade and anatomical region of interest.

The first step is to retrieve the list of studies for a given taxonomic clade and anatomical structure (returned as a data.frame).

(slist <- get_studies(taxon = "Ictalurus australis", entity = "fin"))

Using the study IDs obtained in the previous step, get the evolutionary character matrix for each study (in NeXML format) using get_study_data.

(nex_list <- get_study_data(slist$id))

From the list of NeXML objects, retrieve the character matrices.

study_matrix <- lapply(nex_list, function(nex) get_char_matrix(nex, otus_id = FALSE, states_as_labels = TRUE))
study_matrix[[1]][1:5, 1:5]
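Indexing with 1:5 assumes at least five rows and five columns. If the matrices are data frames, a narrower one raises an "undefined columns selected" error, and a shorter one is padded with NA rows. A bounded preview avoids both. The helper below, preview_matrix, is hypothetical and not part of rphenoscape, and the data are mock values.

```r
# Hypothetical helper: show at most n rows and n columns of a data frame
preview_matrix <- function(x, n = 5) {
  x[seq_len(min(n, nrow(x))), seq_len(min(n, ncol(x))), drop = FALSE]
}

# Mock matrix that is smaller than 5 x 5
small <- data.frame(
  taxa = c("Taxon A", "Taxon B"),
  char_1 = c("0", "1"),
  stringsAsFactors = FALSE
)
preview_matrix(small)

# Mock matrix that is larger than 5 x 5
large <- as.data.frame(matrix(0, nrow = 8, ncol = 8))
preview_matrix(large)
```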

Each character matrix can be integrated with other data, such as metadata, which includes taxon identifiers, character identifiers, etc. To get the metadata:

study_metas <- lapply(nex_list, function(nex) get_char_matrix_meta(nex))
study_metas[[1]]

Obtain Other Data

Subsetting a Matrix

A matrix obtained from Phenoscape can be subset (filtered) by taxonomic subgroup or anatomical part. For example, using the is_descendant and is_ancestor functions, a matrix can be subset to the taxa that are descendants or ancestors of a given taxon.

m # original character matrix
(is_desc <- is_descendant('Ictalurus', m$taxa))
m[is_desc, ] #subsetting to the descendants of Ictalurus
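The same indexing pattern extends to columns, which is how a matrix could be restricted to an anatomical part. The sketch below uses mock data and hard-coded masks so that it runs offline. The row mask stands in for the result of is_descendant (assumed here to be a logical vector), and the column mask uses a simple name match as a stand-in, not an ontology query.

```r
# Mock character matrix with a taxa column followed by character columns
m_demo <- data.frame(
  taxa = c("Ictalurus punctatus", "Ictalurus furcatus", "Ameiurus nebulosus"),
  `pectoral fin spine` = c("1", "1", "1"),
  `dorsal fin spine` = c("1", NA, "1"),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Stand-in for is_descendant("Ictalurus", m_demo$taxa)
is_desc_demo <- c(TRUE, TRUE, FALSE)

# Rows: keep only the selected taxa
rows_only <- m_demo[is_desc_demo, ]

# Columns: always keep the taxa column, plus characters matching a part name
keep_cols <- names(m_demo) == "taxa" |
  grepl("pectoral", names(m_demo), fixed = TRUE)

# Rows and columns filtered together
both <- m_demo[is_desc_demo, keep_cols]
both
```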

Term Search

Search for details for a given taxon:

taxon_info("Coralliozetus")

Search for details for a given anatomical structure:

anatomy_term_info("basihyal bone")

Miscellaneous functions:

Resolve a given term to its IRI:

get_term_iri("Coralliozetus", "vto")
get_term_iri("basihyal bone", "uberon")

Test if a taxon is extinct:

is_extinct("Fisherichthys")


xu-hong/rphenoscape documentation built on Jan. 28, 2024, 12:22 p.m.