alias_arbiter: Assign consistent ID to variable given various alias IDs from...


Description

Assign a reference ID to each element of a vector of IDs that may contain alias values. For example, if ID values of "A", "B", and "C" were given to three bacterial species, but "B" was later reclassified as "C", then "B" becomes a deprecated alias of "C". This function takes this information and changes the original vector of "A", "B", "C" to "A", "C", "C". Aliases can be supplied as a list of character or numeric vectors, or as a character vector to be split on the 'sep' value. IDs need to match RefIDs, but outputIDs can be another ID type, as long as they are indexed in the same order as RefIDs. Any ambiguous aliases will be removed and reported in a message.

IDs that match neither RefIDs nor aliasIDs are handled according to 'remove_absent_IDs'. The default, 'FALSE', returns 'NA' for unmatched IDs. 'TRUE' removes all 'NA' values; use it with caution, because the output is then not the same length as the input if there are any unmatched or ambiguous IDs. 'NULL' fills the 'NA' values with their original input, even if the alias was ambiguous.
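The three settings of 'remove_absent_IDs' can be illustrated with a small base R sketch. This is not the package's code: `handle_absent` is a hypothetical helper, and it assumes the IDs have already been resolved so that unmatched or ambiguous entries are 'NA'.

```r
# Hypothetical helper (not part of the package): apply the three
# 'remove_absent_IDs' behaviors to an already-resolved vector.
handle_absent <- function(IDs, resolved, remove_absent_IDs = FALSE) {
  if (is.null(remove_absent_IDs)) {
    # NULL: fall back to the original input wherever resolution failed.
    ifelse(is.na(resolved), IDs, resolved)
  } else if (isTRUE(remove_absent_IDs)) {
    # TRUE: drop unresolved entries, so the output may be shorter.
    resolved[!is.na(resolved)]
  } else {
    # FALSE (default): keep NA in place, preserving the input length.
    resolved
  }
}

IDs      <- c("A", "B", "Z")   # "Z" matches no reference or alias
resolved <- c("A", "C", NA)    # assumed result of alias resolution

keep_na  <- handle_absent(IDs, resolved, FALSE)  # "A" "C" NA
dropped  <- handle_absent(IDs, resolved, TRUE)   # "A" "C"
restored <- handle_absent(IDs, resolved, NULL)   # "A" "C" "Z"
```

Only the 'TRUE' setting changes the length of the result, which matters if the output is later bound back to the input as a column.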

This function creates a directed graph where all reference IDs are source nodes and all aliases are children of their source node. Alias IDs are first screened for reference IDs and ambiguous IDs, which ensures that the graph is a collection of small subgraphs with single sources and multiple sinks.
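The screening step can be sketched in base R as an edge table that is filtered and then flattened into a lookup. This is a rough illustration, not the package's implementation: `build_alias_lookup` is a hypothetical name, no graph library is used, and it assumes that an alias which is itself a reference ID is simply dropped.

```r
# Hypothetical helper (not the package's code): build an alias -> reference
# lookup after screening out problematic aliases.
build_alias_lookup <- function(RefIDs, aliasIDs) {
  # One row per (reference, alias) edge of the graph.
  edges <- data.frame(
    ref   = rep(RefIDs, lengths(aliasIDs)),
    alias = as.character(unlist(aliasIDs)),
    stringsAsFactors = FALSE
  )
  edges <- unique(edges)
  # Screen 1 (assumption): an alias that is itself a reference ID is dropped.
  edges <- edges[!edges$alias %in% RefIDs, ]
  # Screen 2: an alias claimed by two or more references is ambiguous.
  ambiguous <- unique(edges$alias[duplicated(edges$alias)])
  edges <- edges[!edges$alias %in% ambiguous, ]
  # Each reference maps to itself; each remaining alias maps to its source.
  c(setNames(RefIDs, RefIDs), setNames(edges$ref, edges$alias))
}

RefIDs   <- c("A", "C", "D")
aliasIDs <- list(character(0), c("B", "X"), "X")  # "X" is ambiguous

lookup   <- build_alias_lookup(RefIDs, aliasIDs)
resolved <- unname(lookup[c("A", "B", "C", "X")])  # "A" "C" "C" NA
```

After screening, every remaining alias has exactly one parent, so a plain named vector is enough to walk from any sink back to its source.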

Usage

alias_arbiter(IDs, RefIDs, aliasIDs)
  
  alias_arbiter(
    IDs, RefIDs, aliasIDs, outputIDs, sep = NULL,
    remove_absent_IDs = FALSE, quiet = FALSE)

Arguments

IDs

character vector of IDs which need to be consistently assigned. Must be the same type as RefIDs.

RefIDs

character vector of IDs for reference. Must be the same length and order as aliasIDs and, if included, outputIDs.

aliasIDs

list of character vectors giving aliases for the given RefIDs, or character vector to be split into aliases (the 'sep' argument must then specify the delimiter).

outputIDs

character or numeric vector for output. Indexed in the same order as the RefIDs and aliasIDs. If not given, RefIDs will be used as the output IDs.

sep

character delimiter for aliasIDs if supplied in a character vector. Default is 'NULL'.

remove_absent_IDs

logical or NULL. The default, FALSE, changes IDs not found in RefIDs or aliasIDs to 'NA'. If set to TRUE, IDs not matching RefIDs or aliasIDs will be removed. Alternatively, if set to NULL, IDs not matching RefIDs or aliasIDs will remain as their input ID values.

quiet

logical. If TRUE, message output will be silenced. Messages include notification of ambiguous aliases or aliases which belong to two or more reference IDs.
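As a sketch of how the argument shapes fit together, the base R snippet below splits a delimited 'aliasIDs' vector on 'sep' and translates resolved reference IDs into 'outputIDs' by position. It is illustrative only: the numeric IDs are made up, and whether the function itself splits with a fixed (non-regex) delimiter is an assumption.

```r
RefIDs    <- c("A", "C", "D")
aliasIDs  <- c("", "B;X", "E")   # delimited form, one string per reference ID
outputIDs <- c(101, 103, 104)    # hypothetical IDs, same order as RefIDs
sep       <- ";"

# Split each string into its aliases; an empty string yields no aliases.
# fixed = TRUE treats 'sep' literally (an assumption about the function).
alias_list <- strsplit(aliasIDs, split = sep, fixed = TRUE)

# All three inputs must line up element by element.
stopifnot(
  length(alias_list) == length(RefIDs),
  length(outputIDs) == length(RefIDs)
)

# Once IDs are resolved to reference IDs, the output is a positional lookup.
resolved_refs <- c("A", "C", "C")
out <- outputIDs[match(resolved_refs, RefIDs)]  # 101 103 103
```

Because the mapping is purely positional, reordering RefIDs without reordering aliasIDs and outputIDs in the same way would silently assign the wrong output IDs.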

Author(s)

Christopher Nobles, Ph.D.


cnobles/spraphal documentation built on May 28, 2019, 7:35 p.m.