computeSpeciesSpecificAnnotationDiversityPerGroup: Computes the annotation (function) based Shannon Entropy per...

View source: R/family_funks.R

computeSpeciesSpecificAnnotationDiversityPerGroup R Documentation

Computes the annotation (function) based Shannon entropy per gene group, separately for each species

Description

Computes the annotation (function) based Shannon entropy per gene group, separately for each species. Setting options(mc.cores = detectCores()) beforehand is highly recommended; detectCores() is provided by the 'parallel' package.
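The recommended option can be set once per session before calling the function. A minimal sketch, assuming the base 'parallel' package is available (it ships with R); note that detectCores() may return NA on some platforms, in which case a fixed integer is the safer choice:

```r
library(parallel)

# Use all cores reported by the operating system.
# detectCores() can return NA on unusual platforms; fall back to 1 then.
n.cores <- detectCores()
if (is.na(n.cores)) n.cores <- 1L
options(mc.cores = n.cores)

# Confirm the option that parallel code will pick up.
print(getOption("mc.cores"))
```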

Usage

computeSpeciesSpecificAnnotationDiversityPerGroup(groups.df,
  group.col = "Tandem.Cluster", species.col = "Species",
  gene.col = "Gene", group.ids = unique(groups.df[, group.col]),
  species = unique(groups.df[, species.col]), gene.annos = all.ipr,
  annos.gene.col = 1, annos.anno.col = 2)

Arguments

groups.df

An instance of base::data.frame with at least three columns. One must hold the gene group identifiers, another the species names, and another the gene identifiers (accessions) belonging to the respective group and species.

group.col

The column name of 'groups.df' in which to find the gene group identifiers. Default is 'Tandem.Cluster'.

species.col

The column name of 'groups.df' in which to find the species names. The values in this column must intersect with the 'species' argument. Default is 'Species'.

gene.col

The column name of 'groups.df' in which to find the gene identifiers (accessions). Default is 'Gene'.

group.ids

All group identifiers for which to compute the species-specific annotation based Shannon entropy. Default is all distinct gene groups found in 'groups.df'.

species

A character vector of species names present in column 'species.col' of 'groups.df'. Species not listed in this argument will not be considered. Default is all distinct species names found in 'groups.df'.

gene.annos

The data.frame holding the annotations for the genes in 'groups.df'. Default is 'all.ipr', which is expected to hold all available InterPro annotations.

annos.gene.col

The column of 'gene.annos' in which to look up the gene identifiers (accessions). Default is 1.

annos.anno.col

The column of 'gene.annos' in which to look up the function annotations for the genes in 'groups.df'. Default is 2.
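To make the expected input shapes concrete, the following sketch builds toy versions of 'groups.df' and 'gene.annos' and shows what the documented defaults for 'group.ids' and 'species' evaluate to. All identifiers (cluster_1, ath, g1, anno_a, and so on) are hypothetical placeholders, not data from the package:

```r
# Hypothetical toy input: one row per gene, with its group and species.
groups.df <- data.frame(
  Tandem.Cluster = c("cluster_1", "cluster_1", "cluster_1", "cluster_2"),
  Species        = c("ath", "ath", "chi", "ath"),
  Gene           = c("g1", "g2", "g3", "g4"),
  stringsAsFactors = FALSE
)

# Hypothetical annotation table: gene identifiers in column 1,
# function annotations in column 2 (matching the documented defaults).
gene.annos <- data.frame(
  gene = c("g1", "g2", "g3", "g4"),
  anno = c("anno_a", "anno_b", "anno_c", "anno_a"),
  stringsAsFactors = FALSE
)

# The documented defaults are plain unique() calls on 'groups.df':
group.ids <- unique(groups.df[, "Tandem.Cluster"])
species   <- unique(groups.df[, "Species"])
print(group.ids)
print(species)
```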

Value

An instance of base::data.frame with several columns. The first column holds the gene group identifiers obtained from 'groups.df'; it is followed by one numeric column for each species found in 'species'. The latter columns hold the species-specific annotation entropies for the respective gene groups, with NA where no genes of that species are members of the respective gene group.
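The shape of the returned table can be illustrated with a base R sketch. This is not the package's implementation: it assumes Shannon entropy with log base 2 and assumes that the annotations of all genes of one species in one group are pooled before counting. The helper names shannonEntropy and entropyFor, and all toy identifiers, are hypothetical:

```r
# Hypothetical toy inputs (placeholders, not package data).
groups.df <- data.frame(
  Tandem.Cluster = c("cluster_1", "cluster_1", "cluster_1", "cluster_2"),
  Species        = c("ath", "ath", "chi", "ath"),
  Gene           = c("g1", "g2", "g3", "g4"),
  stringsAsFactors = FALSE
)
gene.annos <- data.frame(
  gene = c("g1", "g2", "g3", "g4"),
  anno = c("anno_a", "anno_b", "anno_c", "anno_a"),
  stringsAsFactors = FALSE
)

# Shannon entropy of a vector of annotation labels (assumption: log base 2).
shannonEntropy <- function(x) {
  p <- as.numeric(table(x)) / length(x)
  -sum(p * log2(p))
}

# Entropy for one group and one species; NA if the species has no genes there.
entropyFor <- function(group.id, sp) {
  in.cell <- groups.df$Tandem.Cluster == group.id & groups.df$Species == sp
  genes <- groups.df[in.cell, "Gene"]
  if (length(genes) == 0) return(NA_real_)
  annos <- gene.annos[gene.annos[, 1] %in% genes, 2]
  shannonEntropy(annos)
}

group.ids <- unique(groups.df$Tandem.Cluster)
species   <- unique(groups.df$Species)

# First column: group identifiers; then one numeric column per species.
res <- data.frame(Tandem.Cluster = group.ids, stringsAsFactors = FALSE)
for (sp in species) {
  res[[sp]] <- vapply(group.ids, entropyFor, numeric(1), sp = sp,
                      USE.NAMES = FALSE)
}
print(res)
```

In this toy data, cluster_1 has two equally frequent annotations for species ath (entropy 1) and a single annotation for chi (entropy 0), while cluster_2 has no chi genes and therefore yields NA in that column.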


asishallab/GeneFamilies documentation built on May 22, 2023, 11:30 a.m.