getCoreGene: Identify core genes for a list of selected taxa

View source: R/identifyCoreGene.R

getCoreGene    R Documentation

Description

Identify core genes for a list of selected (super)taxa. A core gene must be present in at least a given proportion of the species in each selected (super)taxon (set via percentCutoff), and this criterion must be fulfilled for a given percentage of the selected taxa, or for all of them (set via coreCoverage).
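The two thresholds described above can be illustrated on toy data. This is a minimal base R sketch of the criterion only, not the package's implementation: the helper toyCoreGenes, the table presence and its column fracPresent are hypothetical names, and it assumes percentCutoff is a lower/upper range of fractions while coreCoverage is a percentage.

```r
# Hypothetical toy table: for each gene, the fraction of species in each
# selected supertaxon in which the gene is present.
presence <- data.frame(
    geneID = rep(c("geneA", "geneB", "geneC"), each = 3),
    supertaxon = rep(c("Mammalia", "Saccharomycetes", "Insecta"), times = 3),
    fracPresent = c(1.0, 0.9, 0.8,   # geneA
                    1.0, 0.2, 0.9,   # geneB
                    0.1, 0.0, 0.3),  # geneC
    stringsAsFactors = FALSE
)

# Hypothetical helper (not part of PhyloProfile) applying both thresholds.
toyCoreGenes <- function(presence, percentCutoff, coreCoverage) {
    # Step 1: does the gene pass percentCutoff within each supertaxon?
    passes <- presence$fracPresent >= percentCutoff[1] &
        presence$fracPresent <= percentCutoff[2]
    # Step 2: percentage of selected supertaxa in which each gene passes
    coverage <- tapply(passes, presence$geneID, mean) * 100
    # Keep genes whose coverage reaches coreCoverage
    as.vector(names(coverage)[as.vector(coverage >= coreCoverage)])
}

strictCore <- toyCoreGenes(presence, c(0.5, 1), 100)
relaxedCore <- toyCoreGenes(presence, c(0.5, 1), 60)
```

With coreCoverage = 100 only geneA qualifies, because geneB falls below the cutoff in Saccharomycetes; lowering coreCoverage to 60 also admits geneB, which passes in two of the three selected taxa.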

Usage

getCoreGene(rankName, taxaCore = c("none"), profileDt, taxaCount,
    var1Cutoff = c(0, 1), var2Cutoff = c(0, 1), percentCutoff = c(0, 1),
    coreCoverage = 100, taxDB = NULL)

Arguments

rankName

working taxonomy rank (e.g. "species", "genus", "family")

taxaCore

list of selected taxon names. Default = c("none").

profileDt

dataframe containing the full processed phylogenetic profiles (see ?fullProcessedProfile or ?parseInfoProfile)

taxaCount

dataframe counting present taxa in each supertaxon

var1Cutoff

cutoff for var1. Default = c(0, 1).

var2Cutoff

cutoff for var2. Default = c(0, 1).

percentCutoff

cutoff for the percentage of species present in each supertaxon, given as a range of fractions. Default = c(0, 1).

coreCoverage

the minimum percentage of selected taxa that must fulfill the percentCutoff criterion. Default = 100.

taxDB

Path to the taxonomy DB files
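As an illustration of the taxaCount argument above, the sketch below builds a count table from a toy taxon table in the same way as the Examples section does, with dplyr::count. The table toyTaxa and its contents are hypothetical, and the real output of sortInputTaxa is assumed to contain more columns than shown here.

```r
library(dplyr)

# Hypothetical toy taxon table: one row per input taxon with its supertaxon
# at the working rank (here: class).
toyTaxa <- data.frame(
    ncbiID = c("ncbi9606", "ncbi10090", "ncbi7227", "ncbi4932"),
    supertaxon = c("Mammalia", "Mammalia", "Insecta", "Saccharomycetes"),
    stringsAsFactors = FALSE
)

# One row per supertaxon; the column n holds the number of taxa it contains
toyTaxaCount <- toyTaxa %>% dplyr::count(supertaxon)
toyTaxaCount
```

The resulting table has the columns supertaxon and n, which is the shape passed as taxaCount in the Examples section.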

Value

A list of identified core genes.
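One possible use of the returned gene list is to subset a profile table to the core genes. The sketch below uses hypothetical toy data; it assumes the profile table has a geneID column and calls unlist() so that it works whether the result is a list or a character vector.

```r
# Hypothetical result standing in for the output of getCoreGene()
coreGenes <- list("geneA", "geneC")

# Hypothetical toy profile table with a geneID column
toyProfile <- data.frame(
    geneID = c("geneA", "geneA", "geneB", "geneC"),
    ncbiID = c("ncbi9606", "ncbi7227", "ncbi9606", "ncbi4932"),
    var1 = c(0.98, 0.81, 0.77, 0.92),
    stringsAsFactors = FALSE
)

# Keep only the rows that belong to core genes
coreProfile <- toyProfile[toyProfile$geneID %in% unlist(coreGenes), ]
coreProfile
```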

Author(s)

Vinh Tran tran@bio.uni-frankfurt.de

See Also

parseInfoProfile for creating a full processed profile dataframe

Examples

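library(PhyloProfile)
library(dplyr)
# Load the example profile shipped with the package
data("fullProcessedProfile", package="PhyloProfile")
# Working rank, reference taxon and the taxa that define the core set
rankName <- "class"
refTaxon <- "Mammalia"
taxaCore <- c("Mammalia", "Saccharomycetes", "Insecta")
profileDt <- fullProcessedProfile
# Count the input taxa in each supertaxon
taxonIDs <- levels(as.factor(fullProcessedProfile$ncbiID))
sortedInputTaxa <- sortInputTaxa(
    taxonIDs, rankName, refTaxon, NULL, NULL
)
taxaCount <- sortedInputTaxa %>% dplyr::count(supertaxon)
# Cutoffs for var1, var2 and the fraction of species per supertaxon
var1Cutoff <- c(0.75, 1.0)
var2Cutoff <- c(0.75, 1.0)
percentCutoff <- c(0.0, 1.0)
# Require the criterion to hold in all selected taxa
coreCoverage <- 100
getCoreGene(
    rankName,
    taxaCore,
    profileDt,
    taxaCount,
    var1Cutoff, var2Cutoff,
    percentCutoff, coreCoverage
)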

BIONF/PhyloProfile documentation built on Dec. 18, 2024, 7:33 a.m.