parseInfoProfile: Parsing info for phylogenetic profiles

View source: R/parsePhyloProfile.R

parseInfoProfile    R Documentation

Description

Creates the main dataframe for the input phylogenetic profiles, based on the selected input taxonomy level (e.g. strain, species) and the reference taxon. The output contains the number of paralogs and the max/min/mean/median of VAR1 and VAR2.

Usage

parseInfoProfile(inputDf, sortedInputTaxa, taxaCount, coorthoCOMax)

Arguments

inputDf

input profiles in long format

sortedInputTaxa

sorted taxonomy data for the input taxa (see ?sortInputTaxa)

taxaCount

dataframe counting present taxa in each supertaxon

coorthoCOMax

maximum number of co-orthologs allowed

Value

A dataframe containing all info for the input phylogenetic profiles. This fully processed profile is required for several profiling analyses, e.g. estimation of gene age (?estimateGeneAge) or identification of core genes (?getCoreGene).

Author(s)

Vinh Tran tran@bio.uni-frankfurt.de

See Also

createLongMatrix, sortInputTaxa, calcPresSpec, mainLongRaw

Examples

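library(PhyloProfile)
library(dplyr)
# Load the example long-format profiles shipped with PhyloProfile
data("mainLongRaw", package="PhyloProfile")
# Sort the input taxa at rank "class" relative to the reference taxon
taxonIDs <- getInputTaxaID(mainLongRaw)
sortedInputTaxa <- sortInputTaxa(
    taxonIDs, "class", "Mammalia", NULL, NULL
)
# Count the input taxa present in each supertaxon
taxaCount <- sortedInputTaxa %>% dplyr::group_by(supertaxon) %>%
    dplyr::summarise(n = dplyr::n(), .groups = "drop")
# Maximum number of co-orthologs allowed
coorthoCOMax <- 999
parseInfoProfile(
    mainLongRaw, sortedInputTaxa, taxaCount, coorthoCOMax
)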
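
The kind of per-gene, per-supertaxon summary mentioned in the Description (paralog counts and summary statistics of VAR1 and VAR2) can be illustrated on a toy dataframe. This is only a minimal sketch using dplyr, not the code used inside parseInfoProfile(); the object names toyProfile and toySummary and all column names are assumptions chosen for illustration and may differ from the columns PhyloProfile actually produces.

```r
library(dplyr)

# Toy long-format profile. All column names are illustrative assumptions,
# not necessarily those used by PhyloProfile.
toyProfile <- data.frame(
    geneID = c("g1", "g1", "g1", "g2"),
    supertaxon = c("A", "A", "B", "A"),
    orthoID = c("o1", "o2", "o3", "o4"),
    var1 = c(0.9, 0.5, 0.7, 0.2),
    var2 = c(0.8, 0.6, 0.4, 0.1)
)

# One row per gene and supertaxon: how many orthologs fall into the group
# (here called "paralog") plus example summary statistics of the variables.
toySummary <- toyProfile %>%
    dplyr::group_by(geneID, supertaxon) %>%
    dplyr::summarise(
        paralog = dplyr::n(),
        maxVar1 = max(var1),
        meanVar2 = mean(var2),
        .groups = "drop"
    )

print(toySummary)
```

Gene g1 has two orthologs in supertaxon A, so that group gets a count of 2 and its statistics are taken over both rows; the other groups contain a single row each.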

BIONF/PhyloProfile documentation built on Dec. 18, 2024, 7:33 a.m.