updateExFasta: fDOG in runFdog or runFdogBusco will be run with option...


View source: R/runFdog.R

Description

runFdog and runFdogBusco run fDOG with the option --fasoff. The extended fasta files of all core groups are then merged, and the merged file is used as the input for fdogFAS. The raw output of fDOG cannot be merged directly and must be processed first. This function processes the extended fasta file of one core group according to the score mode in use. For score modes 2 and 3, the function takes the ortholog sequence from the extended fasta file, saves it into a vector, and appends the sequence of the reference species to that vector. If more than one ortholog sequence exists, the function saves the second ortholog sequence together with the reference sequence into another vector. Score mode 1 works analogously, but instead of the reference sequence the function appends all training sequences of the core group to the vector. The function returns a list that contains these sequence vectors. The number of elements in the list equals the number of orthologs that fDOG found for this core group. This function is used as a module in runFdog (not in runFdogBusco).
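The behaviour described for score modes 2 and 3 can be sketched roughly as follows. This is an illustration only, not the fCAT implementation: `splitOrthologs` is a hypothetical helper, and it assumes that every sequence occupies a single line directly after its header and that the genome ID is the second `|`-separated field of the header.

```r
# Hypothetical helper, NOT part of fCAT: pairs each ortholog entry of the
# genome of interest with the reference species entry (score modes 2 and 3).
splitOrthologs <- function(fasta, genomeName, refSpec) {
    # Positions of all header lines in the fasta vector
    headerIdx <- grep("^>", fasta)
    # Assumption: the genome ID is the second "|"-separated header field
    genomeIds <- vapply(
        strsplit(fasta[headerIdx], "|", fixed = TRUE),
        function(fields) fields[2],
        character(1)
    )
    orthoIdx <- headerIdx[genomeIds == genomeName]
    refIdx <- headerIdx[genomeIds == refSpec]
    # Reference entries, interleaved as header, sequence, header, sequence, ...
    refEntries <- as.vector(rbind(fasta[refIdx], fasta[refIdx + 1]))
    # One list element per ortholog: ortholog entry, then reference entry
    lapply(orthoIdx, function(i) c(fasta[i], fasta[i + 1], refEntries))
}

fasta <- c(
    ">530670|HUMAN@9606@1|HUMAN07070|1",
    "MGVNAVHWFRKGLRLHDNPALKECIQGADTIRCVY",
    ">530670|HUMAN@9606@3|Q16526|1",
    "MGVNAVHWFRKGLRLHDNPALKECIQGADTIRCVYILDP"
)
res <- splitOrthologs(fasta, "HUMAN@9606@3", "HUMAN@9606@1")
print(length(res))
print(res[[1]])
```

With one ortholog in the input, the sketch yields a list of length one whose single vector holds the ortholog entry followed by the reference entry. Score mode 1 would append the training sequences of the core group instead, which is not shown here.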

Usage

updateExFasta(root, coreSet, fasta, coreGene, genomeName, refSpec, scoreMode)

Arguments

root

The path to the core directory, where the core set is stored together with weight_dir, blast_dir, etc.

coreSet

The name of the core set of interest. The core directory can contain more than one core set, so the user must specify which one to use. Core sets are stored as subfolders of the folder core_orthologs and are specified by the name of their subfolder.

fasta

The extended fasta file of the core group in the form of a vector

coreGene

The core group ID in the core set

genomeName

The genome ID of the genome of interest, e.g. HUMAN@9606@3

refSpec

The genome ID of the reference species

scoreMode

The mode that determines how the found orthologs are scored and classified. Choices: 1, 2, 3, "busco"
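As a rough illustration of how the fasta, coreGene and genomeName arguments relate to each other, the sketch below reads an extended fasta file into a vector and splits one header into its fields. The file content is placeholder data taken from the example on this page. Reading the header as core group ID followed by genome ID, and the genome ID as name, taxonomy ID and version, is an assumption based on that example and not something this page states.

```r
# Write a small extended fasta to a temporary file (placeholder data)
exFasta <- tempfile(fileext = ".extended.fa")
writeLines(c(
    ">530670|HUMAN@9606@3|Q16526|1",
    "MGVNAVHWFRKGLRLHDNPALKECIQGADTIRCVYILDP"
), exFasta)

# readLines() returns one vector element per line of the file
fasta <- readLines(exFasta)

# Split the first header into its "|"-separated fields
fields <- strsplit(sub("^>", "", fasta[1]), "|", fixed = TRUE)[[1]]
coreGene <- fields[1]   # assumed: core group ID
genomeName <- fields[2] # assumed: genome ID

# Split the genome ID on "@" (assumed: name, taxonomy ID, version)
genomeParts <- strsplit(genomeName, "@", fixed = TRUE)[[1]]

print(coreGene)
print(genomeName)
print(genomeParts)
```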

Value

A list that contains the sequence vectors. The number of elements in the list equals the number of orthologs that fDOG found for this core group.
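To make the shape of the return value concrete, the sketch below builds a placeholder list with two elements, as would be expected for a core group with two orthologs, and flattens it into one fasta file. The headers, the second sequence ID and the shortened sequences are invented for illustration, and how fCAT itself merges the vectors is not shown on this page.

```r
# Placeholder list shaped like the described return value (two orthologs);
# "Q16526_2" and the shortened sequences are invented for illustration
returnList <- list(
    c(
        ">530670|HUMAN@9606@3|Q16526|1", "MGVNAVHWFRKG",
        ">530670|HUMAN@9606@1|HUMAN07070|1", "MGVNAVHWFR"
    ),
    c(
        ">530670|HUMAN@9606@3|Q16526_2|0", "MGVNAVHW",
        ">530670|HUMAN@9606@1|HUMAN07070|1", "MGVNAVHWFR"
    )
)

# One list element per ortholog
nOrthologs <- length(returnList)

# Flatten all vectors and write them to a single fasta file
outFile <- tempfile(fileext = ".fa")
writeLines(unlist(returnList), outFile)

print(nOrthologs)
print(readLines(outFile))
```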

Examples

## Create pseudo extended fasta
fasta <- c(
    ">530670|HUMAN@9606@1|HUMAN07070|1",
    "MGVNAVHWFRKGLRLHDNPALKECIQGADTIRCVY",
    ">530670|HUMAN@9606@3|Q16526|1",
    "MGVNAVHWFRKGLRLHDNPALKECIQGADTIRCVYILDP"
)

coreFolder <- system.file("extdata", "sample", package = "fCAT")
returnList <- updateExFasta(coreFolder, "test", fasta,
    "530670", "HUMAN@9606@3", "HUMAN@9606@1",
    scoreMode = 1
)

print(returnList)

giangnguyen0709/fCAT documentation built on Feb. 10, 2021, 4:31 a.m.