generate_ortholog_tables_all: Generate ortholog tables by gene locus and splice variant for...

View source: R/generate_ortholog_tables_all.R

generate_ortholog_tables_all    R Documentation

Generate ortholog tables by gene locus and splice variant for a set of species

Description

Given a folder of dNdS tables generated with dNdS and annotation files in gtf or gff file format (a single annotation file for the query species and a folder of annotation files for the subject species), this function selects the best BLAST hit to represent either a gene locus (e.g. the splice variant of the gene locus with the lowest e-value) or an individual splice variant.

Usage

generate_ortholog_tables_all(
  dNdS_folder,
  annotation_file_query,
  annotation_folder_subject,
  output_folder,
  output_type = "gene_locus",
  format = c("gtf", "gtf")
)

Arguments

dNdS_folder

file path to a folder storing dNdS tables generated with dNdS and stored in a format compatible with read.dnds.tbl.

annotation_file_query

file path to the annotation file of the query species in gtf or gff file format.

annotation_folder_subject

file path to a folder storing the annotation files of the subject species in gtf or gff file format.

output_folder

file path to a folder in which the ortholog tables shall be stored.

output_type

type of ortholog table that shall be printed out (or stored in a variable). Available options are:

  • output_type = "gene_locus" (Default): find for each gene locus a representative splice variant that maximizes the sequence homology (in terms of the smallest e-value and, in case of equal e-values, the longest splice variant) to the subject gene locus and its representative splice variant. The output table contains only one representative splice variant per gene locus.

  • output_type = "splice_variant": for each homologous gene locus, determine for each splice variant its respective splice variant homolog. The output table contains several splice variants and their homologous splice variants per gene locus.
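
The difference between the two output types can be illustrated on a toy table of BLAST hits. The following is a minimal base R sketch of the selection rule described above, not the package's actual implementation; the data frame and all of its column names (query_gene_locus_id, query_id, subject_id, evalue, variant_length) are hypothetical and chosen only for illustration.

```r
# Toy BLAST-hit table. All column names and values are hypothetical.
hits <- data.frame(
  query_gene_locus_id = c("geneA", "geneA", "geneA", "geneA", "geneB", "geneB"),
  query_id            = c("A.1", "A.1", "A.2", "A.3", "B.1", "B.2"),
  subject_id          = c("X.1", "X.2", "X.2", "X.1", "Y.1", "Y.1"),
  evalue              = c(1e-50, 1e-30, 1e-80, 1e-80, 1e-20, 1e-10),
  variant_length      = c(300, 300, 250, 410, 120, 500),
  stringsAsFactors    = FALSE
)

# "gene_locus"-style selection: within each locus, sort by smallest e-value
# and break ties by the longest splice variant, then keep the first row.
by_locus <- hits[order(hits$query_gene_locus_id,
                       hits$evalue,
                       -hits$variant_length), ]
gene_locus_tbl <- by_locus[!duplicated(by_locus$query_gene_locus_id), ]

# "splice_variant"-style selection: keep the best hit for every splice
# variant, so a locus can contribute several rows.
by_variant <- hits[order(hits$query_id,
                         hits$evalue,
                         -hits$variant_length), ]
splice_variant_tbl <- by_variant[!duplicated(by_variant$query_id), ]

print(gene_locus_tbl[, c("query_gene_locus_id", "query_id", "subject_id")])
print(splice_variant_tbl[, c("query_gene_locus_id", "query_id", "subject_id")])
```

In this toy data, geneA has two variants with the same e-value (A.2 and A.3), so the longer one is kept as the locus representative, while the splice-variant table retains one row for each of the five variants.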

format

a vector of length 2 storing the annotation file formats of the query annotation file and subject annotation file: either gtf or gff format. E.g. format = c("gtf","gtf").
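
A call combining the arguments above might look like the following sketch. All folder and file paths are hypothetical placeholders, and the call is only attempted when orthologr is installed and those placeholder inputs actually exist.

```r
# Hypothetical placeholder paths; replace them with real locations.
args <- list(
  dNdS_folder               = "dnds_tables",
  annotation_file_query     = "annotation/query_species.gtf",
  annotation_folder_subject = "annotation/subject_species",
  output_folder             = "ortholog_tables",
  output_type               = "gene_locus",
  format                    = c("gtf", "gtf")  # query format, subject format
)

# Check that the (assumed) input locations are present before calling.
inputs_exist <- dir.exists(args$dNdS_folder) &&
  file.exists(args$annotation_file_query) &&
  dir.exists(args$annotation_folder_subject)

if (requireNamespace("orthologr", quietly = TRUE) && inputs_exist) {
  do.call(orthologr::generate_ortholog_tables_all, args)
} else {
  message("Skipping call: orthologr or the placeholder inputs are not available.")
}
```

Switching output_type to "splice_variant" in the same argument list would request the per-variant table instead.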

Author(s)

Hajk-Georg Drost


HajkD/orthologr documentation built on Oct. 13, 2023, 12:11 a.m.