orthologsFromBLATpslTable: Parses 'psl' default BLAT output formatted files to infer...

View source: R/ortholog_funks.R

orthologsFromBLATpslTable    R Documentation

Description

Parses BLAT output files in the default 'psl' format to infer orthologs between species. Perl-style regular expressions identify which gene identifiers belong to which species. With two such regular expressions, orthologs between two species can be extracted even when the BLAT output file contains similarity pairs from more than two species. Note that orthology is inferred from reciprocal best search results (reciprocal best hits).
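
The reciprocal best hit idea can be sketched on a toy table. This is a minimal illustration, not the package's implementation: the column names `query`, `target` and `score`, and the helper `best_hit`, are hypothetical, and `score` stands in for whatever similarity measure is derived from the psl columns.

```r
# Toy similarity table; column names are illustrative, not those of a real psl file.
hits <- data.frame(
  query  = c("CAHR1", "CAHR1", "AT1G01010", "AT1G01020", "CAHR2"),
  target = c("AT1G01010", "AT1G01020", "CAHR1", "CAHR2", "AT1G01010"),
  score  = c(900, 400, 900, 300, 500),
  stringsAsFactors = FALSE
)

# Hypothetical helper: for each query keep the single highest-scoring target.
best_hit <- function(df) {
  ordered <- df[order(df$query, -df$score), ]
  ordered[!duplicated(ordered$query), c("query", "target")]
}

best <- best_hit(hits)

# A pair is reciprocal when the target's own best hit points back at the query.
back <- best$target[match(best$target, best$query)]
reciprocal <- best[!is.na(back) & back == best$query, ]
```

In this toy data only CAHR1 and AT1G01010 choose each other, so `reciprocal` holds that pair once in each direction.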

Usage

orthologsFromBLATpslTable(blat.df, query.col = 10, target.col = 14,
  matching.bases.col = 1, qSize.col = 11, tSize.col = 15,
  spec.regexs = c("CAHR\\d+(\\.\\d)?",
  "AT[0-9MC]G\\d+(\\.\\d)?"))

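The default column numbers in the usage above line up with the standard 21-field psl layout. The sketch below only illustrates that mapping; the vector names `psl_fields` and `defaults` are made up for this example, and a data frame read by other means may order its columns differently.

```r
# Field names of the standard psl format, in file order.
psl_fields <- c(
  "matches", "misMatches", "repMatches", "nCount",
  "qNumInsert", "qBaseInsert", "tNumInsert", "tBaseInsert",
  "strand", "qName", "qSize", "qStart", "qEnd",
  "tName", "tSize", "tStart", "tEnd",
  "blockCount", "blockSizes", "qStarts", "tStarts"
)

# Default argument values taken from the usage section.
defaults <- c(
  matching.bases.col = 1, query.col = 10, qSize.col = 11,
  target.col = 14, tSize.col = 15
)

# Which psl field each default points at.
mapped <- setNames(psl_fields[defaults], names(defaults))
```
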
Arguments

blat.df

a data frame holding the output of BLAT in psl format. See function readBlatPSLoutput(...) for more details.

query.col

the number of the column in which to look up the query sequence's name

target.col

the number of the column in which to look up the target sequence's name

matching.bases.col

the number of the column in which to look up the number of nucleotide bases that match between query and target

qSize.col

the number of the column in which to look up the query sequence's size

tSize.col

the number of the column in which to look up the target sequence's size

spec.regexs

a character vector containing two regular expressions, each matching the gene identifiers of one of the two species between which orthologs are to be found
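
As a rough illustration of how the two default expressions in spec.regexs separate identifiers, the sketch below classifies a few made-up gene identifiers with base R's `grepl`. The identifiers and the labels "A" and "B" are invented for this example; how the function applies the expressions internally is not shown here.

```r
# Default regular expressions from the usage section.
spec.regexs <- c("CAHR\\d+(\\.\\d)?", "AT[0-9MC]G\\d+(\\.\\d)?")

# Made-up identifiers; the last one matches neither expression.
ids <- c("CAHR12345.1", "AT1G01010.1", "ATMG00010", "Bra000001")

# perl = TRUE selects Perl-compatible matching.
is_species_a <- grepl(spec.regexs[1], ids, perl = TRUE)
is_species_b <- grepl(spec.regexs[2], ids, perl = TRUE)

# Label each identifier by the expression it matches, NA if neither.
species <- ifelse(is_species_a, "A",
                  ifelse(is_species_b, "B", NA_character_))
```

Identifiers that match neither expression, such as those of a third species in the same BLAT output, end up as `NA` and can be dropped before pairing.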

Value

A data frame of two columns holding the gene identifiers of the orthologous gene pairs. To ease lookup, each pair is also included in inverse order.
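
Why storing inverse pairs eases lookup can be seen in a small sketch. The column names `V1` and `V2` and the identifiers are assumptions made for illustration; the actual column names of the returned data frame are not documented above.

```r
# Hypothetical ortholog pairs, one direction only.
pairs <- data.frame(
  V1 = c("CAHR1", "CAHR2"),
  V2 = c("AT1G01010", "AT2G02020"),
  stringsAsFactors = FALSE
)

# Append each pair in inverse order and drop exact duplicates.
inverse <- data.frame(V1 = pairs$V2, V2 = pairs$V1, stringsAsFactors = FALSE)
both <- unique(rbind(pairs, inverse))

# With inverse pairs present, a gene from either species can be
# looked up by filtering the first column only.
partner_of_at <- both$V2[both$V1 == "AT1G01010"]
partner_of_ca <- both$V2[both$V1 == "CAHR2"]
```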


asishallab/GeneFamilies documentation built on May 22, 2023, 11:30 a.m.