read.vhica | R Documentation |
The VHICA method relies on two sources of information: (i) the divergence between sequences, and (ii) the codon usage bias. This function reads two data files and creates an object of class vhica that can be further explored with plot.vhica and image.vhica. Input can be either (1) two vectors of FASTA file names (one for the reference genes, one for the putatively transferred genes), or (2) already processed files containing codon usage bias and divergence data (see Details).
read.vhica(gene.fasta=NULL, target.fasta=NULL,
cb.filename=NULL, div.filename=NULL,
reference = "Gene", divergence = "dS",
CUB.method="ENC", div.method="LWL85", div.pairwise=TRUE,
div.max.lim=3, species.sep="_", gene.sep=".", family.sep=".", ...)
gene.fasta |
Sequence files (FASTA format) containing the aligned sequences (respecting the translation phase) for all species of the reference genes. |
target.fasta |
Sequence files (FASTA format) containing the aligned sequences of the putatively transferred genes. |
cb.filename |
File name for the codon usage bias data. If FASTA files are provided, this file will be created. |
div.filename |
File name for the divergence data. If FASTA files are provided, this file will be created. |
reference |
Name of the reference type in the codon usage file. Default is "Gene". |
divergence |
Name of the divergence column in the divergence file. Default is "dS". |
CUB.method |
Method to be used for Codon Usage Bias calculation (see |
div.method |
Method to be used for divergence calculation (see |
div.pairwise |
Whether divergence should be calculated from the whole alignment or between pairs of sequences
(see |
div.max.lim |
Maximum divergence score. Estimated divergences much larger than 100% are likely to be problematic and should not be considered. |
species.sep |
Separator for species (or equivalent) labels in sequence names. Any character string following this separator will be disregarded – be careful about potential duplicates. |
gene.sep |
Separator for gene names from gene sequence files. |
family.sep |
Separator for target sequence sub-families. |
... |
Further parameters for the internal function |
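The warning about species.sep and duplicates can be illustrated with a small base R sketch. It is not the package's internal code: the sequence names are hypothetical, and splitting at the first occurrence of the separator is an assumption made for the illustration.

```r
# Hypothetical sequence names of the form <species><separator><anything else>
seq.names <- c("dmel_CG4231", "dsim_CG4231", "dana_scaffold12", "dana_scaffold47")
species.sep <- "_"

# Keep only what precedes the separator; fixed = TRUE so that a separator
# such as "." is matched literally rather than as a regular expression
species <- vapply(strsplit(seq.names, species.sep, fixed = TRUE),
                  function(parts) parts[1], character(1))

# Species labels that appear more than once after truncation
dup.species <- unique(species[duplicated(species)])

print(species)
print(dup.species)
```

Here the two "dana" sequences collapse to the same label once the text after the separator is dropped, which is the kind of duplicate worth checking for before reading the files.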
Details about CUB and divergence calculations can be found in CUB and div. If CUB and/or divergence need to be calculated by an external program, it is possible to provide them in the following format:
Codon usage bias. Example of data file:
        Type  sp1   sp2   sp3
CG4231  Gene  42.3  51.1  47.2
CG2214  Gene  47.2  44.9  53.2
Pelem1  TE    36.2  47.0  44.4
...
Row names (or first column): sequence index
Type: whether the sequence is a reference (default: Gene) or a focal sequence (transposable element, ...)
Following columns: a measurement of codon bias (ENC, CBI...) for every species
Divergence. Example of data file:
seq     dS    sp1   sp2
CG4231  0.84  Dmel  Dsim
CG4231  0.46  Dmel  Dana
CG4231  0.58  Dsim  Dana
CG2214  0.10  Dmel  Dsim
...
First column (or row names): sequence index
Second column: divergence measurement
Columns 3 and 4: the pair of species on which the divergence is calculated
Row names and column names are allowed but disregarded
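As a minimal sketch, the two example tables above could be written from R with base functions. The tab separator, the header row, and the use of temporary files are assumptions of this illustration; the exact delimiter expected by read.vhica is not stated here and should be checked against the example files shipped with the package.

```r
# Codon usage bias: row names are sequence identifiers, "Type" separates
# reference genes from focal sequences, then one column per species
cbias <- data.frame(
  Type = c("Gene", "Gene", "TE"),
  sp1  = c(42.3, 47.2, 36.2),
  sp2  = c(51.1, 44.9, 47.0),
  sp3  = c(47.2, 53.2, 44.4),
  row.names = c("CG4231", "CG2214", "Pelem1")
)

# Divergence: sequence identifier, divergence value, then the species pair
div <- data.frame(
  seq = c("CG4231", "CG4231", "CG4231", "CG2214"),
  dS  = c(0.84, 0.46, 0.58, 0.10),
  sp1 = c("Dmel", "Dmel", "Dsim", "Dmel"),
  sp2 = c("Dsim", "Dana", "Dana", "Dsim")
)

# Write both tables as tab-delimited text (assumed layout)
cb.file  <- tempfile(fileext = ".txt")
div.file <- tempfile(fileext = ".txt")
write.table(cbias, cb.file, sep = "\t", quote = FALSE, col.names = NA)
write.table(div, div.file, sep = "\t", quote = FALSE, row.names = FALSE)

# Read the files back to confirm the layout survived the round trip
cbias.check <- read.table(cb.file, header = TRUE, sep = "\t", row.names = 1)
div.check   <- read.table(div.file, header = TRUE, sep = "\t")
```

The resulting paths would then be candidates for the cb.filename and div.filename arguments. A real analysis needs many more genes and species pairs than this toy data set.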
The function returns an object of class vhica, a list containing:
cbias: A codon bias array
div: The divergence matrix
reg: The result of all pairwise regressions
reference: The reference option
target: The sequence type that is not the reference
divergence: The divergence option
family.sep: The character used to indicate TE sub-families
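The components listed above can be inspected as ordinary list elements. The sketch below reuses the example files shipped with the package and only runs the package call when vhica is installed; the expected.fields vector is a helper introduced for this illustration, copied from the list above.

```r
# Component names as documented for the returned object
expected.fields <- c("cbias", "div", "reg", "reference",
                     "target", "divergence", "family.sep")

if (requireNamespace("vhica", quietly = TRUE)) {
  file.cb  <- system.file("extdata", "mini-cbias.txt", package = "vhica")
  file.div <- system.file("extdata", "mini-div.txt", package = "vhica")
  vc <- vhica::read.vhica(cb.filename = file.cb, div.filename = file.div)

  print(class(vc))
  # Documented components missing from the object, if any
  print(setdiff(expected.fields, names(vc)))
  # Compact overview of the codon bias and divergence data
  str(vc$cbias)
  str(vc$div)
}
```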
Implementation: Arnaud Le Rouzic
Scientists who designed the method: Gabriel Wallau, Aurelie Hua-Van, Arnaud Le Rouzic.
Gabriel Luz Wallau, Arnaud Le Rouzic, Pierre Capy, Elgion Loreto, Aurelie Hua-Van. VHICA: A new method to discriminate between vertical and horizontal transposon transfer: application to the mariner family within Drosophila. Molecular biology and evolution 33 (4), 1094-1109.
plot.vhica, image.vhica, CUB, div
file.cb <- system.file("extdata", "mini-cbias.txt", package="vhica")
file.div <- system.file("extdata", "mini-div.txt", package="vhica")
file.tree <- if(require("ape")) system.file("extdata", "phylo.nwk", package="vhica") else NULL
vc <- read.vhica(cb.filename=file.cb, div.filename=file.div)
plot(vc, "dere", "dana")
image(vc, "mellifera:6", treefile=file.tree, skip.void=TRUE)