update_phylota: Update Phylota Clusters


View source: R/update_phylota.R

Description

This function updates a set of PhyloTa clusters using sequences from GenBank.

Make sure BLAST is installed on your computer and available in your PATH. Windows users can follow the steps outlined at the following link:

https://dmcase.blogspot.com.br/2014/02/how-to-install-blast-on-windows.html

Instructions for macOS users:

https://gist.github.com/knmkr/5393474
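
Before running the pipeline, it can help to confirm that BLAST is actually visible from R. The following is a minimal sketch using base R's Sys.which(); the helper name blast_on_path is hypothetical (it is not part of rPHYLOTA), and the choice of blastn and makeblastdb as the executables to check is an assumption about which BLAST+ tools are needed.

```r
# Hypothetical helper (not part of rPHYLOTA): report which BLAST+
# executables can be found on the PATH. The default tool names are an
# assumption; adjust them to whatever your installation provides.
blast_on_path <- function(tools = c("blastn", "makeblastdb")) {
  paths <- Sys.which(tools)   # "" for every tool that is not found
  found <- nzchar(paths)      # TRUE where a path was returned
  names(found) <- tools
  found
}

status <- blast_on_path()
if (!all(status)) {
  message("Not found on PATH: ",
          paste(names(status)[!status], collapse = ", "))
}
```

If any tool is reported as missing, revisit the installation instructions above and restart R so that the updated PATH is picked up.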

In short, this function first downloads all the available orthologous clusters for a given lineage. Then, it looks for non-overlapping species between the sampled (i.e., PhyloTa) and newly published (i.e., GenBank) sequences. This function also allows the inclusion of outgroup species.

Each new sequence is then BLASTed against each orthologous cluster and merged when a significant match is found. Next, taxonomy is corrected and sampling summaries are exported.
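
The two core steps described above (finding species that are missing from PhyloTa, then assigning new sequences to clusters) can be illustrated with toy data in base R. This is only a sketch of the idea, not the package's implementation: the species lists, the hit table, its column names, and the e-value cutoff of 1e-10 are all assumptions made for illustration.

```r
# Toy data: species already sampled in PhyloTa vs. species now in GenBank.
phylota_species <- c("Sphyraena barracuda", "Sphyraena jello")
genbank_species <- c("Sphyraena barracuda", "Sphyraena jello",
                     "Sphyraena argentea")

# Non-overlapping species: present in GenBank but absent from PhyloTa.
new_species <- setdiff(genbank_species, phylota_species)

# Toy BLAST results: one row per (new sequence, cluster) comparison.
hits <- data.frame(
  query   = c("seqA", "seqA", "seqB"),
  cluster = c("cl1", "cl2", "cl1"),
  evalue  = c(1e-50, 1e-3, 0.5),
  stringsAsFactors = FALSE
)

evalue_cutoff <- 1e-10  # assumed significance threshold

# Keep significant hits only, then the best (lowest e-value) per query.
significant <- hits[hits$evalue < evalue_cutoff, ]
significant <- significant[order(significant$query, significant$evalue), ]
best <- significant[!duplicated(significant$query), ]

print(new_species)
print(best)
```

In this toy example, seqA would be merged into cluster cl1, while seqB has no significant match and is left out.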

Additional arguments are provided for users interested in performing multiple sequence alignment or using aliscore.

Usage

update_phylota(lineage, nsamples, genes, MSA, ALI, outgroup, correct_db, delete_all, c_directory)

Arguments

lineage

String: A given lineage (e.g. "Sphyraena")

nsamples

Numeric: If sequence names are not provided, the number of sequences to use when assessing sequence identity (i.e., more sequences will take more time to process, but the likelihood of a correct identification is higher). I suggest using at least 5

genes

String: Which genes should the pipeline look for? Use previous studies as a guide or consult the PhyloTa database

MSA

Logical: Align sequences?

ALI

Logical: Run Aliscore (only if MSA = TRUE)?

outgroup

String: A vector of strings containing the names of outgroup species

correct_db

Logical: Whether an additional correction step is conducted after the new sequences are included. Subspecies and unidentified species ("sp.") are removed from the resulting files

delete_all

Logical: Delete all intermediate files

c_directory

Logical: Should the function work in your current working directory?
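
To give a feel for what the correct_db cleanup might involve, here is a minimal base R sketch of one possible reading of "subspecies and sp. are removed": names with more than two words (treated as subspecies) and names containing "sp." are dropped. The example names and the filtering rule are assumptions for illustration; the package's actual correction step may differ.

```r
# Toy taxon names; the filtering rule below is an assumption, not the
# package's documented behavior.
species <- c("Alcedo atthis",
             "Alcedo atthis bengalensis",  # trinomial: treated as subspecies
             "Ceyx sp.",                   # unidentified species
             "Todiramphus chloris")

n_words         <- lengths(strsplit(species, " +"))
is_unidentified <- grepl(" sp\\.( |$)", species)

# Keep plain binomials only.
cleaned <- species[n_words == 2 & !is_unidentified]
print(cleaned)
```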

Value

df

A data frame containing all the species that were included

Author(s)

Cristian Roman-Palacios

References

Sanderson, M. J., Boss, D., Chen, D., Cranston, K. A., & Wehe, A. (2008). The PhyLoTA Browser: processing GenBank for molecular phylogenetics research. Systematic Biology, 57(3), 335-346.

Examples

## Not run: 
update_phylota("Alcedinidae", genes = c("RAG-1", "ND2"), "Actenoides hombroni", ALI = TRUE, MSA = TRUE, c_directory = TRUE)

## End(Not run)

cromanpa94/rPHYLOTA documentation built on Nov. 4, 2019, 9:18 a.m.