# fishphylo

The goal of fishphylo is to import sequence data from GenBank and from FASTA files, and to build basic phylogenetic trees from them.

## Installation

You can install the development version from GitHub with:

``` {r, eval = TRUE, results='hide'}
install.packages("devtools")
devtools::install_github("jurenoult/fishphylo")
```

# Use
## Import sequences from GenBank and export a .fasta file

Load this project as a library.
```r
library(fishphylo)
```

You also need the msa package, which is available on Bioconductor. You can install it with the following commands:

```r
if (!requireNamespace("BiocManager", quietly = TRUE))
  install.packages("BiocManager")
BiocManager::install("msa")
library(msa)
```

Now specify the names of the taxa and the gene for which you want to download DNA sequences from GenBank:

```r
ls_tax_gen <- cb_taxa_gene(c("Pomatoschistus lozanoi", "Pomatoschistus adriaticus"), "COI")
ls_tax_gen
```

Write a fasta file with the sequences:

```r
ls_fasta <- build_fasta(ls_tax_gen)
write_fasta(ls_fasta, "pomlozANDpomadri_COI_seqs.fasta")
```

The file is saved in the :file_folder: data folder.
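
For readers unfamiliar with the format, the sketch below shows what a fasta file contains: a header line starting with `>` followed by the sequence. It is a base-R illustration only. The helper names `write_fasta_sketch` and `read_fasta_sketch` and the accession labels are hypothetical and are not part of fishphylo, whose own `write_fasta` and `read_fasta` may behave differently.

```r
# Hypothetical helpers (NOT part of fishphylo): a minimal fasta round trip in base R.
write_fasta_sketch <- function(seqs, path) {
  # seqs: named character vector; each name becomes a header line
  lines <- as.vector(rbind(paste0(">", names(seqs)), unname(seqs)))
  writeLines(lines, path)
  invisible(path)
}

read_fasta_sketch <- function(path) {
  lines <- readLines(path)
  lines <- lines[nzchar(lines)]            # drop empty lines
  is_header <- startsWith(lines, ">")
  record <- cumsum(is_header)              # each header starts a new record
  # Concatenate the sequence lines that belong to the same record
  # (assumes every header is followed by at least one sequence line)
  seqs <- tapply(lines[!is_header], record[!is_header], paste, collapse = "")
  setNames(as.vector(seqs), sub("^>", "", lines[is_header]))
}

# Placeholder accessions and sequences, made up for illustration
seqs <- c(
  "ACC0001 Pomatoschistus lozanoi"    = "ACGTACGTAC",
  "ACC0002 Pomatoschistus adriaticus" = "ACGTTCGTAC"
)
tmp <- tempfile(fileext = ".fasta")
write_fasta_sketch(seqs, tmp)
back <- read_fasta_sketch(tmp)
```

Reading the file back should return the same named vector that was written, which is a quick way to check that nothing was lost on export.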

## Combine sequences from two fasta files

Read a fasta file generated by the write_fasta function (from GenBank), display the accession numbers and species names, and keep only the accessions specified by a vector of indices:

```r
fas1 <- read_fasta("pomlozANDpomadri_COI_seqs.fasta")
disp_access(fas1)   # display accession numbers and species names
keep <- c(1:4, 9)   # select accessions 1 to 4 and accession 9
fas1 <- fas1[keep]  # create a fasta object with only the selected accessions
```

Read a fasta file located in the data folder (e.g., sequences sent by Laure) and combine its accessions with those of the first fasta object:

```r
fas2 <- read_fasta("20201104_COI_fish.fasta")
fas <- c(fas1, fas2)
# The combined list of accessions must be written to a fasta file before alignment
write_fasta(fas, "combined_seq.fasta")
```
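
When two files are combined, the same accession can end up in the result twice. The sketch below uses plain named lists as stand-ins for the objects returned by read_fasta (an assumption, the real class may differ) to show how index-based selection, `c()`, and a duplicate check fit together. The accession names are made up.

```r
# Stand-ins for read_fasta() results; plain named lists are an assumption.
fas1_demo <- list(acc_A = "ACGTACGT", acc_B = "ACGTTCGT", acc_C = "ACGAACGT")
fas2_demo <- list(acc_C = "ACGAACGT", acc_D = "TCGTACGT")

keep_demo <- c(1, 3)                            # keep the first and third accession
combined <- c(fas1_demo[keep_demo], fas2_demo)  # c() concatenates lists and keeps names

# Drop accessions whose name already appeared earlier in the combined list
combined <- combined[!duplicated(names(combined))]
names(combined)
```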

## Build a simple NJ phylogenetic tree

First, we need to align the sequences. The align_fasta function only works on a fasta file located in the :file_folder: data folder:

```r
fas <- align_fasta("combined_seq.fasta")
```

Build an NJ tree based on a JC69 distance matrix, with bootstrap support:

```r
NJ_tree <- build_MLtree(fas, "example.tre")
# plot_tree(NJ_tree)
```
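
To make the JC69 distance concrete: for two aligned sequences with a proportion p of differing sites, the Jukes-Cantor distance is d = -(3/4) ln(1 - (4/3) p). The base-R sketch below computes this for every pair of sequences. The helper `jc69_dist` and the toy alignment are hypothetical illustrations, and this README does not say how fishphylo computes its distances internally.

```r
# Hypothetical helper (NOT part of fishphylo): pairwise JC69 distances in base R.
jc69_dist <- function(aln) {
  # aln: named character vector of aligned sequences, all of the same length
  chars <- strsplit(toupper(aln), "")
  stopifnot(length(unique(lengths(chars))) == 1)
  bases <- c("A", "C", "G", "T")
  n <- length(chars)
  d <- matrix(0, n, n, dimnames = list(names(aln), names(aln)))
  for (i in seq_len(n - 1)) {
    for (j in (i + 1):n) {
      # Compare only sites where both sequences carry an unambiguous base
      ok <- chars[[i]] %in% bases & chars[[j]] %in% bases
      p <- mean(chars[[i]][ok] != chars[[j]][ok])
      # JC69 correction; undefined (NaN or Inf) once p reaches 0.75
      d[i, j] <- d[j, i] <- -3 / 4 * log(1 - 4 * p / 3)
    }
  }
  d
}

# Toy alignment, made up for illustration
aln <- c(sp1 = "ACGTACGTAC", sp2 = "ACGTACGTAA", sp3 = "ACGAACGTTA")
d <- jc69_dist(aln)
round(d, 3)
```

A matrix like this is the usual input to a neighbor-joining routine. In the ape package, for example, `dist.dna(x, model = "JC69")` and `nj()` cover the same two steps.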


jurenoult/fishphylo documentation built on Jan. 1, 2021, 7:12 a.m.