clean_alignment: function to discard very poorly conserved regions from a...


View source: R/functions.R

Description

Very poorly conserved regions of an alignment are likely to be either not homologous between the sequences being considered, and so add no phylogenetic signal, or homologous but so diverged that they are very difficult to align accurately, and so may add noise to the phylogenetic analysis and decrease the accuracy of the inferred tree.

This function takes an alignment in PHYLIP format, produced by alignment software such as Clustal Omega, and cleans it based on a defined minimum percentage of non-gap positions and a defined minimum percentage of sequence identity between all sequences considered for the phylogenetic analysis.
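The filtering idea described above can be sketched in base R. This is only an illustration under stated assumptions, not the package's implementation: `filter_columns` is a hypothetical helper, the alignment is represented as a plain character vector of equal-length sequences with "-" as the gap character, and identity is assumed to mean the percentage of identical residue pairs within a column.

```r
# Hypothetical helper, not part of the package: keeps the alignment columns
# that pass both thresholds. `seqs` is a character vector of equal-length
# aligned sequences, with "-" marking gaps.
filter_columns <- function(seqs, minpcnongap, minpcid) {
  # One row per sequence, one column per alignment position
  mat <- do.call(rbind, strsplit(seqs, ""))
  keep <- vapply(seq_len(ncol(mat)), function(j) {
    column <- mat[, j]
    residues <- column[column != "-"]
    # Percentage of sequences with a residue (not a gap) at this position
    pc_nongap <- 100 * length(residues) / length(column)
    # Identity is undefined with fewer than two residues: drop the column
    if (length(residues) < 2) return(FALSE)
    # Percentage of residue pairs at this position that are identical
    pairs <- combn(residues, 2)
    pc_id <- 100 * mean(pairs[1, ] == pairs[2, ])
    pc_nongap >= minpcnongap && pc_id >= minpcid
  }, logical(1))
  # Rebuild each sequence from the retained columns only
  vapply(seq_len(nrow(mat)),
         function(i) paste(mat[i, keep], collapse = ""),
         character(1))
}

seqs <- c("MK-AL", "MKTAV", "MR-AL")
filter_columns(seqs, 50, 50)  # "MA" "MA" "MA"
filter_columns(seqs, 30, 30)  # "MKAL" "MKAV" "MRAL"
```

With both thresholds at 50, only the two fully conserved columns survive. Lowering them to 30 also keeps the columns where one of three residue pairs matches, while the mostly gapped third column is still dropped.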

Usage

clean_alignment(alignment, minpcnongap, minpcid)

Arguments

alignment

Object of type alignment, read from a PHYLIP-format file, containing the aligned sequences to be checked for similarity and cleaned. The PHYLIP file is generated by alignment software from a .fasta file containing the sequences, and is then loaded into the R environment.

minpcnongap

Integer. Minimum percentage of non-gap characters across the aligned sequences required at each alignment position being analysed.

minpcid

Integer. Minimum percentage of sequence identity across the aligned sequences required at each alignment position being analysed.

Value

The cleaned sequence alignment, in PHYLIP format.

Author(s)

Gerardo Esteban Antonicelli

See Also

'retrieve_seqs' 'print_alignment' 'load_alignment' 'make_tree' 'max_parsimony'

Examples

data(phylipProt)
cleaned_phylipProt <- clean_alignment(phylipProt, 30, 30)
print_alignment(cleaned_phylipProt)

geantonicelli/firstPackage documentation built on Aug. 24, 2020, 3:14 a.m.