lineagePath: Resolving lineage paths using SNP

View source: R/lineagePath.R

lineagePath R Documentation

Resolving lineage paths using SNP

Description

lineagePath finds the lineages of a phylogenetic tree, given the corresponding sequence alignment. It does so by finding 'major SNPs', which usually accumulate along the evolutionary pathways.

sneakPeek plots 'similarity' (actually the least percentage of 'major SNP') as a threshold against the number of output lineagePath. The plot is intended to give the user a rough idea of how many lineages to expect from a given 'similarity' threshold in the function lineagePath. The number of lineagePath should preferably be neither too many nor too few. Results where the number of lineagePath is greater than maxPath (by default, the number of tips divided by 20) are excluded. Results with zero lineagePath are also excluded.
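
The exclusion rule described above can be sketched in base R. This is only an illustration under stated assumptions, not the sitePath implementation: the peeked data frame, its column names, its values and the tip count are all made up for the example.

```r
# Hypothetical threshold scan: each 'similarity' threshold and the number
# of lineage paths it would yield (illustrative values, not real output).
peeked <- data.frame(
  similarity = c(0.01, 0.02, 0.05, 0.10, 0.20),
  pathNum    = c(40L, 12L, 5L, 2L, 0L)
)
nTips <- 100L          # assumed number of tips in the tree
maxPath <- nTips / 20  # documented default when maxPath is not supplied

# Keep rows whose path count is non-zero and does not exceed maxPath.
kept <- peeked[peeked$pathNum > 0 & peeked$pathNum <= maxPath, ]
print(kept)
```

With these made-up numbers, only the thresholds that yield 5 and 2 paths survive the filter.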

When used on the return value of sneakPeek, the lineagePath with the closest similarity is retrieved from that value.
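
One way to picture the "closest similarity" lookup is the base R sketch below. The vector storedSimilarity and the variable requested are hypothetical stand-ins, and the way sitePath breaks ties is not documented here, so treat this as an assumption-laden illustration rather than the package's own logic.

```r
# Hypothetical thresholds stored in a sneakPeek-like result.
storedSimilarity <- c(0.01, 0.04, 0.07, 0.10)
requested <- 0.05

# Index of the stored threshold with the smallest absolute distance to
# the requested value; which.min() returns the first index on a tie.
closest <- which.min(abs(storedSimilarity - requested))
print(storedSimilarity[closest])
```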

similarity has no effect when used on a paraFixSites object.

Usage

lineagePath(tree, similarity, ...)

## S3 method for class 'phylo'
lineagePath(
  tree,
  similarity = NULL,
  alignment = NULL,
  seqType = c("AA", "DNA", "RNA"),
  reference = NULL,
  gapChar = "-",
  minSkipSize = NULL,
  ...
)

## S3 method for class 'treedata'
lineagePath(tree, ...)

## S3 method for class 'phyMSAmatched'
lineagePath(
  tree,
  similarity = NULL,
  simMatrix = NULL,
  forbidTrivial = TRUE,
  ...
)

sneakPeek(tree, step = 9, maxPath = NULL, minPath = 0, makePlot = TRUE)

## S3 method for class 'sneakPeekedPaths'
lineagePath(tree, similarity, ...)

## S3 method for class 'paraFixSites'
lineagePath(tree, similarity = NULL, ...)

Arguments

tree

The return value of the addMSA or sneakPeek function.

similarity

The parameter for identifying phylogenetic pathways using SNPs. If provided as a fraction between 0 and 1, the minimum number of SNPs will be the total number of tips times similarity. If provided as an integer greater than 1, the minimum number will be similarity itself.
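
The two cases described for this argument can be sketched as follows. minCount is a hypothetical helper written for this illustration and is not part of sitePath; whether the package rounds the product is not stated here, so the sketch leaves it unrounded.

```r
# Hypothetical helper (not part of sitePath): turn a threshold into a
# minimum count, following the two documented cases.
minCount <- function(similarity, nTips) {
  if (similarity > 0 && similarity < 1) {
    # Fraction between 0 and 1: scale by the total number of tips.
    similarity * nTips
  } else {
    # Otherwise treat the value as the count itself
    # (a value of exactly 1 is handled here by assumption).
    similarity
  }
}

fromFraction <- minCount(0.05, nTips = 200)
fromCount <- minCount(10, nTips = 200)
print(c(fromFraction, fromCount))
```

On a tree of 200 tips, a fraction of 0.05 and a count of 10 therefore describe the same threshold.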

...

Other arguments.

alignment

An alignment object, commonly obtained from a sequence parsing function in the seqinr package. Sequence names in the alignment should include every tip.label in the tree.

seqType

The type of the sequence in the alignment file. The default is "AA" for amino acid. The other options are "DNA" and "RNA".

reference

Name of the reference sequence for site numbering. The name has to be one of the sequence names in the alignment. The default uses the intrinsic alignment numbering.

gapChar

The character that indicates a gap. The numbering skips gapChar in the reference sequence.
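
A minimal base R sketch of gap-skipping numbering is shown below. The sequence refSeq is made up, and marking gap columns as NA is an assumption of this sketch; how sitePath stores the numbering internally is not described here.

```r
# Hypothetical aligned reference sequence; "-" marks gaps.
refSeq <- "MK-NP--KK"
gapChar <- "-"

chars <- strsplit(refSeq, "")[[1]]
isGap <- chars == gapChar

# Non-gap residues are numbered consecutively; gap columns get NA.
refNumber <- ifelse(isGap, NA_integer_, cumsum(!isGap))
print(data.frame(column = seq_along(chars), residue = chars, refNumber = refNumber))
```

Here alignment column 4 (residue N) becomes reference position 3, because the gap in column 3 is not counted.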

minSkipSize

The minimum number of tips that must have a gap or an ambiguous amino acid/nucleotide at a site for that site to be ignored in other analyses. This does not affect the numbering. The default is 0.8.

simMatrix

Deprecated; has no effect.

forbidTrivial

Does not allow trivial trimming.

step

The 'similarity' window for calculating and plotting, used to better see the impact of the threshold on path number. The default is 9.

maxPath

Maximum number of paths to return and show in the plot. The number of paths in the raw tree can be far greater than in the trimmed tree, so setting this helps to better see the impact of the threshold on path number. It is preferably specified. The default is one twentieth of the number of tree tips.

minPath

Minimum number of paths to return and show in the plot, used to better see the impact of the threshold on path number. The default is 0.

makePlot

Whether to make a plot on return.

Value

Lineage paths represented by node numbers.

sneakPeek returns the similarity threshold against the number of lineagePath. If makePlot is TRUE, a simple dot plot of threshold against path number is also drawn.

Examples

data('zikv_tree')
data('zikv_align')
tree <- addMSA(zikv_tree, alignment = zikv_align)
lineagePath(tree)
sneakPeek(tree, step = 3)
x <- sneakPeek(tree, step = 3)
lineagePath(x, similarity = 0.05)

wuaipinglab/sitePath documentation built on Sept. 26, 2022, 10:16 p.m.