getcut.fun: Tzeng's Method: Finding the Best Number of Clusters

View source: R/tzeng-main.functions.forRDscoreTest.public.r

getcut.fun    R Documentation

Description

For SNP sequences only. Tzeng's method (2005) uses an evolutionary approach to group haplotypes based on a deterministic transformation of haplotype frequencies. This function finds the best number of clusters based on Shannon information content.

Usage

getcut.fun(pp.org, nn, plot = 0)

Arguments

pp.org

frequencies of haplotypes, sorted in decreasing order.

nn

number of haplotypes.

plot

whether to illustrate the result in a plot. The default 0 draws no plot; 1 (as in the example) draws one.

Details

pp.org is summarized from X in haplo.post.prob, and nn is equal to the number of rows of X.

This function is called by haplo.post.prob to determine the best guess of the number of clusters. See Tzeng (2005) and Shannon (1948) for details.
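
As a rough illustration of the quantity involved, the sketch below computes the cumulative Shannon information (in bits) carried by the k most frequent haplotypes. It is not the cut rule used by getcut.fun, which is described in Tzeng (2005). The helper cum.shannon.info and the frequencies in pp.toy are hypothetical and made up for this example, and all frequencies are assumed to be strictly positive.

```r
# Hypothetical helper, not part of phyclust: cumulative Shannon information
# (in bits) of the k most frequent haplotypes, for k = 1, ..., length(pp).
# Assumes every frequency is strictly positive (0 * log2(0) would give NaN).
cum.shannon.info <- function(pp) {
  pp <- sort(pp, decreasing = TRUE)   # same ordering expected for pp.org
  cumsum(-pp * log2(pp))              # running sum of -p * log2(p)
}

# Made-up frequencies for five haplotypes (they sum to 1).
pp.toy <- c(0.5, 0.25, 0.125, 0.0625, 0.0625)
info <- cum.shannon.info(pp.toy)
print(info)
```

The increments shrink as rarer haplotypes are added, which is the kind of diminishing return a cut-point criterion can exploit when deciding how many haplotypes to keep as cluster centers.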

Value

Returns the best guess of the number of clusters.

Author(s)

Jung-Ying Tzeng.

Maintainer: Wei-Chen Chen <wccsnow@gmail.com>

References

Phylogenetic Clustering Website: https://snoweye.github.io/phyclust/

Tzeng, J.Y. (2005) “Evolutionary-Based Grouping of Haplotypes in Association Analysis”, Genetic Epidemiology, 28, 220-231. https://www4.stat.ncsu.edu/~jytzeng/software.php

Shannon, C.E. (1948) “A mathematical theory of communication”, Bell System Technical Journal, 27, 379-423, 623-656.

See Also

haplo.post.prob.

Examples

## Not run: 
library(phyclust, quietly = TRUE)

# Locate the example SNP data shipped with phyclust.
data.path <- system.file("data", "crohn.phy", package = "phyclust")
my.snp <- read.phylip(data.path, code.type = "SNP")
# Estimate haplotype frequencies, then pass them sorted in decreasing order.
ret <- haplo.post.prob(my.snp$org, ploidy = 1)
getcut.fun(sort(ret$haplo$hap.prob, decreasing = TRUE),
           nn = my.snp$nseq, plot = 1)

## End(Not run)

snoweye/phyclust documentation built on Sept. 12, 2023, 5 a.m.