simRank: Compute the SimRank Similarity between Sets of Sequences


Description

Computes the SimRank similarity between sequences: the number of unique k-mers shared by two sequences, divided by the number of unique k-mers in the sequence that has fewer of them.
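The definition above can be sketched in base R for a pair of plain character strings. This is only an illustration of the formula, not the rMSA implementation; the helper names unique_kmers() and simrank_pair() are hypothetical and are not part of the package.

```r
# Hypothetical helper (not part of rMSA): unique k-mers of one sequence
unique_kmers <- function(s, k = 7) {
  n <- nchar(s)
  if (n < k) return(character(0))          # sequence too short for any k-mer
  starts <- seq_len(n - k + 1)
  unique(substring(s, starts, starts + k - 1))
}

# Hypothetical helper (not part of rMSA): shared unique k-mers divided by
# the smaller of the two unique k-mer counts
simrank_pair <- function(a, b, k = 7) {
  ka <- unique_kmers(a, k)
  kb <- unique_kmers(b, k)
  denom <- min(length(ka), length(kb))
  if (denom == 0) return(NA_real_)         # undefined without k-mers
  length(intersect(ka, kb)) / denom
}

# The first string has 4 unique 3-mers, all of which occur in the second
# string (which has 6), so the similarity is 4 / 4 = 1.
simrank_pair("ACGTACGTAC", "ACGTACGTTT", k = 3)
```

Because the denominator is the smaller k-mer set, a short sequence whose k-mers are fully contained in a longer one gets similarity 1 even though the two sequences differ.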

Usage

simRank(x, k = 7)

Arguments

x

an object of class DNAStringSet containing the sequences.

k

length of the k-mers used.

Details

distSimRank() returns the corresponding dissimilarity, 1 - simRank().
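The relationship between similarity and distance can be illustrated with base R alone. The matrix S below contains made-up toy values, not output of simRank(), and the conversion with as.dist() is an assumption about how such a step could look rather than the package's actual code.

```r
# Toy similarity matrix with made-up values (NOT produced by simRank())
ids <- c("s1", "s2", "s3")
S <- matrix(c(1.0, 0.8, 0.1,
              0.8, 1.0, 0.2,
              0.1, 0.2, 1.0),
            nrow = 3, byrow = TRUE, dimnames = list(ids, ids))

# Dissimilarity is one minus similarity; as.dist() keeps the lower triangle
D <- as.dist(1 - S)
D

# A dist object can be passed directly to hierarchical clustering
hc <- hclust(D)
```

Highly similar sequences (here s1 and s2) end up at a small distance and are merged first by hclust().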

Value

simRank() returns a similarity object of class "simil" (see proxy). distSimRank() returns a dist object.

Author(s)

Michael Hahsler

References

DeSantis et al., Simrank: Rapid and sensitive general-purpose k-mer search tool, BMC Ecology 2011, 11:11

Examples

### load the package
library(rMSA)

### load sequences
sequences <- readDNAStringSet(system.file("examples/DNA_example.fasta",
  package = "rMSA"))
sequences

### compute similarity
simil <- simRank(sequences)

### use hierarchical clustering
hc <- hclust(distSimRank(sequences))
plot(hc)

mhahsler/rMSA documentation built on May 22, 2019, 8:55 p.m.