kmerSplit: Split gene groups based on similarity
In thomasp85/FindMyFriends: Microbial Comparative Genomics in R


This function splits gene groups based on the cosine similarity of kmer feature vectors. It uses hard splitting based on a similarity cutoff: members whose similarity falls below the cutoff are disconnected, and each unconnected component constitutes a new group. Unlike neighborhoodSplit, paralogues cannot be forced into separate groups, as the information needed for this is not present.
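The idea can be illustrated with a minimal base R sketch. It is not the package's implementation: the helper names (`kmers_of`, `kmer_matrix`, `components`), the toy sequences, and the variable `lower_limit` are all hypothetical and only show how cosine similarity, a hard cutoff, and connected components fit together.

```r
# Illustrative sketch only; not the FindMyFriends implementation.

# All overlapping kmers of length k in one sequence string.
kmers_of <- function(seq, k) {
  starts <- seq_len(nchar(seq) - k + 1)
  substring(seq, starts, starts + k - 1)
}

# Kmer count matrix: one row per sequence, one column per observed kmer.
kmer_matrix <- function(seqs, k) {
  kmers <- lapply(seqs, kmers_of, k = k)
  vocab <- sort(unique(unlist(kmers)))
  do.call(rbind, lapply(kmers, function(x) {
    as.numeric(table(factor(x, levels = vocab)))
  }))
}

# Label connected components of a logical adjacency matrix (breadth-first).
components <- function(adj) {
  label <- rep(NA_integer_, nrow(adj))
  current <- 0L
  for (i in seq_len(nrow(adj))) {
    if (!is.na(label[i])) next
    current <- current + 1L
    label[i] <- current
    queue <- i
    while (length(queue) > 0) {
      node <- queue[1]
      queue <- queue[-1]
      nb <- which(adj[node, ] & is.na(label))
      label[nb] <- current
      queue <- c(queue, nb)
    }
  }
  label
}

# Toy "gene group" of four sequences (made up for illustration).
seqs <- c("ATGCATGCATGC", "ATGCATGCATGG", "TTTTTTTTTTTT", "TTTTTTTTTTTA")
m <- kmer_matrix(seqs, k = 3)

# Pairwise cosine similarity of the kmer count vectors.
norms <- sqrt(rowSums(m^2))
sim <- tcrossprod(m) / outer(norms, norms)

# Hard cutoff: similarities below the limit are set to 0.
lower_limit <- 0.8
sim[sim < lower_limit] <- 0

# Each connected component becomes a new group.
groups <- components(sim > 0)
groups  # 1 1 2 2
```

Here the first two sequences stay together, as do the last two, while the original group is split because no similarity above the cutoff links the two pairs.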

kmerSplit(object, ...)

## S4 method for signature 'pgVirtual'
kmerSplit(object, kmerSize, lowerLimit, maxLengthDif,
  pParam)

`object`	A pgVirtual subclass
`...`	Arguments passed on to methods
`kmerSize`	The length of kmers used for sequence similarity
`lowerLimit`	The lower limit of sequence similarity. Similarities below this value are set to 0
`maxLengthDif`	The maximum deviation in sequence length to allow. Values between 0 and 1 are interpreted as a relative (fractional) difference; values above 1 are interpreted as a fixed length
`pParam`	An optional BiocParallelParam object that defines the workers used for parallelisation.
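The two readings of `maxLengthDif` can be sketched as follows. This is an assumption-laden illustration, not the package's code: the helper `length_ok` is hypothetical, the fractional case is assumed to be measured relative to the longer sequence, and a value of exactly 1 is arbitrarily treated as a fixed length.

```r
# Hypothetical helper illustrating one possible reading of maxLengthDif.
length_ok <- function(len_a, len_b, max_length_dif) {
  diff <- abs(len_a - len_b)
  if (max_length_dif < 1) {
    # Assumption: the fraction is taken relative to the longer sequence.
    diff / max(len_a, len_b) <= max_length_dif
  } else {
    # Fixed difference in sequence length.
    diff <= max_length_dif
  }
}

length_ok(100, 90, 0.2)  # relative check: 10 / 100 against 0.2
length_ok(100, 80, 10)   # fixed check: 20 against 10
```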

A new pgVirtual subclass object of the same class as 'object'

pgVirtual: Kmer similarity based group splitting for pgVirtual subclasses

Other group-splitting: neighborhoodSplit

# Get a grouped pangenome
pg <- .loadPgExample(withGroups = TRUE)

## Not run: 
# Split groups by similarity (Too heavy to include)
pg <- kmerSplit(pg, lowerLimit = 0.8)

## End(Not run)