kmerSplit: Split gene groups based on similarity

Description

This function splits gene groups based on the cosine similarity of kmer feature vectors. It uses hard splitting at a similarity cutoff: similarities below the cutoff are discarded, and each resulting connected component becomes a new group. Unlike neighborhoodSplit, it cannot force paralogues into separate groups, as the information needed for this is not available.
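
The splitting idea described above can be illustrated with a small base R sketch. This is not the FindMyFriends implementation; the helpers kmerCounts and splitBySimilarity, the toy sequences, and the choice of k = 2 are all assumptions made for illustration. It counts kmers per sequence, computes pairwise cosine similarity, zeroes everything below the cutoff, and labels the connected components that remain.

```r
# Hypothetical helper: count overlapping kmers of size k in one sequence
kmerCounts <- function(s, k) {
  stopifnot(nchar(s) >= k)
  starts <- seq_len(nchar(s) - k + 1)
  table(substring(s, starts, starts + k - 1))
}

# Hypothetical helper: group sequences by thresholded cosine similarity
splitBySimilarity <- function(seqs, k = 2, lowerLimit = 0.5) {
  n <- length(seqs)
  counts <- lapply(seqs, kmerCounts, k = k)
  vocab <- unique(unlist(lapply(counts, names)))

  # Feature matrix: one row per sequence, one column per kmer
  m <- matrix(0, nrow = n, ncol = length(vocab), dimnames = list(NULL, vocab))
  for (i in seq_len(n)) {
    m[i, names(counts[[i]])] <- as.numeric(counts[[i]])
  }

  # Pairwise cosine similarity, then the hard cutoff
  norms <- sqrt(rowSums(m^2))
  sim <- tcrossprod(m) / outer(norms, norms)
  sim[sim < lowerLimit] <- 0

  # Label connected components with a breadth-first search
  group <- integer(n)
  current <- 0L
  for (i in seq_len(n)) {
    if (group[i] != 0L) next
    current <- current + 1L
    group[i] <- current
    queue <- i
    while (length(queue) > 0) {
      node <- queue[1]
      queue <- queue[-1]
      neighbours <- which(sim[node, ] > 0 & group == 0L)
      group[neighbours] <- current
      queue <- c(queue, neighbours)
    }
  }
  group
}

# Toy sequences: the first two share 7 of their 8 2-mers, the third shares none
seqs <- c("MKVLAAGIV", "MKVLAAGIL", "WWPPQQRRT")
splitBySimilarity(seqs, k = 2, lowerLimit = 0.8)  # 1 1 2
splitBySimilarity(seqs, k = 2, lowerLimit = 0.9)  # 1 2 3
```

Raising the cutoff from 0.8 to 0.9 removes the only edge (cosine similarity 0.875), so every sequence ends up in its own group.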

Usage

kmerSplit(object, ...)

## S4 method for signature 'pgVirtual'
kmerSplit(object, kmerSize, lowerLimit, maxLengthDif,
  pParam)

Arguments

object

A pgVirtual subclass

...

Arguments passed on

kmerSize

The length of kmers used for sequence similarity

lowerLimit

The similarity cutoff. Sequence similarities below this value are set to 0

maxLengthDif

The maximum deviation in sequence length to allow. Values between 0 and 1 are interpreted as a proportion of the sequence length; values above 1 are interpreted as a fixed length

pParam

An optional BiocParallelParam object that defines the workers used for parallelisation.
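
The two interpretations of maxLengthDif can be sketched as follows. The helper lengthCompatible is hypothetical and not part of the package; in particular, measuring the proportion against the longer of the two sequences, and treating a value of exactly 1 as a fixed length, are assumptions that the text above does not settle.

```r
# Hypothetical helper, not part of FindMyFriends
lengthCompatible <- function(len1, len2, maxLengthDif) {
  dif <- abs(len1 - len2)
  if (maxLengthDif < 1) {
    # Assumption: the proportion is taken relative to the longer sequence
    dif / pmax(len1, len2) <= maxLengthDif
  } else {
    # Otherwise the value is a fixed length difference
    dif <= maxLengthDif
  }
}

lengthCompatible(100, 92, 0.1)  # TRUE: the lengths differ by 8%
lengthCompatible(100, 80, 0.1)  # FALSE: the lengths differ by 20%
lengthCompatible(100, 80, 25)   # TRUE: the lengths differ by 20, at most 25
```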

Value

A new pgVirtual subclass object of the same class as 'object'

Methods (by class)

See Also

Other group-splitting: neighborhoodSplit

Examples

# Get a grouped pangenome
pg <- .loadPgExample(withGroups = TRUE)

## Not run: 
# Split groups by similarity (Too heavy to include)
pg <- kmerSplit(pg, lowerLimit = 0.8)

## End(Not run)

thomasp85/FindMyFriends documentation built on April 25, 2020, 1:06 p.m.