knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.width=7,
  fig.height=6
)

transcluster implements new methods for the partition of data obtained from whole genome sequencing (WGS) studies into transmission clusters. Inputs include:

Functionality

Some available functions in transcluster are:

withinCluster: test whether two cases are within the same transmission cluster

createModel: create model to be used for cluster analysis

makeSNPClusters: make SNP-based clusters for each threshold set in the model

makeTransClusters: make transmission-based clusters for each threshold set in the model

setDatesFromFile: set sample dates from a file

setRegionsFromFile: set region data from a file

setParams: set parameters for existing cluster analysis model

setSNPFromFile: set SNP distance matrix from a file

setSNPThresholds: set SNP thresholds for existing cluster analysis model

setTransMatrixFromFile: set transmission distance matrix from a file

setTransThresholds: set transmission thresholds for existing cluster analysis model

setSNPThresholds: set SNP thresholds for existing cluster analysis model

Plotting functions:

plotSNPClusters: plot of SNP clusters with weighted edges

plotTransClusters: plot of transmission clusters with weighted edges

plotTransClustersSpatial: plot of transmission clusters using region data

A simple example

Start by loading the libraries:

library(transcluster)
library(stats)
library(clue)
library(igraph)

Then create some data. This is the same as the first example in the paper.

ids <- c('A', 'B', 'C', 'D')
dates <- c(2018, 2014, 2016, 2016)
snpMatrix <- matrix(c(0,5,7,7,5,0,8,8,7,8,0,6,7,8,6,0), nrow=4, ncol=4)

Set up the model:

myModel <- createModel(ids, dates, snpMatrix)
myModel <- setParams(myModel, lambda=1.5, beta=2.3)
myModel <- setSNPThresholds(myModel, c(5,6,7))
myModel <- setTransThresholds(myModel, c(7,8,9))

Create the clusters:

mySNPClusters <- makeSNPClusters(myModel, 'test')
myTransClusters <- makeTransClusters(myModel, 'test')

Examine the results using igraph:

library(transcluster)
myModel <- setCutoffs(myModel)
plotTransClusters(myModel, vSize=20, vFontSize=2, level=7, thick=3, labelOffset=0)
plotTransClusters(myModel, vSize=20, vFontSize=2, level=8, thick=3, labelOffset=0)
plotTransClusters(myModel, vSize=20, vFontSize=2, level=9, thick=3, labelOffset=0)

And compare these to the clusters obtained with SNP-based clustering:

plotSNPClusters(myModel, vColor='orange', vSize=20, vFontSize=2, level=5, thick=3, labelOffset=0)
plotSNPClusters(myModel, vColor='orange', vSize=20, vFontSize=2, level=6, thick=3, labelOffset=0)
plotSNPClusters(myModel, vColor='orange', vSize=20, vFontSize=2, level=7, thick=3, labelOffset=0)

The results also appear in .csv files saved locally.

More examples

Test directly whether two cases are in the same cluster:

withinCluster(N = 3, k_cutoff = 5, t1 = 2012.13, t2 = 2015.76, lambda = 0.7, beta = 1.3, perc_cutoff = 0.2)

British Columbia example

This time we load sample date and SNP distance data from file.

bcModel <- createModel()
fileName <- system.file("extdata", "mclust12_dates_R.csv", package = "transcluster", mustWork = TRUE)
bcModel <- setDatesFromFile(bcModel, fileName)
fileName <- system.file("extdata", "bc_snp_matrix_R.csv", package = "transcluster", mustWork = TRUE)
bcModel <- setSNPFromFile(bcModel, fileName)

And we set up the other parameters.

bcModel <- setParams(bcModel, lambda=1.5, beta=2.0)
bcModel <- setCutoffs(bcModel)   #takes a few seconds
bcModel <- setSNPThresholds(bcModel, c(4))
bcModel <- setTransThresholds(bcModel, c(3))

Create and plot.

bcSNPClusters <- makeSNPClusters(bcModel, nameBase='test', writeFile=F)
bcTClusters <- makeTransClusters(bcModel, nameBase='test', writeFile=F)

plotSNPClusters(bcModel, level=4, vColor='orange',vSize=4, thick = 1.25)
plotTransClusters(bcModel, level=3, vColor='lightblue',vSize=4, thick = 2, vFontSize=1)

Add region data to inform the clustering.

regFileName <- system.file("extdata", "example_region_data.csv", package = "transcluster", mustWork = TRUE)
bcModel <- setRegionsFromFile(bcModel, regFileName)
bcModel <- setCutoffsSpatial(bcModel)
plotTransClustersSpatial(bcModel, level=3, vSize=4, thick=2, vFontSize=1)


JamesStimson/transcluster documentation built on May 24, 2020, 10:39 p.m.