ddPCRclust: ddPCRclust A package for automated quantification of...
In bgbrink/ddPCRclust: Clustering algorithm for ddPCR data


View source: R/ddPCRclust.R

The ddPCRclust algorithm can automatically quantify the events of a ddPCR reaction with up to four markers. To determine the correct droplet count for each marker, it is crucial both to identify all clusters and to label them correctly based on their position. For more information on what data can be analyzed and how a template needs to be formatted, please check the project repository on GitHub.

This is the main function of this package. It automatically runs the ddPCRclust algorithm on one or multiple csv files containing the raw data from a ddPCR run with up to 4 markers.

ddPCRclust(files, template, numOfMarkers = 4, sensitivity = 1,
  similarityParam = 0.95, distanceParam = 0.2, fast = FALSE,
  multithread = FALSE)

`files`	The input data obtained from the csv files. For more information, please see `readFiles`.
`template`	A data frame containing information about the individual ddPCR runs. An example template is provided with this package. For more information, please see `readTemplate`.
`numOfMarkers`	The number of primary clusters that are expected according to the experimental setup. Can be ignored if a template is provided. Otherwise, a vector with length equal to `length(files)` should be provided, containing the number of markers used for the respective reaction.
`sensitivity`	A number between 0.1 and 2 that determines the sensitivity of the initial clustering, i.e. the number of clusters. A higher value divides the data into more clusters; a lower value merges more clusters. This allows fine-tuning of the algorithm for exceptionally low or high CPDs.
`similarityParam`	If a droplet's distances to two or more clusters are very similar, it will not be counted for any of them. The default is 0.95, i.e. at least 95% similarity. A sensible value lies between 0 and 1, where 0 means none of the 'rain' droplets will be counted and 1 means all droplets will be counted.
`distanceParam`	When assigning rain between two clusters, the bottom 20% are assigned to the lower cluster and the remaining 80% to the higher cluster by default. This parameter changes the ratio, i.e. a value of 0.1 would assign only 10% to the lower cluster.
`fast`	Run a simpler version of the algorithm that is about 10x faster. For clean data, this might already deliver very good results. However, it is mostly intended to give a quick overview of the data.
`multithread`	Distribute the algorithm amongst all CPU cores to speed up the computation.
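
The arguments above can be combined in a single call. The following is a minimal sketch that reuses the example files shipped with the package; the parameter values are arbitrary illustrations chosen from the documented ranges, not recommendations, and the variable names `quick` and `tuned` are hypothetical.

```r
library(ddPCRclust)

# Locate the example data shipped with the package
exampleFiles <- list.files(paste0(find.package('ddPCRclust'), '/extdata'), full.names = TRUE)
files <- readFiles(exampleFiles[3])
template <- readTemplate(exampleFiles[9])

# Quick overview using the simpler, faster version of the algorithm
quick <- ddPCRclust(files, template, fast = TRUE)

# Full algorithm with non-default settings (illustrative values only):
# a higher sensitivity splits the data into more initial clusters, and
# distanceParam = 0.1 assigns only 10% of the rain to the lower cluster
tuned <- ddPCRclust(files, template, sensitivity = 1.5,
  similarityParam = 0.9, distanceParam = 0.1)
```

Comparing the output of the `fast` run with the full run on the same file is one way to judge whether the quicker variant is adequate for a given data set.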

results

The results of the ddPCRclust algorithm. It contains three fields:
data The original input data minus the removed events (for plotting)
confidence The agreement between the different clustering results in percent. If all parts of the algorithm calculated the same result, the clustering is likely to be correct, and the confidence is therefore high
counts The droplet count for each cluster
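
As a brief sketch of how these three fields might be inspected, the snippet below runs the algorithm on one example file and looks at the entry for reaction `B01`, the name used in the Examples section. The variable name `well` is hypothetical, and the exact shape of each field may differ between package versions.

```r
library(ddPCRclust)

# Run the algorithm on one of the example files shipped with the package
exampleFiles <- list.files(paste0(find.package('ddPCRclust'), '/extdata'), full.names = TRUE)
files <- readFiles(exampleFiles[3])
template <- readTemplate(exampleFiles[9])
result <- ddPCRclust(files, template)

# The result holds one entry per reaction
names(result)

# Inspect the three documented fields for reaction B01
well <- result$B01
head(well$data)    # input data minus the removed events
well$confidence    # agreement between the clustering results
well$counts        # droplet count for each cluster
```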

The main function of the package is ddPCRclust. This function runs the algorithm on one or multiple files and, when `multithread = TRUE`, distributes them amongst all CPU cores using the parallel package (parallelization does not work on Windows). Afterwards, the results can be exported in different ways, using exportPlots, exportToExcel and exportToCSV. Once the clustering is finished, copies per droplet (CPD) for each marker can be calculated using calculateCPDs.
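
The workflow described above might look like the following sketch. The argument order of `calculateCPDs` and the export functions (results, output directory, template) is an assumption that is not stated on this page, so please check `?calculateCPDs`, `?exportPlots`, `?exportToExcel` and `?exportToCSV` before relying on it. The output directory `outDir` is a hypothetical temporary location.

```r
library(ddPCRclust)

# Cluster one of the example files shipped with the package
exampleFiles <- list.files(paste0(find.package('ddPCRclust'), '/extdata'), full.names = TRUE)
files <- readFiles(exampleFiles[3])
template <- readTemplate(exampleFiles[9])
result <- ddPCRclust(files, template)

# Calculate the copies per droplet (CPD) for each marker
# (assumed signature: results first, then the template)
markerCPDs <- calculateCPDs(result, template)

# Create an output directory; a temporary folder is used here for illustration
outDir <- paste0(file.path(tempdir(), 'ddPCRclust_results'), '/')
dir.create(outDir, showWarnings = FALSE)

# Export plots and tables (assumed signature: data, directory, template)
exportPlots(result, outDir, template)
exportToExcel(markerCPDs, outDir, template)
exportToCSV(markerCPDs, outDir, template)

# List whatever was written
list.files(outDir)
```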

These functions provide access to the full functionality of the ddPCRclust package. However, expert users can call some internal functions of the algorithm directly if they find it necessary. Here is a list of all available supplemental functions:
runDensity
runSam
runPeaks
createEnsemble
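
For orientation, a direct call to these supplemental functions might look like the sketch below. The argument names (`file`, `numOfMarkers`) and the order of arguments to `createEnsemble` are assumptions not documented on this page; consult the help page of each function (for example `?runDensity`) for the authoritative signatures.

```r
library(ddPCRclust)

# Read one raw example file directly as a data frame
exampleFiles <- list.files(paste0(find.package('ddPCRclust'), '/extdata'), full.names = TRUE)
file <- read.csv(exampleFiles[3])

# Run each clustering approach on its own
# (assumed arguments: the raw data and the number of markers)
densResult <- runDensity(file = file, numOfMarkers = 4)
samResult <- runSam(file = file, numOfMarkers = 4)
peaksResult <- runPeaks(file = file, numOfMarkers = 4)

# Combine the individual results into one ensemble result
# (assumed argument order: the three results, then the raw data)
ensembleResult <- createEnsemble(densResult, samResult, peaksResult, file)
```

In most cases the main `ddPCRclust` function is the simpler choice, since it handles reading, clustering and combining in one call.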

Maintainer: Benedikt G. Brink bbrink@cebitec.uni-bielefeld.de

Other contributors:

Justin Meskas jmeskas@bccrc.ca [contributor]
Ryan R. Brinkman rbrinkman@bccrc.ca [contributor]

Useful links:

https://github.com/bgbrink/ddPCRclust
Report bugs at https://github.com/bgbrink/ddPCRclust/issues

library(ddPCRclust)

# Read files
exampleFiles <- list.files(paste0(find.package('ddPCRclust'), '/extdata'), full.names = TRUE)
files <- readFiles(exampleFiles[3])
# To read all example files uncomment the following line
# files <- readFiles(exampleFiles[1:8])

# Read template
template <- readTemplate(exampleFiles[9])

# Run ddPCRclust
result <- ddPCRclust(files, template)

# Plot the results
library(ggplot2)
p <- ggplot(data = result$B01$data, mapping = aes(x = Ch2.Amplitude, y = Ch1.Amplitude))
p <- p + geom_point(aes(color = factor(Cluster)), size = .5, na.rm = TRUE) +
  ggtitle('B01 example') + theme_bw() + theme(legend.position = 'none')
p