Clone / sub-clone decomposition of DNA sequencing data. The function is best used with more than one sample, preferably collected from
the same individual at different times. If sample quality varies, scaling with seqn.scale first is recommended.
data | A
sample |
vaf |
allele.comp |
n.clone | Optional
n.subclone | Optional
optimization.method | Method to find the optimal number of clusters: GMM or bootstrap. Default is GMM.
clustering.method | Clustering method: HKM, bootkm or hybrid. Default is HKM.
clonality | Method for determining the clonality of the predicted clusters: allelic composition (default) or density.
instruct |
cluster.doc
does two things: first, it determines the optimum number of clusters to fit; second, it infers which groups
the resulting clusters should be assigned to.
The data inputs requested interactively from the user supply the following information:
chromosomal segmentation
helps determine the number of clone/sub-clone clouds to expect in the data. Variant alleles from different
aberrant chromosomes may have similar relative frequencies but discordant clonal interpretations.
Conversely, convergent clonal alleles may show divergent frequencies if they arise from dissimilar aneuploidy.
clouds
give the program visual feedback from the user, which is assumed to carry some biological interpretation of the
frequency distributions present in the data. This is a subjective estimate that the program later uses for cluster assignment.
Of the methods used for cluster optimization, GMM stands for Gaussian Mixture Models, whereas bootstrap,
as the name suggests, performs bootstrap resampling of the VAFs in 50 repetitions with 20 runs each to find the most stable parameter
for clustering. GMM outputs the optimization curve with BIC and AIC on the Y-axis against the number of clusters on the X-axis,
whereas bootstrap shows the Smin statistic on the Y-axis instead. gap calculates the gap statistic for each clustering.
In all cases the statistics are to be interpreted as proxies for the entropy of the
system. The maximum entropy is likely to indicate the most stable solution.
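The idea of scoring each candidate number of clusters with BIC and AIC can be sketched in base R as follows. This is only an illustration under stated assumptions, not the code cluster.doc runs: the helper fit_gmm_1d is hypothetical, the VAFs are simulated, the mixture is one-dimensional with unequal variances, and the criteria use the textbook sign convention in which lower values are better (the curve plotted by the package may be oriented differently).

```r
# Hypothetical helper: fit a 1-D Gaussian mixture with k components by EM
# and return the log-likelihood reached after n_iter iterations.
fit_gmm_1d <- function(x, k, n_iter = 200) {
  n     <- length(x)
  mu    <- as.numeric(quantile(x, probs = (seq_len(k) - 0.5) / k))  # spread-out start
  sigma <- rep(sd(x), k)
  w     <- rep(1 / k, k)
  weighted_density <- function() {
    matrix(sapply(seq_len(k), function(j) w[j] * dnorm(x, mu[j], sigma[j])), nrow = n)
  }
  for (iter in seq_len(n_iter)) {
    dens  <- weighted_density()
    resp  <- dens / rowSums(dens)                 # E-step: responsibilities
    nk    <- colSums(resp)                        # M-step: update parameters
    w     <- nk / n
    mu    <- colSums(resp * x) / nk
    sigma <- sqrt(colSums(resp * (x - rep(mu, each = n))^2) / nk)
    sigma <- pmax(sigma, 1e-3)                    # guard against collapsing components
  }
  sum(log(rowSums(weighted_density())))
}

set.seed(1)
# Simulated VAFs from three well-separated clusters (illustrative data only)
vaf <- c(rnorm(100, 0.10, 0.02), rnorm(100, 0.30, 0.02), rnorm(100, 0.50, 0.02))

ks      <- 1:4
loglik  <- sapply(ks, function(k) fit_gmm_1d(vaf, k))
n_param <- 3 * ks - 1                  # k means + k variances + (k - 1) weights
bic     <- -2 * loglik + n_param * log(length(vaf))
aic     <- -2 * loglik + 2 * n_param

# Optimization curve: criterion (Y-axis) against number of clusters (X-axis)
print(data.frame(clusters = ks, BIC = bic, AIC = aic))
```

Plotting bic and aic against ks reproduces the kind of optimization curve described above; the candidate with the best criterion value is the one a user would pass on as the number of clusters.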
clustering.method
gives the user three choices:
HKM
is hierarchical K-means clustering, which first uses hierarchical clustering to determine the cluster centers
that are then used as the starting points for K-means clustering.
bootkm
performs bootstrap resampling of 20 fitted K-means clusterings with 50 resamplings to output the clusters.
hybrid
performs HKM on the principal component of the data.
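The HKM and hybrid strategies described above can be sketched with base R functions (hclust, cutree, kmeans, prcomp). This is a minimal sketch, not the package's implementation: the helper name hkm, the Ward linkage, the simulated two-sample VAF matrix and the choice of the first principal component are all assumptions made for illustration.

```r
# Hypothetical helper: hierarchical K-means.
# Hierarchical clustering supplies the initial centers; K-means refines them.
hkm <- function(x, k) {
  x        <- as.matrix(x)
  labels0  <- cutree(hclust(dist(x), method = "ward.D2"), k = k)
  centers0 <- rowsum(x, labels0) / as.vector(table(labels0))  # per-cluster means
  kmeans(x, centers = centers0)
}

set.seed(1)
# Simulated VAFs for two samples (columns) and two variant clusters (illustrative only)
vaf <- cbind(s1 = c(rnorm(50, 0.10, 0.02), rnorm(50, 0.45, 0.02)),
             s2 = c(rnorm(50, 0.40, 0.02), rnorm(50, 0.15, 0.02)))

fit_hkm <- hkm(vaf, k = 2)

# "hybrid" as described: the same procedure applied to a principal component
pc1        <- prcomp(vaf)$x[, 1, drop = FALSE]
fit_hybrid <- hkm(pc1, k = 2)

table(fit_hkm$cluster, fit_hybrid$cluster)
```

Seeding K-means from the hierarchical solution makes the result deterministic for a given data set, which avoids the run-to-run variation of random starts.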
clonality
provides two choices for clonality assignment. The default, allelic composition, measures expected
clonality patterns according to the copy numbers. When the allelic composition estimates are unreliable, however, this method may fail. In such
situations clonality can be assigned without a priori assumptions using the alternative density-based method.
A list of 12 objects is returned that includes all the summary statistics, diagnostics and predictions, as well as the mapping used internally for clonal deconvolution.
predicted.data
is an extension of the input data
with the predicted clone and sub-clone
status of each variant added for the corresponding samples.
density.map
is a distance matrix convoluted from cluster distances and density departures.
collapse
are clusters that were initially predicted but later collapsed onto each other due to their similarity.
fitted.hkm, fitted.bootkm or fitted.hybrid
is a vector of initial cluster assignments from the chosen algorithm.
Only one of these will hold output; the rest will show NA.
Number of unscaled clusters
gives the number of predicted clusters before collapsing with density estimates.
Number of scaled clusters
gives the number of predicted clusters after collapsing (if any).
cluster.diagnostics
if the optimization method was GMM, this is an S3 class object
that includes clustering diagnostics from the model-based clustering. If the chosen method was bootstrap, this is a list.
cluster centers
are the centroids of the predicted scaled clusters.
cluster mapping
provides the map between scaled clusters and the clonal deconvolution assignments.
Dunn index
is the Dunn index for the fitted cluster.
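For reference, the Dunn index is conventionally the smallest distance between points in different clusters divided by the largest distance between points in the same cluster, so larger values indicate compact, well-separated clusters. The base R sketch below uses that common variant; the helper name dunn_index is hypothetical, and the package may use a different definition of separation or diameter.

```r
# Hypothetical helper: Dunn index from pairwise Euclidean distances.
dunn_index <- function(x, labels) {
  d    <- as.matrix(dist(x))
  same <- outer(labels, labels, "==")   # TRUE where both points share a cluster
  min(d[!same]) / max(d[same])          # separation / diameter
}

# Toy example: two tight, distant clusters on a line
dunn_index(c(0, 1, 10, 11), c(1, 1, 2, 2))  # 9
```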
#cluster.doc(test.dat, 1, 2, optimization.method = 'GMM', clustering.method = 'HKM')