One step analysis function

Description

Sequentially tests all accessible cellularities for all mutations in the samples, clusters them, and finally draws phylogenetic trees based on the uncovered cellularities.

Usage

QuantumClone(SNV_list, FREEC_list = NULL, contamination, nclone_range = 2:5,
  clone_priors = NULL, prior_weight = NULL, simulated = FALSE,
  save_plot = TRUE, epsilon = 5 * (10^(-3)), maxit = 2,
  preclustering = TRUE, timepoints = NULL, ncores = 1,
  output_directory = NULL, model.selection = "BIC", optim = "default",
  keep.all.models = FALSE, force.single.copy = FALSE)

Arguments

SNV_list

A list of data frames (one per sample) with the following columns: the name of the sample (required as the first column of the first sample), the chromosome "Chr", the position of the mutation "Start", the number of reads supporting the mutation "Alt", the depth of coverage at this locus "Depth", and, if no FREEC output is supplied for the samples, the genotype "Genotype".
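As a rough illustration of the layout described above, the following sketch builds a two-sample SNV_list by hand. The helper make_sample, the sample names, positions, and read counts are made-up values for illustration only. The genotype strings ("A", "AB") and the "Sample" column name are assumptions based on the rest of this page and should be checked against QuantumClone::Input_Example.

```r
# Hypothetical helper (not part of QuantumClone): build one sample's data frame
# with the columns listed in the SNV_list description.
make_sample <- function(name, alt, depth) {
  data.frame(
    Sample   = rep(name, length(alt)),      # sample name, first column
    Chr      = c("chr1", "chr2", "chr3"),   # chromosome
    Start    = c(10500L, 20750L, 31000L),   # position of the mutation
    Alt      = alt,                         # reads supporting the mutation
    Depth    = depth,                       # depth of coverage at the locus
    Genotype = c("AB", "AB", "A"),          # assumed encoding, needed without FREEC data
    stringsAsFactors = FALSE
  )
}

# One data frame per sample, in the same order as any FREEC_list.
SNV_list <- list(
  make_sample("Sample_1", alt = c(12L, 30L, 8L),  depth = c(100L, 90L, 110L)),
  make_sample("Sample_2", alt = c(20L, 5L, 15L),  depth = c(95L, 105L, 100L))
)

str(SNV_list[[1]])
```

Every sample lists the same mutations in the same order here, which is an assumption of this sketch rather than a documented requirement.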

FREEC_list

List of data frames from FREEC, one per sample (usually named Sample_ratio.txt), in the same order as SNV_list

contamination

Numeric vector describing the contamination in each sample (ranging from 0 to 1). Default is 0.

nclone_range

A number or range of clusters that should be used for clustering

clone_priors

List of vectors with the putative positions of clones

prior_weight

Numeric with the proportion of mutations in each clone

simulated

Should be TRUE if the data have been generated by the QuantumCat algorithm

save_plot

Should the plots be saved? Default is TRUE

epsilon

Stop value: maximal admitted value of the difference in cluster position and weights between two optimization steps.
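One way to read this stopping rule is sketched below. It is an illustrative interpretation, not the package's internal code: the function has_converged and its argument names are hypothetical, and it assumes that the largest absolute change across cluster positions and weights is compared with epsilon.

```r
# Illustrative convergence check (hypothetical, not QuantumClone internals):
# stop when neither cluster positions nor weights moved by more than epsilon.
has_converged <- function(old_pos, new_pos, old_w, new_w, epsilon = 5 * (10^(-3))) {
  max_shift <- max(abs(new_pos - old_pos), abs(new_w - old_w))
  max_shift <= epsilon
}

# Small update: largest change is about 0.002, below the default epsilon.
small_step <- has_converged(c(0.20, 0.60), c(0.201, 0.602),
                            c(0.5, 0.5),   c(0.499, 0.501))

# Large update: one cluster position moved by 0.05, so optimization continues.
large_step <- has_converged(c(0.20, 0.60), c(0.25, 0.60),
                            c(0.5, 0.5),   c(0.5, 0.5))

print(c(small_step = small_step, large_step = large_step))
```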

maxit

Number of initial conditions to be tested for the EM algorithm

preclustering

Boolean: should k-means clustering be performed on A and AB sites to determine priors?

timepoints

A numeric vector indicating whether the samples are from different timepoints or tumors (e.g. one tumor and metastases). If NULL, all samples are considered to be from the same tumor.

ncores

Number of cores to be used during EM algorithm

output_directory

Path to output directory

model.selection

The criterion to minimize for model selection: can be "AIC", "BIC", or a numeric value. If numeric, the function uses a variant of the BIC in which the k*ln(n) penalty is multiplied by that value; a value >1 favors models with lower complexity.
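The three options can be compared with a small sketch. It assumes the textbook definitions AIC = -2 log L + 2k and BIC = -2 log L + k ln(n), with a numeric value scaling the k ln(n) term. The function selection_score, the example log-likelihoods, and the meaning of k (number of free parameters) and n (number of observations) are assumptions for illustration, not the package's actual implementation.

```r
# Hypothetical scoring function illustrating the model.selection options.
# log_lik: log-likelihood of the fitted model
# k:       number of free parameters (assumed meaning)
# n:       number of observations (assumed meaning)
selection_score <- function(log_lik, k, n, model.selection = "BIC") {
  if (is.numeric(model.selection)) {
    penalty <- model.selection * k * log(n)   # scaled BIC penalty
  } else if (model.selection == "BIC") {
    penalty <- k * log(n)
  } else if (model.selection == "AIC") {
    penalty <- 2 * k
  } else {
    stop("model.selection must be \"AIC\", \"BIC\", or numeric")
  }
  -2 * log_lik + penalty
}

# Made-up fits: a simpler model and a more complex one with a better likelihood.
simple_fit  <- list(log_lik = -520, k = 5)
complex_fit <- list(log_lik = -515, k = 8)
n_obs <- 200

for (crit in list("AIC", "BIC", 2)) {
  s <- selection_score(simple_fit$log_lik,  simple_fit$k,  n_obs, crit)
  c2 <- selection_score(complex_fit$log_lik, complex_fit$k, n_obs, crit)
  cat("criterion:", format(crit), " simple:", round(s, 2), " complex:", round(c2, 2), "\n")
}
```

With these made-up numbers the AIC prefers the more complex model while the BIC and the doubled BIC penalty prefer the simpler one, which is the behavior the description attributes to values >1.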

optim

Use L-BFGS-B optimization from R ("default"), or from optimx ("optimx")

keep.all.models

Should the function output only the best model (default; FALSE), or all models tested (TRUE)?

force.single.copy

Should all mutations in overdiploid regions be set to single copy? Default is FALSE

Examples

# Load the example data shipped with the package and prepend a "Sample" column
Mutations <- QuantumClone::Input_Example
for (i in 1:2) {
  Mutations[[i]] <- cbind(rep(paste("Example_", i, sep = ""),
                              times = nrow(Mutations[[i]])),
                          Mutations[[i]])
  colnames(Mutations[[i]])[1] <- "Sample"
}
print("The data should look like this:")
print(head(Mutations[[1]]))

cat("Cluster data: will try to cluster between 3 and 4 clones, with 1 maximum search each time,
      and will use priors from preclustering (e.g. k-medoids on A and AB sites)")
print("The genotype is provided in the list frame, and
          there is no associated data from FREEC to get genotype from.")
print("The computation will run on a single CPU.")

# Run the full analysis on two uncontaminated samples, testing 3 and 4 clones
Clustering_output <- QuantumClone(SNV_list = Mutations,
                                  FREEC_list = NULL, contamination = c(0, 0),
                                  nclone_range = 3:4,
                                  clone_priors = NULL, prior_weight = NULL,
                                  maxit = 1, preclustering = TRUE, simulated = TRUE,
                                  save_plot = TRUE, ncores = 1,
                                  output_directory = "Example")
print("The data can be accessed by Clustering_output$filtered_data")
print("All plots are now saved in the working directory")
