One_step_clustering: Cellularity clustering


View source: R/QuantumClone.R

Description

Wrapper function that clusters cellularities. The clustering is based on the most likely possibility for each mutation, given its frequency and genotype.

Usage

One_step_clustering(
  SNV_list,
  FREEC_list = NULL,
  contamination,
  nclone_range = 2:5,
  clone_priors = NULL,
  prior_weight = NULL,
  Initializations = 1,
  preclustering = "FLASH",
  simulated = FALSE,
  epsilon = NULL,
  save_plot = TRUE,
  ncores = 1,
  restrict.to.AB = FALSE,
  output_directory = NULL,
  model.selection = "BIC",
  optim = "default",
  keep.all.models = FALSE,
  force.single.copy = FALSE
)

Arguments

SNV_list

A list of data frames (one per sample) with the following columns: the sample name (as the first column of the first sample), the chromosome "Chr", the position of the mutation "Start", the number of reads supporting the variant "Alt", and the total number of reads overlapping the position "Depth". If no FREEC output is associated with the samples, the genotype "Genotype" is also required.

FREEC_list

List of data frames from FREEC, one per sample (usually named Sample_ratio.txt), in the same order as SNV_list

contamination

Numeric vector describing the contamination in all samples (ranging from 0 to 1). Default is 0. No longer used for clustering.

nclone_range

A number or range of clusters that should be used for clustering

clone_priors

List of vectors with the putative positions of clones

prior_weight

Numeric with the proportion of mutations in each clone

Initializations

Number of initial conditions to be tested for the EM algorithm

preclustering

The type of preclustering used for priors: "FLASH", "kmedoid" or NULL. NULL will generate centers using a uniform distribution. WARNING: overrides any priors given

simulated

Should be TRUE if the data has been generated by the QuantumCat algorithm

epsilon

Stop value: maximal admitted difference in cluster positions and weights between two optimization steps. If NULL, 1/(average depth) is used

save_plot

Should the plots be saved? Default is TRUE

ncores

Number of cores to be used during EM algorithm

restrict.to.AB

Boolean: should the analysis keep only mutations located at A and AB sites in all samples?

output_directory

Directory in which to save results

model.selection

The criterion to minimize for model selection: "AIC", "BIC", or a numeric value. If numeric, the function uses a variant of the BIC in which the k*ln(n) factor is multiplied by that value. If > 1, it will select models with lower complexity.

optim

Optimizer to use: L-BFGS-B optimization from R ("default"), from optimx ("optimx"), or Differential Evolution ("DEoptim")

keep.all.models

Should the function output only the best model (default; FALSE), or all models tested (if set to TRUE)?

force.single.copy

Should all mutations in overdiploid regions be set to single copy? Default is FALSE
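
To illustrate the numeric form of model.selection described above, here is a minimal sketch of a BIC-style criterion whose k*ln(n) penalty is multiplied by a value q. The helper scaled_bic, the log-likelihoods and the parameter counts are all made up for illustration; this is not the package's internal code, only a reading of the argument description.

```r
# Hypothetical helper (not part of QuantumClone): BIC-style criterion in
# which the complexity penalty k * ln(n) is scaled by a multiplier q.
scaled_bic <- function(loglik, k, n, q = 1) {
  -2 * loglik + q * k * log(n)
}

# Two made-up fits of the same 100 mutations
loglik <- c(three_clones = -250, four_clones = -243)
k      <- c(three_clones = 6,    four_clones = 8)

# Model retained for the plain BIC (q = 1) and for a heavier penalty (q = 3)
best_q1 <- names(which.min(scaled_bic(loglik, k, n = 100, q = 1)))
best_q3 <- names(which.min(scaled_bic(loglik, k, n = 100, q = 3)))
print(c(q1 = best_q1, q3 = best_q3))
```

With these made-up numbers the plain BIC retains the four-clone fit, while q = 3 retains the three-clone fit, which matches the statement that values above 1 favor lower complexity.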

Value

list of lists

Top level

List of all possibilities

dataframe

Table of hierarchical relations

numeric

probability of this tree

Examples

# Load the example data shipped with the package and add the sample name
# as the first column of each data frame.
Mutations <- QuantumClone::Input_Example
for (i in 1:2) {
  Mutations[[i]] <- cbind(
    rep(paste("Example_", i, sep = ""), times = nrow(Mutations[[i]])),
    Mutations[[i]]
  )
  colnames(Mutations[[i]])[1] <- "Sample"
}
print("The data should look like this:")
print(head(Mutations[[1]]))

cat("Cluster data: will try to cluster with 3 and 4 clones, with 1 initialization each time,
      and will use priors from FLASH preclustering.\n")
print("The genotype is provided in the data frames, and
          there is no associated data from FREEC to get the genotype from.")
print("The computation will run on a single CPU.")
Clustering_output <- QuantumClone::One_step_clustering(
  SNV_list = Mutations,
  FREEC_list = NULL, contamination = c(0, 0), nclone_range = 3:4,
  clone_priors = NULL, prior_weight = NULL,
  Initializations = 1, preclustering = "FLASH", simulated = TRUE,
  save_plot = FALSE, ncores = 1, output_directory = NULL
)
print("The data can be accessed by Clustering_output$filtered_data")
print("No plots are saved because save_plot = FALSE")
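
When starting from your own variant calls instead of Input_Example, the SNV_list argument can be assembled by hand. The sketch below builds two data frames with the columns listed under SNV_list. The helper make_sample, the sample names, positions and read counts are all made up. It also computes a value in the spirit of the default epsilon, assuming that "average depth" means the mean depth pooled over all samples; the package may average differently.

```r
# Hypothetical helper (not part of QuantumClone): one data frame per sample
# with the columns described for SNV_list. All values are made up.
make_sample <- function(name, alt, depth) {
  data.frame(
    Sample   = name,
    Chr      = c(1, 1, 2),
    Start    = c(10500, 48200, 7300),
    Alt      = alt,
    Depth    = depth,
    Genotype = c("AB", "AB", "A"),
    stringsAsFactors = FALSE
  )
}

SNV_list <- list(
  make_sample("S1", alt = c(20, 45, 30), depth = c(100, 90, 110)),
  make_sample("S2", alt = c(35, 10, 55), depth = c(120, 80, 100))
)

# Assumed reading of "1/(average depth)": mean depth pooled over all samples
all_depths <- unlist(lapply(SNV_list, function(df) df$Depth))
epsilon <- 1 / mean(all_depths)
print(epsilon)
```

The resulting list has the same shape as the Mutations object in the example above and could be passed as SNV_list, with FREEC_list = NULL since the genotype is given in each data frame.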

DeveauP/QuantumClone documentation built on Oct. 29, 2021, 8:56 a.m.