View source: R/SignatureExtractionLib.R
Perform signature extraction by applying NMF to the input matrix. Multiple NMF runs and bootstrapping are used for robustness, followed by clustering of the solutions. A range of numbers of signatures to try is required.
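As an illustration of the bootstrapping step mentioned above, the sketch below resamples each sample (column) of a count catalogue from a multinomial distribution with the same total. This is one common way to bootstrap a mutational catalogue and is an assumption here; the package's own resampling may differ. The name bootstrap_catalogue is hypothetical and is not part of the package.

```r
# Hypothetical helper, not part of the package: multinomial bootstrap of a
# count catalogue (channels as rows, samples as columns).
bootstrap_catalogue <- function(catalogue) {
  boot <- apply(catalogue, 2, function(counts) {
    total <- sum(counts)
    # An empty sample cannot be resampled, so return it unchanged
    if (total == 0) return(as.vector(counts))
    # Draw `total` mutations with probabilities given by the observed profile
    as.vector(rmultinom(1, size = total, prob = counts / total))
  })
  # Restore the original row and column names
  dimnames(boot) <- dimnames(catalogue)
  boot
}

set.seed(1)
toy <- matrix(c(5, 0, 3, 2, 10, 10, 0, 0), nrow = 4,
              dimnames = list(paste0("R", 1:4), c("C1", "C2")))
boot <- bootstrap_catalogue(toy)
# Each bootstrapped sample keeps the same total number of mutations
colSums(boot)
```

Each bootstrapped catalogue would then be decomposed nrepeats times, which is why the total number of NMF runs per value of nsig grows as nboots times nrepeats.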
SignatureExtraction(
  cat,
  outFilePath,
  blacklist = c(),
  nrepeats = 10,
  nboots = 20,
  clusteringMethod = "PAM",
  completeLinkageFlag = FALSE,
  useMaxMatching = TRUE,
  filterBestOfEachBootstrap = TRUE,
  filterBest_RTOL = 0.001,
  filterBest_nmaxtokeep = 10,
  nparallel = 1,
  nsig = c(3:15),
  mut_thr = 0,
  type_of_extraction = "subs",
  project = "extraction",
  parallel = FALSE,
  nmfmethod = "brunet",
  removeDuplicatesInCatalogue = FALSE,
  normaliseCatalogue = FALSE,
  plotCatalogue = FALSE,
  plotResultsFromAllClusteringMethods = TRUE
)
cat |
matrix with samples as columns and channels as rows |
outFilePath |
path where the extraction output files should go. Remember to add "/" at the end of the path |
blacklist |
list of samples (column names) to ignore |
nrepeats |
how many NMF runs for each bootstrap. If filterBestOfEachBootstrap=TRUE with default parameters, at most 10 runs within 0.1 percent of the best are considered, so nrepeats should be at least 10 |
nboots |
how many bootstrapped catalogues to use |
clusteringMethod |
choose among "HC", "PAM", "MC": hierarchical clustering (HC), partitioning around medoids (PAM) or matched clustering (MC) |
completeLinkageFlag |
if clusteringMethod="HC", use complete linkage instead of default average linkage |
useMaxMatching |
if clusteringMethod="MC", use the assignment problem algorithm (match with max similarity) instead of the stable matching algorithm (any stable match) |
filterBestOfEachBootstrap |
if TRUE, only the runs (out of nrepeats) that are within filterBest_RTOL*best of the best are kept, up to a maximum of filterBest_nmaxtokeep |
filterBest_RTOL |
relative tolerance from the best fit to consider a run as good as the best; RTOL=0.001 is recommended |
filterBest_nmaxtokeep |
max number of runs that should be kept that are within the relative tolerance from the best |
nparallel |
how many processing units to use |
nsig |
vector of numbers of signatures to try |
mut_thr |
threshold of mutations to remove empty/almost empty rows and columns |
type_of_extraction |
choose among "subs","rearr","generic" |
project |
give a name to your project |
parallel |
set to TRUE to use parallel computation (Recommended) |
nmfmethod |
choose among "brunet", "lee", "nsNMF"; this choice is passed to the NMF::nmf function |
removeDuplicatesInCatalogue |
remove near-duplicate samples (cosine similarity of 0.99 to another sample) |
normaliseCatalogue |
scale samples to sum to 1 |
plotCatalogue |
also plot the catalogue. This may crash the library if the catalogue is too big; it should work up to ~300 samples |
plotResultsFromAllClusteringMethods |
if TRUE, all clustering methods are used and results are reported and plotted for all of them. If FALSE, only the requested clustering is reported |
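The interaction between nrepeats, filterBest_RTOL and filterBest_nmaxtokeep described above can be sketched as follows. This is a minimal illustration, assuming each run is summarised by a single reconstruction error where lower is better; filter_best_runs is a hypothetical name and the package's internal filtering may be implemented differently.

```r
# Hypothetical helper, not part of the package: choose which runs of one
# bootstrap to keep, given their reconstruction errors (lower is better).
filter_best_runs <- function(errors, rtol = 0.001, nmaxtokeep = 10) {
  best <- min(errors)
  # Runs whose error is within rtol * best of the best error
  close_enough <- which(errors <= best * (1 + rtol))
  # Rank the qualifying runs from best to worst and cap how many are kept
  ranked <- close_enough[order(errors[close_enough])]
  head(ranked, nmaxtokeep)
}

errors <- c(100.00, 100.05, 100.20, 100.09, 103.00)
filter_best_runs(errors)                  # runs 1, 2 and 4 are kept
filter_best_runs(errors, nmaxtokeep = 2)  # runs 1 and 2 are kept
```

With the default tolerance of 0.001, a run must be within 0.1 percent of the best error to be retained, so setting nrepeats below filterBest_nmaxtokeep means the cap can never be reached.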
result files will be available in the outFilePath directory
# Build a random catalogue with 96 channels (rows) and 50 samples (columns)
n_row <- 96
n_col <- 50
rnd_matrix <- round(matrix(runif(n_row * n_col, min = 0, max = 50), nrow = n_row, ncol = n_col))
colnames(rnd_matrix) <- paste0("C", 1:n_col)
row.names(rnd_matrix) <- paste0("R", 1:n_row)
# Run a small extraction: 2 bootstraps, 10 runs each, trying 2 and 3 signatures
SignatureExtraction(cat = rnd_matrix,
                    outFilePath = "extraction_test_subs/",
                    nrepeats = 10,
                    nboots = 2,
                    nparallel = 2,
                    nsig = 2:3,
                    mut_thr = 0,
                    type_of_extraction = "subs",
                    project = "test",
                    parallel = TRUE,
                    nmfmethod = "brunet")
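To make the effect of the blacklist and normaliseCatalogue arguments concrete, the sketch below applies equivalent steps by hand to a small catalogue. It is only an illustration under the assumption that the catalogue has named columns; prepare_catalogue is a hypothetical helper, and the function itself may apply these steps in a different order or with additional checks.

```r
# Hypothetical helper, not part of the package: drop blacklisted samples and
# optionally scale each remaining sample (column) to sum to 1.
prepare_catalogue <- function(catalogue, blacklist = c(), normalise = FALSE) {
  # Keep only the columns whose names are not in the blacklist
  keep <- !(colnames(catalogue) %in% blacklist)
  catalogue <- catalogue[, keep, drop = FALSE]
  if (normalise) {
    totals <- colSums(catalogue)
    # Avoid dividing by zero for empty samples
    totals[totals == 0] <- 1
    catalogue <- sweep(catalogue, 2, totals, "/")
  }
  catalogue
}

toy <- matrix(c(1, 3, 2, 2, 0, 0), nrow = 2,
              dimnames = list(c("R1", "R2"), c("C1", "C2", "C3")))
prepared <- prepare_catalogue(toy, blacklist = "C2", normalise = TRUE)
colnames(prepared)  # "C1" "C3"
```

Passing blacklist and normaliseCatalogue directly to SignatureExtraction is simpler in practice; the manual version is mainly useful for inspecting the matrix before a long extraction.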