process_files_groups: Process a group of files for clustering


View source: R/cluster.R

Description

Process a group of files for clustering

Usage

process_files_groups(
  files,
  col.names,
  num.clusters,
  num.samples,
  asinh.cofactor,
  downsample.to,
  output.dir,
  negative.values,
  quantile.prob
)
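
As an illustration of how the arguments fit together, the following is a minimal sketch of a call. All values are hypothetical placeholders (the output name, file paths, channel names, and numeric settings are not taken from the package), and passing negative.values as a character string is an assumption. The function is looked up with utils::getFromNamespace so that the sketch works whether or not it is exported, and the call is only attempted when the package and the input files are actually present.

```r
# Hypothetical inputs: first element is the output name, the rest are file paths.
files <- c("pooled_sample", "sample_A.fcs", "sample_B.fcs")

args <- list(
  files = files,
  col.names = c("CD3", "CD4", "CD8"),  # hypothetical channel names
  num.clusters = 200,
  num.samples = 50,
  asinh.cofactor = 5,
  downsample.to = 0,                   # 0 disables downsampling
  output.dir = "clustering_output",
  negative.values = "truncate",        # assumed to be passed as a string
  quantile.prob = 0.05                 # only relevant for negative.values = "shift"
)

# Only attempt the call when the package and the input files are available.
if (requireNamespace("grappolo", quietly = TRUE) && all(file.exists(files[-1]))) {
  f <- utils::getFromNamespace("process_files_groups", "grappolo")
  do.call(f, args)
}
```

Building the argument list first and calling through do.call keeps the nine required arguments visible in one place.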

Arguments

files

A vector of strings. The first string is the name to be used for the clustering output; the remaining strings are the paths of the files that will be pooled together for clustering

col.names

A vector of column names indicating which columns should be used for clustering

num.clusters

The desired number of clusters

num.samples

Number of samples to be used for the CLARA algorithm (see cluster::clara)
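
To show what these two arguments control, here is a small sketch that calls cluster::clara directly on simulated data. It assumes that num.clusters corresponds to the k argument of clara and num.samples to its samples argument; the matrix m and its column names are made up for the example.

```r
library(cluster)

set.seed(42)
# Simulated data: two well-separated groups of 100 events, 2 channels each.
m <- rbind(
  matrix(rnorm(200, mean = 0), ncol = 2),
  matrix(rnorm(200, mean = 6), ncol = 2)
)
colnames(m) <- c("marker1", "marker2")  # hypothetical channel names

# k plays the role of num.clusters, samples the role of num.samples (assumed mapping).
fit <- clara(m, k = 2, samples = 10)

table(fit$clustering)  # number of events assigned to each cluster
fit$medoids            # one representative event (medoid) per cluster
```

CLARA clusters repeated sub-samples of the data and keeps the best set of medoids, so a larger number of samples generally trades run time for a more stable result.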

asinh.cofactor

Cofactor for asinh transformation. If this is NULL no transformation is performed
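
A minimal sketch of the transformation, assuming the convention common in cytometry of computing asinh(x / cofactor). The helper asinh_transform is hypothetical and not part of the package.

```r
# Hypothetical helper: applies asinh(x / cofactor), or returns x unchanged
# when cofactor is NULL (mirroring the documented NULL behaviour).
asinh_transform <- function(x, cofactor = NULL) {
  if (is.null(cofactor)) {
    return(x)
  }
  asinh(x / cofactor)
}

x <- c(-10, 0, 5, 500)
asinh_transform(x, cofactor = 5)  # compresses large values, roughly linear near 0
asinh_transform(x)                # NULL cofactor: values returned as-is
```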

downsample.to

The number of events that should be randomly sampled from each file before pooling. If this is 0, no sampling is performed
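
The per-file sampling and pooling can be sketched in base R as follows. The helper downsample_and_pool is hypothetical, and it assumes that a file with fewer events than downsample.to is kept whole; the package may handle that case differently.

```r
# Hypothetical helper: sample up to downsample.to rows from each table, then pool.
downsample_and_pool <- function(tables, downsample.to = 0) {
  sampled <- lapply(tables, function(tab) {
    if (downsample.to > 0 && nrow(tab) > downsample.to) {
      # Random sample of rows without replacement
      tab[sample(nrow(tab), downsample.to), , drop = FALSE]
    } else {
      # downsample.to = 0, or too few events: keep every row (assumption)
      tab
    }
  })
  do.call(rbind, sampled)
}

set.seed(1)
a <- data.frame(CD3 = rnorm(100), CD4 = rnorm(100))  # simulated "file" with 100 events
b <- data.frame(CD3 = rnorm(30), CD4 = rnorm(30))    # simulated "file" with 30 events

pooled <- downsample_and_pool(list(a, b), downsample.to = 50)
nrow(pooled)  # 50 sampled from a + all 30 from b = 80
```

Sampling before pooling keeps a very large file from dominating the clustering.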

output.dir

The name of the output directory. It will be created if it does not exist

negative.values

How to deal with negative values in the data. If this is NULL negative values are left as is. Otherwise two options are possible:

  • truncate: Negative values will be truncated (i.e. replaced with 0)

  • shift: The data will be shifted so that only a fraction quantile.prob of the values for each channel will be truncated to 0. This option is useful when the range of the data extends significantly into negative values, for instance due to compensation.

quantile.prob

Only used if negative.values is set to shift. The quantile of measurements that are going to be truncated to 0. For instance, if this is 0.05, the data will be shifted so that only 5 percent of the values are negative, and those values will be truncated to 0
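
The two options for negative.values can be sketched as below. This is an illustration under stated assumptions, not the package's implementation: the helper handle_negatives is hypothetical, the options are assumed to be passed as the strings "truncate" and "shift", and the shift is assumed to subtract each channel's quantile.prob quantile before truncating.

```r
# Hypothetical helper illustrating the documented options.
handle_negatives <- function(m, negative.values = NULL, quantile.prob = 0.05) {
  if (is.null(negative.values)) {
    return(m)  # NULL: negative values are left as is
  }
  negative.values <- match.arg(negative.values, c("truncate", "shift"))
  if (negative.values == "shift") {
    # Subtract each channel's quantile so that about quantile.prob of its
    # values end up below 0 (assumed interpretation of the shift).
    q <- apply(m, 2, quantile, probs = quantile.prob)
    m <- sweep(m, 2, q, "-")
  }
  m[m < 0] <- 0  # both options end by truncating what remains negative
  m
}

m <- cbind(ch1 = c(-100, -50, 0, 50, 100), ch2 = c(-4, -2, 0, 2, 4))

handle_negatives(m, "truncate")                    # negatives replaced with 0
handle_negatives(m, "shift", quantile.prob = 0.25) # each channel shifted, then truncated
```

With truncate, every negative value collapses to 0. With shift, only the lowest fraction of each channel does, which preserves more of the spread among values that were originally negative.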


ParkerICI/grappolo documentation built on April 8, 2021, 11:03 a.m.