non_redundant_hier: Non-redundant set of samples using hierarchical search

View source: R/non_redundant_hier.R

non_redundant_hier    R Documentation

Non-redundant set of samples using hierarchical search

Description

Creates a non-redundant subset of samples. The function accepts an accnet or mash object and returns an nr_list object. The subset can be defined by a minimum distance (Mash or Jaccard distance), a specific number of elements, or a fraction of the whole sample set. The function performs an iterative search, so the number of returned elements may differ from the number requested; the tolerance parameter sets how large this difference may be. This function performs a hierarchical search. For small datasets (fewer than 2000 samples), non_redundant() could be faster. This function consumes less memory than non_redundant(), so it is better suited to large datasets.
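The idea of searching iteratively for a distance cutoff that yields roughly the requested number of samples can be sketched in base R. This is an illustration only, not the PATO implementation: it swaps in hclust() and cutree() from the stats package for the package's own clustering, uses random toy data, and the helper find_cut() and all of its arguments are hypothetical names.

```r
# Sketch only: NOT the PATO implementation. Uses base R (stats) hierarchical
# clustering as a stand-in to show an iterative search with a tolerance.
set.seed(1)
x <- matrix(rnorm(60 * 5), nrow = 60)    # toy data: 60 samples, 5 features
tree <- hclust(dist(x), method = "complete")

# Bisect on the cut height until the cluster count is close to `target`.
find_cut <- function(tree, target, tolerance = 0.05, max_iter = 100) {
  lo <- 0
  hi <- max(tree$height)
  for (i in seq_len(max_iter)) {
    h <- (lo + hi) / 2
    k <- length(unique(cutree(tree, h = h)))
    if (abs(k - target) <= tolerance * target) break
    if (k > target) lo <- h else hi <- h   # too many clusters: raise the cut
  }
  list(height = h, n_clusters = k, iterations = i)
}

res <- find_cut(tree, target = 20)

# Keep the first member of each cluster as its representative.
clusters <- cutree(tree, h = res$height)
representatives <- as.integer(
  tapply(seq_along(clusters), clusters, function(idx) idx[1])
)
```

Because the search stops as soon as the count falls inside the tolerated range, the number of representatives may differ from the target, which mirrors the behaviour described above.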

Usage

non_redundant_hier(
  data,
  number,
  fraction,
  distance,
  tolerance = 0.05,
  partitions = 10,
  max_iter = 10000,
  fast = FALSE,
  snps
)
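A minimal sketch of the three ways to size the subset, using only the arguments listed in the usage above. Several things here are assumptions: that the package is installed and loadable as pato, that my_mash is an accnet or mash object built beforehand (a hypothetical name), that only one of number, fraction, or distance is supplied per call, and that the numeric values, which are arbitrary, suit the data.

```r
# Sketch: assumes the PATO package is installed and loadable as `pato`.
# `my_mash` is a hypothetical accnet or mash object supplied by the caller.
nr_examples <- function(my_mash) {
  list(
    # Ask for about 100 representatives (tolerance defaults to 0.05).
    by_number = pato::non_redundant_hier(my_mash, number = 100),
    # Keep about 10% of the whole sample set.
    by_fraction = pato::non_redundant_hier(my_mash, fraction = 0.1),
    # Keep samples separated by at least this distance (arbitrary value).
    by_distance = pato::non_redundant_hier(my_mash, distance = 0.01)
  )
}
```

Wrapping the calls in a function keeps the sketch free of side effects until real data is passed in; each element of the returned list would be an nr_list object.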

Arguments

data

An accnet or mash object.

number

The number of non-redundant samples.

fraction

The fraction of the whole sample set to keep as non-redundant samples.

distance

Minimum distance among samples.

tolerance

Allowed relative error between the requested number and the final number of samples (default 0.05).

partitions

Number of partitions used in the hierarchical search.

max_iter

Maximum number of search iterations.

fast

If fast is TRUE, the clustering process uses "components"; otherwise it uses "fast_greedy".
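To make the tolerance argument concrete, the arithmetic below shows the range of results that would be accepted for a given request. It assumes tolerance is a relative error applied to the requested number, which is an interpretation of the argument description and has not been checked against the package source; the variable names are illustrative.

```r
# Assumption: tolerance is a relative error on the requested number.
number <- 200        # requested number of non-redundant samples
tolerance <- 0.05    # default value from the usage

accepted <- c(
  lower = number * (1 - tolerance),
  upper = number * (1 + tolerance)
)
```

Under this reading, a request for 200 samples with the default tolerance could return roughly 190 to 210 samples.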

Value

An nr_list object.

See Also

extract_non_redundant


irycisBioinfo/PATO documentation built on Oct. 19, 2023, 3:07 p.m.