findThreshold: findThreshold


View source: R/findThreshold.R

Description

Identify a distance threshold predicting whether a pairwise distance represents a comparison between objects in the same class (within-group comparison) or different classes (between-group comparison) given a matrix providing distances between objects and the group membership of each object.

Usage

findThreshold(dmat, groups, distances, method = "mutinfo", prob = 0.5,
              na.rm = FALSE, keep.dists = TRUE, roundCuts = 2,
              minCuts = 20, maxCuts = 300, targetCuts = 100,
              verbose = FALSE, depth = 1, ...)

partition(dmat, groups, include, verbose = FALSE)

Arguments

dmat

Square matrix of pairwise distances.

groups

Object coercible to a factor identifying group membership of objects corresponding to either edge of dmat.

include

Vector (numeric or logical) indicating which elements to retain in the output; comparisons including an excluded element will have a value of NA.

distances

Optional output of partition, provided in place of dmat and groups.

method

The method for calculating the threshold; only 'mutinfo' is currently implemented.

prob

Sets the upper and lower bounds of D as some quantile of the within-class distances and between-class distances, respectively.

na.rm

If TRUE, excludes NA elements in groups and corresponding rows and columns in dmat. Ignored if distances is provided.

keep.dists

If TRUE, the output will contain the distances element (output of partition).

roundCuts

Number of digits to which cutoff values are rounded (see Details).

minCuts

Minimal length of vector of cutoffs (see Details).

maxCuts

Maximal length of vector of cutoffs (see Details).

targetCuts

Length of vector of cutoffs if the conditions set by minCuts and maxCuts are not met (see Details).

verbose

Terminal output is produced if TRUE.

depth

Private argument used to track level of recursion.

...

Extra arguments are ignored.

Details

findThreshold is used internally in classify, but may also be used to calculate a starting value of $D$.

partition is used to transform a square (or lower triangular) distance matrix into a data.frame containing a column of distances ($vals) along with a factor ($comparison) defining each distance as a within- or between-group comparison. Columns $row and $col provide indices of corresponding rows and columns of dmat.
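The layout described above can be sketched in base R. This is only an illustration of the data.frame structure (columns vals, comparison, row, col), not the clst implementation; the row-by-row ordering of pairs and the helper name idx are assumptions made for the example.

```r
# Sketch of a partition-like data.frame built with base R only.
# Assumption: pairs are taken from the upper triangle, ordered by row, then column.
data(iris)
dmat <- as.matrix(dist(iris[, 1:4], method = "euclidean"))
groups <- factor(iris$Species)

# All index pairs with row < col
idx <- which(upper.tri(dmat), arr.ind = TRUE)
idx <- idx[order(idx[, "row"], idx[, "col"]), , drop = FALSE]

# A comparison is "within" when both objects share a group, otherwise "between"
same <- groups[idx[, "row"]] == groups[idx[, "col"]]

distances <- data.frame(
  vals = dmat[idx],
  comparison = factor(ifelse(same, "within", "between"),
                      levels = c("within", "between"), ordered = TRUE),
  row = unname(idx[, "row"]),
  col = unname(idx[, "col"])
)

str(distances)
```

For 150 objects this yields choose(150, 2) = 11175 rows, one per unordered pair.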

Value

In the case of findThreshold, output is a list with elements described below. In the case of partition, output is the data.frame returned as the element named $distances in the output of findThreshold.

D

The distance threshold (distance cutoff corresponding to the PMMI).

pmmi

Value of the point of maximal mutual information (PMMI).

interval

A vector of length 2 indicating the upper and lower bounds over which values for the threshold are evaluated.

breaks

A data.frame with columns x and y providing candidate breakpoints and corresponding mutual information values, respectively.

distances

If keep.dists is TRUE, a data.frame containing pairwise distances identified as within- or between-class comparisons.

method

Character corresponding to input argument method.

params

Additional input parameters.
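To make the relationship between breaks, pmmi and D concrete, the following is a minimal sketch of choosing a cutoff by maximal mutual information on toy data. It is not the clst code: the function name mutinfo_at, the use of log base 2 (bits), and the fixed grid of cutoffs are all assumptions made for illustration.

```r
# Mutual information between the indicator (vals <= cutoff) and the
# within/between label. Hypothetical helper; units are bits (log2).
mutinfo_at <- function(vals, comparison, cutoff) {
  joint <- table(vals <= cutoff, comparison) / length(vals)  # joint probabilities
  expected <- outer(rowSums(joint), colSums(joint))          # product of marginals
  nz <- joint > 0                                            # skip empty cells (0 * log 0 = 0)
  sum(joint[nz] * log2(joint[nz] / expected[nz]))
}

# Toy data: two small within-group distances, two large between-group distances
vals <- c(1, 2, 3, 4)
comparison <- factor(c("within", "within", "between", "between"),
                     levels = c("within", "between"))

# Evaluate a grid of candidate cutoffs (analogous to breaks$x and breaks$y)
cuts <- seq(0.5, 4.5, by = 1)
mi <- sapply(cuts, function(cutoff) mutinfo_at(vals, comparison, cutoff))

D <- cuts[which.max(mi)]  # cutoff at the point of maximal mutual information
pmmi <- max(mi)           # the maximal mutual information itself
```

Here the cutoff 2.5 separates the two classes perfectly, so it carries the highest mutual information of the grid.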

Author(s)

Noah Hoffman

See Also

plotDistances, plotMutinfo

Examples

data(iris)
dmat <- as.matrix(dist(iris[,1:4], method="euclidean"))
groups <- iris$Species
thresh <- findThreshold(dmat, groups, method="mutinfo")
str(thresh)

Example output

List of 7
 $ D        : num 2.04
 $ pmmi     : num 0.395
 $ interval : num [1:2] 0.837 3.401
 $ breaks   :'data.frame':	259 obs. of  2 variables:
  ..$ x: num [1:259] 0.837 0.84 0.85 0.86 0.87 ...
  ..$ y: num [1:259] 0.238 0.24 0.243 0.246 0.25 ...
 $ distances:'data.frame':	11175 obs. of  4 variables:
  ..$ vals      : num [1:11175] 0.539 0.51 0.648 0.141 0.616 ...
  ..$ comparison: Ord.factor w/ 2 levels "within"<"between": 1 1 1 1 1 1 1 1 1 1 ...
  ..$ row       : int [1:11175] 1 1 1 1 1 1 1 1 1 1 ...
  ..$ col       : int [1:11175] 2 3 4 5 6 7 8 9 10 11 ...
 $ method   : chr "mutinfo"
 $ params   :List of 5
  ..$ prob      : num 0.5
  ..$ roundCuts : num 2
  ..$ minCuts   : num 20
  ..$ maxCuts   : num 300
  ..$ targetCuts: num 100

clst documentation built on Nov. 8, 2020, 5:41 p.m.