gower_dist: Gower's distance


Description

Compute Gower's distance pairwise between records in two data sets, x and y. Records from the smaller data set are recycled.

Usage

gower_dist(x, y, pair_x = NULL, pair_y = NULL, eps = 1e-08,
  nthread = getOption("gd_num_thread"))

Arguments

x

[data.frame]

y

[data.frame]

pair_x

[numeric|character] (optional) Columns in x used for comparison. See Details below.

pair_y

[numeric|character] (optional) Columns in y used for comparison. See Details below.

eps

[numeric] (optional) Computed numbers (variable ranges) smaller than eps are treated as zero.

nthread

Number of threads to use for parallelization. By default, 2 threads are used on a dual-core machine. On any other machine, n-1 cores are used so that the machine does not freeze during a large computation. The maximum number of threads is determined from omp::get_max_threads.

Value

A numeric vector of length max(nrow(x),nrow(y)).
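
As a minimal sketch of the recycling behaviour described above, assuming the gower package is installed and using the built-in iris data purely for illustration, a single record in x can be compared against every record in y:

```r
library(gower)

# x has one record, y has 150; the single record in x is recycled
# so that it is compared with each record in y.
d <- gower_dist(iris[1, ], iris)

# The result has one distance per record of the larger data set.
length(d) == max(nrow(iris[1, ]), nrow(iris))
```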

Details

There are three ways to specify which columns of x should be compared with which columns of y. The first option is to give no specification. In that case, columns with matching names are used. The second option is to use only the pair_y argument, specifying for each column in x, in order, which column in y it must be paired with (use 0 to skip a column in x). The third option is to specify the columns to be matched explicitly using both pair_x and pair_y.
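
The three options can be sketched as follows. This is an illustration only: it assumes the gower package is installed, uses slices of the built-in iris data, and renames the columns of the second data set to hypothetical names (v1 to v5) so that name matching no longer applies.

```r
library(gower)

dat1 <- iris[1:10, ]
dat2 <- iris[6:15, ]

# Option 1: no specification. Columns with matching names are compared.
d1 <- gower_dist(dat1, dat2)

# Rename the columns of dat2 (hypothetical names) so that no names match.
names(dat2) <- paste0("v", 1:5)

# Option 2: pair_y only. Column i of dat1 is paired with column pair_y[i] of dat2.
d2 <- gower_dist(dat1, dat2, pair_y = 1:5)

# Option 2 with skipping: a 0 skips the corresponding column of dat1,
# so only the first two columns are compared here.
d2_skip <- gower_dist(dat1, dat2, pair_y = c(1, 2, 0, 0, 0))

# Option 3: state the pairs explicitly with both pair_x and pair_y.
d3 <- gower_dist(dat1, dat2, pair_x = c(1, 2), pair_y = c(1, 2))
```

Numeric column positions are used here for pair_x and pair_y. According to the argument descriptions, column names given as character vectors are accepted as well.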

Note

Gower (1971) originally defined a similarity measure (s, say) with values ranging from 0 (completely dissimilar) to 1 (completely similar). The distance returned here equals 1-s.
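
To make the relation between s and the returned distance concrete, here is a rough base R sketch for a single pair of records. It is not the package's implementation: the helper gower_sketch, its ranges argument, and the example data are all hypothetical, and missing values and zero ranges (the eps argument) are ignored. It assumes the usual per-variable scores: 1 minus the range-scaled absolute difference for numeric columns, and exact match for categorical columns.

```r
# Hypothetical helper: Gower distance between two one-row data frames
# with the same column names. `ranges` is a named numeric vector giving
# the range of each numeric column.
gower_sketch <- function(a, b, ranges) {
  s <- vapply(names(a), function(nm) {
    if (is.numeric(a[[nm]])) {
      # Numeric column: similarity is 1 minus the range-scaled difference.
      1 - abs(a[[nm]] - b[[nm]]) / ranges[[nm]]
    } else {
      # Categorical column: similarity is 1 for a match, 0 otherwise.
      as.numeric(as.character(a[[nm]]) == as.character(b[[nm]]))
    }
  }, numeric(1))
  # The similarity s is the mean per-column score; the distance is 1 - s.
  1 - mean(s)
}

x <- data.frame(height = 1.0, colour = "red")
y <- data.frame(height = 3.0, colour = "blue")

# height: 1 - |1 - 3| / 4 = 0.5; colour: 0; s = 0.25; distance = 0.75
d_xy <- gower_sketch(x, y, ranges = c(height = 4))
d_xx <- gower_sketch(x, x, ranges = c(height = 4))
```

Identical records have similarity 1 in every column, so their distance is 0.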

References

Gower, John C. "A general coefficient of similarity and some of its properties." Biometrics (1971): 857-871.

See Also

gower_topn


