matching: A pair distance for binary/ categorical variables

matching R Documentation

A pair distance for binary/ categorical variables

Description

This function computes the simple matching distance between the rows of two data frames or matrices.

Usage

matching(x, y)

Arguments

x

A first data frame or matrix (see Details).

y

A second data frame or matrix (see Details).

Details

The x and y arguments must be data frames or matrices with the same number of columns, where each row is an object and each column is a variable. The function calculates all pairwise distances between the rows of x and the rows of y. If x and y are the same data frame or matrix, the result contains all pairwise distances among the objects in x.

The simple matching distance between objects i and j is calculated by

d_{ij} = \frac{\sum_{s=1}^{P}(x_{is}-x_{js})}{P}

where P is the number of variables and x_{is}-x_{js} \in \{0, 1\} is a mismatch indicator rather than an arithmetic difference: x_{is}-x_{js} = 0 if x_{is} = x_{js}, and x_{is}-x_{js} = 1 if x_{is} \neq x_{js}.

As an example, consider objects 1 and 2, each measured on three categorical variables x, y, and z.

object  x  y  z
1       1  2  2
2       1  2  1

The distance between objects 1 and 2 is

d_{12} = \frac{\sum_{s=1}^{3}(x_{1s}-x_{2s})}{3} = \frac{0 + 0 + 1}{3} = \frac{1}{3} \approx 0.33
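
The arithmetic above can be checked with a few lines of base R. This is a minimal sketch and not the package's implementation; the names obj1, obj2, mismatch, and d12 are introduced here for illustration only.

```r
# Two objects from the table above, each measured on three variables
obj1 <- c(x = 1, y = 2, z = 2)
obj2 <- c(x = 1, y = 2, z = 1)

# A mismatch contributes 1 to the sum, a match contributes 0
mismatch <- obj1 != obj2

# Simple matching distance: proportion of variables that differ
d12 <- sum(mismatch) / length(obj1)
print(d12)
```

Only the third variable differs, so one of the three terms in the sum is 1 and the result is one third.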

Value

The function returns a distance matrix whose number of rows equals the number of objects in x (n_x) and whose number of columns equals the number of objects in y (n_y).
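
The shape of the result can be illustrated with a small base R sketch. The helper simple_matching_sketch is hypothetical and written only for illustration; it is not the source of matching(), and it assumes x and y have the same number of columns.

```r
# Hypothetical helper, for illustration only (not the kmed implementation)
simple_matching_sketch <- function(x, y) {
  x <- as.matrix(x)
  y <- as.matrix(y)
  stopifnot(ncol(x) == ncol(y))
  # Entry [i, j] is the proportion of variables on which
  # row i of x differs from row j of y
  d <- matrix(0, nrow = nrow(x), ncol = nrow(y))
  for (i in seq_len(nrow(x))) {
    for (j in seq_len(nrow(y))) {
      d[i, j] <- mean(x[i, ] != y[j, ])
    }
  }
  d
}

# x holds 2 objects and y holds 3 objects, both on 3 variables
x <- matrix(c(1, 2, 2,
              1, 2, 1), nrow = 2, byrow = TRUE)
y <- matrix(c(1, 2, 2,
              2, 1, 1,
              1, 2, 1), nrow = 3, byrow = TRUE)

res <- simple_matching_sketch(x, y)
dim(res)  # n_x rows and n_y columns
```

With n_x = 2 and n_y = 3, the sketch yields a 2 by 3 matrix, one row per object in x and one column per object in y.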

Author(s)

Weksi Budiaji
Contact: budiaji@untirta.ac.id

Examples

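set.seed(1)
# 7 objects measured on 3 binary variables coded 1 or 2
a <- matrix(sample(1:2, 7*3, replace = TRUE), 7, 3)
# All pairwise simple matching distances among the rows of a (7 x 7 matrix)
matching(a, a)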


kmed documentation built on Aug. 29, 2022, 9:06 a.m.