affinity: Affinity Calculation.


View source: R/affinity.R


Calculates affinity based on Cranmer and Gill (2013). The function implements the original method described in the article and also a variant that takes the correlation structure of the observed data into account, which increases efficiency in making matches. Affinity is calculated by first identifying whether two observations are sufficiently ‘close’ on each variable. Consider the target observation number 1. If observation i is close to the target observation on variable j, then A[i,j] = 1; otherwise, A[i,j] = 0. Two observations are close on a discrete variable if they take on the same value. They are close on a continuous variable if their values are no more than 1 apart. While this may seem restrictive and arbitrary, the main package function hot.deck has an argument, cutoffSD, that lets the user set how many standard deviations equal a distance of 1.
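The closeness rule above can be sketched in base R. This is a minimal illustration, not the package's own code: the helper name closeness_sketch, its discrete argument, and the toy data are all hypothetical, and averaging the indicators into a single score per observation is an assumption based on the unweighted measure in the article.

```r
# Hypothetical helper, not the package's affinity(): builds the 0/1 closeness
# matrix A for one target row, then averages it across variables.
closeness_sketch <- function(x, target, discrete) {
  x <- as.matrix(x)
  # Repeat the target row so it can be compared with every row at once.
  ref <- matrix(x[target, ], nrow = nrow(x), ncol = ncol(x), byrow = TRUE)
  A <- abs(x - ref) <= 1                    # continuous: distance no greater than 1
  A[, discrete] <- (x == ref)[, discrete]   # discrete: same value
  A <- A * 1                                # logical -> 0/1, NA stays NA
  # Assumption: the score is the share of comparable variables that are close.
  score <- rowMeans(A, na.rm = TRUE)
  score[target] <- NA                       # the target is not compared with itself
  list(A = A, score = score)
}

# Toy data: one standardized continuous variable and one discrete variable.
x <- cbind(age_z = c(0.2, 0.9, 2.5, NA),
           group = c(1, 1, 2, 1))
res <- closeness_sketch(x, target = 1, discrete = c(FALSE, TRUE))
res$A
res$score
```

Here observation 3 differs from the target on both variables, while observation 4 is only compared on the variable it has observed.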


affinity(data, index, column = NULL, R = NULL, weighted = FALSE)



data: A data frame or matrix of values for which affinity should be calculated.


index: A row number identifying the target observation. Affinity will be calculated between this observation and all others in the dataset.


column: A column number identifying the variable with missing information. This is only needed for the optional correlation-weighted affinity score. The correlation that is used is the correlation of all variables with the focus variable (i.e., the column).


R: A correlation matrix for data.


weighted: Logical indicating whether or not the correlation-weighted affinity measure should be used.
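The text above does not spell out how the correlations enter the weighted score. One plausible reading, shown here only as a sketch, is to weight each variable's 0/1 closeness indicator by that variable's correlation with the focus column and sum across variables. The helper name weighted_sketch, the treatment of missing indicators as zero, and the toy matrices are assumptions, not the package's documented behavior.

```r
# Hypothetical sketch: weight each closeness indicator by the correlation of
# its variable with the focus column. The weighting rule is an assumption.
weighted_sketch <- function(A, R, column) {
  A[is.na(A)] <- 0          # assumption: non-comparable cells count as "not close"
  drop(A %*% R[, column])   # one weighted sum per observation
}

# Toy closeness matrix: 2 observations by 3 variables.
A <- rbind(c(1, 1, 0),
           c(0, 1, 1))
# Toy correlation matrix for the 3 variables.
R <- matrix(c(1.0, 0.5, 0.2,
              0.5, 1.0, 0.4,
              0.2, 0.4, 1.0), nrow = 3, byrow = TRUE)
w <- weighted_sketch(A, R, column = 3)
w
```

Under this reading, an observation that is close on variables strongly correlated with the focus variable scores higher than one that is close only on weakly correlated variables.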


A matrix of affinity scores with one row per missing observation-variable combination and one column per observation in data.


library(hot.deck); data(D)  # load the package and its example dataset D
out <- hot.deck(D)

hot.deck documentation built on Aug. 17, 2021, 5:09 p.m.