fine: Expand a Distance Matrix for Matching with Fine Balance.

Description

Expands a distance matrix into a square matrix so that optimal pair matching enforces fine balance. The method is discussed in Chapter 10 of Design of Observational Studies (2010). It is conceptually the simplest way to implement fine balance, so it remains useful for teaching and for self-study; see Details. For practical work, consider also the rcbalance, designmatch and bigmatch packages; see References.

Usage

fine(dmat, z, f, mult = 100)

Arguments

dmat

A distance matrix with one row for each treated individual and one column for each control. Often, this is either the Mahalanobis distance based on covariates, mahal(), or else a robust variant produced by smahal(). The distance matrix dmat may have been penalized by addalmostexact() or addcaliper(). An error will result unless dmat has more columns than rows.

z

z is a vector that is 1 for a treated individual and 0 for a control. The number of treated subjects, sum(z), must equal the number of rows of dmat, and the number of potential controls, sum(1-z), must equal the number of columns of dmat.

f

A factor or vector to be finely balanced. Must have length(f)=length(z).

mult

A positive number, mult>0. Determines the penalty used to enforce fine balance, computed as max(dmat)*mult. If dmat has already been penalized by addalmostexact() or addcaliper(), it makes sense to set mult=1 or mult=2 rather than the default, mult=100. When dmat is already penalized, taking mult>1 makes the penalty for deviations from fine balance larger than the existing penalties.
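The requirements on the arguments and the penalty arithmetic can be illustrated with a minimal base-R sketch. The toy values of z, f and dmat below are hypothetical and chosen only for illustration; the checks restate the conditions documented above and are not taken from the package source.

```r
# Hypothetical toy inputs: 2 treated units and 5 potential controls.
z <- c(1, 1, 0, 0, 0, 0, 0)
f <- c("a", "b", "a", "a", "b", "b", "b")
dmat <- matrix(c(1, 2,  2, 1,  7, 6,  8, 7,  9, 8), nrow = 2)

# Conditions stated in the argument descriptions.
stopifnot(all(z %in% c(0, 1)))
stopifnot(nrow(dmat) == sum(z), ncol(dmat) == sum(1 - z))
stopifnot(ncol(dmat) > nrow(dmat))   # more controls than treated units
stopifnot(length(f) == length(z))

# The fine balance penalty is max(dmat) * mult.
penalty_default <- max(dmat) * 100   # default mult = 100
penalty_small   <- max(dmat) * 1     # mult = 1, suited to an already penalized dmat
```

With an unpenalized dmat the default gives a penalty far above any real distance; with a dmat that already contains caliper penalties, max(dmat) is itself a penalty, so a small mult is enough.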

Details

The method is discussed in Chapter 10 of Design of Observational Studies (2010), and it is conceptually the simplest way to implement fine balance. However, the expanded distance matrix can become quite large, so this simplest method is not the most efficient method in its use of computer storage. A more compact implementation uses minimum cost flow in a network (Rosenbaum 1989). Additionally, there are several extensions of fine balance, including near-fine balance (Yang et al. 2012), fine balance for several covariates via integer programming (Zubizarreta 2012, designmatch R-package), and refined balance (Pimentel et al. 2015, rcbalance R-package). Ruoqi Yu's bigmatch R-package implements fine balance and near-fine balance in very large matching problems.

Value

An expanded, square distance matrix with "extra" treated units for use in optimal pair matching. Any control paired with an "extra" treated unit is discarded, as are the "extra" treated units.
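The structure of the returned matrix, and the step of discarding controls paired with "extra" rows, can be sketched in base R as follows. This is an illustration under stated assumptions, not the package's source: expand_for_fine_balance() and all_permutations() are hypothetical helpers, the toy data are invented, and a brute-force search over permutations stands in for a real assignment solver such as optmatch::pairmatch().

```r
# Hypothetical helper: a sketch of the expansion idea, not the DOS source code.
expand_for_fine_balance <- function(dmat, z, f, mult = 100) {
  f <- as.factor(f)
  f_control <- f[z == 0]
  # Surplus of controls over treated units at each level of f.
  surplus <- as.vector(table(f_control) - table(f[z == 1]))
  stopifnot(all(surplus >= 0))
  penalty <- max(dmat) * mult
  # One "extra" row per surplus control, labelled by its level of f.
  extra_levels <- rep(levels(f), times = surplus)
  same_level <- outer(extra_levels, as.character(f_control), "==")
  # Distance 0 to controls at the same level, the penalty otherwise.
  extras <- ifelse(same_level, 0, penalty)
  rbind(dmat, extras)
}

# Hypothetical helper: all orderings of v, one per row (tiny problems only).
all_permutations <- function(v) {
  if (length(v) <= 1) return(matrix(v, nrow = 1))
  do.call(rbind, lapply(seq_along(v), function(i) {
    cbind(v[i], all_permutations(v[-i]))
  }))
}

# Toy data: treated units have f = a, b; controls have f = a, a, b, b, b.
z <- c(1, 1, 0, 0, 0, 0, 0)
f <- c("a", "b", "a", "a", "b", "b", "b")
dmat <- matrix(c(1, 2,  2, 1,  7, 6,  8, 7,  9, 8), nrow = 2)

expanded <- expand_for_fine_balance(dmat, z, f, mult = 2)
dim(expanded)   # 5 x 5: 2 real treated rows plus 3 extra rows

# Brute-force optimal assignment of rows to columns.
perms <- all_permutations(seq_len(ncol(expanded)))
cost <- apply(perms, 1, function(p) sum(expanded[cbind(seq_along(p), p)]))
best <- perms[which.min(cost), ]

n_treated <- sum(z)
matched_controls   <- best[seq_len(n_treated)]    # controls kept in the match
discarded_controls <- best[-seq_len(n_treated)]   # controls paired with extras
```

In this toy problem the unconstrained minimum-distance match would pair both treated units with the two controls at level a. With the extra rows, the cheapest assignment instead keeps controls 1 and 3, one at each level of f, which matches the treated distribution of f.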

Author(s)

Paul R. Rosenbaum

References

Hansen, B. B. (2007). Flexible, optimal matching for observational studies. R News, 7, 18-24. (optmatch package)

Pimentel, S. D., Kelz, R. R., Silber, J. H. and Rosenbaum, P. R. (2015). Large, sparse optimal matching with refined covariate balance in an observational study of the health outcomes produced by new surgeons. Journal of the American Statistical Association, 110, 515-527. Introduces an extension of fine balance called refined balance.

Pimentel, S. D. (2016). Large, Sparse Optimal Matching with R Package rcbalance. Observational Studies, 2, 4-23. An introduction to the rcbalance package.

Pimentel, S. D. (2016). R Package rcbalance. A recommended R package implementing both fine balance and an extension, refined balance.

Rosenbaum, P. R. (1989). Optimal matching for observational studies. Journal of the American Statistical Association, 84(408), 1024-1032. Discusses and illustrates fine balance using minimum cost flow in a network.

Rosenbaum, P. R., Ross, R. N. and Silber, J. H. (2007). Minimum distance matched sampling with fine balance in an observational study of treatment for ovarian cancer. Journal of the American Statistical Association, 102, 75-83. Discusses and illustrates fine balance using optimal assignment.

Rosenbaum, P. R. (2010). Design of Observational Studies. New York: Springer. The method and example are discussed in Chapter 10.

Yang, D., Small, D. S., Silber, J. H. and Rosenbaum, P. R. (2012). Optimal matching with minimal deviation from fine balance in a study of obesity and surgical outcomes. Biometrics, 68, 628-636. Extension of fine balance useful when fine balance is infeasible. Comes as close as possible to fine balance. Implemented in Pimentel's rcbalance package.

Yu, Ruoqi (2018). R package bigmatch.

Zubizarreta, J. R., Reinke, C. E., Kelz, R. R., Silber, J. H. and Rosenbaum, P. R. (2011). Matching for several sparse nominal variables in a case-control study of readmission following surgery. The American Statistician, 65(4), 229-238. Combines near-exact matching with fine balance for the same covariate.

Zubizarreta, J. R. (2012). Using mixed integer programming for matching in an observational study of kidney failure after surgery. Journal of the American Statistical Association, 107, 1360-1371. Extends the concept of fine balance using integer programming. Implemented in R in the designmatch package.

Zubizarreta, J. R. designmatch. A recommended R package for fine balance using integer programming.

Examples

library(DOS)
data(costa)
z<-1*(costa$welder=="Y")
aa<-1*(costa$race=="A")
smoker<-1*(costa$smoker=="Y")
age<-costa$age
x<-cbind(age,aa,smoker)
dmat<-mahal(z,x)
# Mahalanobis distances
round(dmat[,1:6],2) # Compare with Table 8.5 in Design of Observational Studies (2010)
# Impose propensity score calipers
prop<-glm(z~age+aa+smoker,family=binomial)$fitted.values # propensity score
# Mahalanobis distances penalized for violations of a propensity score caliper.
# This version is used for numerical work.
dmat<-addcaliper(dmat,z,prop,caliper=.5)
round(dmat[,1:6],2) # Compare with Table 8.5 in Design of Observational Studies (2010)
# Because dmat already contains large penalties, we set mult=1.
dmat<-fine(dmat,z,aa,mult=1)
dmat[,1:6] # Compare with Table 10.1 in Design of Observational Studies (2010)
dim(dmat) # dmat has been expanded to be square by adding 5 extras, here numbered 48:52
# Any control matched to an extra is discarded.
## Not run: 
# Find the minimum distance match within propensity score calipers.
optmatch::pairmatch(dmat)
# Any control matched to an extra is discarded.  For instance, the optimal match paired
# extra row 48 with the real control in column 7 to form matched set 1.22, so that control
# is not part of the matched sample.  The harmless warning message from pairmatch
# reflects the divergence between the costa data.frame and expanded distance matrix.

## End(Not run)
# Conceptual versions with infinite distances for violations of propensity caliper.
dmat[dmat>20]<-Inf
round(dmat[,1:6],2) # Compare with Table 10.1 in Design of Observational Studies (2010)

Example output

       1     2     3     4     5     6
27  8.65 12.82  6.15  0.08  2.95  4.02
28  7.84  7.40  4.41  0.33  0.74  1.31
29 13.33 12.21  0.51  4.26  5.05  5.70
30  6.51 28.55 12.29 11.42 16.20 17.65
31 13.85 15.80  1.66  4.08  6.51  7.49
32 13.33 12.21  0.51  4.26  5.05  5.70
33 13.95 26.58 13.18  3.47 10.85 12.82
34 13.46  9.27  0.02  5.09  4.24  4.56
35 13.33 12.21  0.51  4.26  5.05  5.70
36  0.51 19.40 14.89  7.85 10.20 11.17
37 13.31 10.65  0.18  4.59  4.56  5.05
38  9.24 14.95  7.06  0.33  4.02  5.25
39  9.60 16.08  7.57  0.51  4.61  5.93
40  8.92 13.87  6.58  0.18  3.47  4.61
41 13.33 12.21  0.51  4.26  5.05  5.70
42 10.00 17.25  8.13  0.74  5.25  6.65
43 13.85 15.80  1.66  4.08  6.51  7.49
44  9.41  2.05  4.56  3.47  0.18  0.02
45 13.40 13.04  0.74  4.15  5.35  6.08
46  8.92 13.87  6.58  0.18  3.47  4.61
47 13.40 13.04  0.74  4.15  5.35  6.08
        1      2      3      4      5      6
27 237.66 293.68   6.15   0.08 141.69 171.85
28 115.36 166.78  50.06   0.33  17.98  47.65
29 358.16 408.89  20.23  75.94 259.61 289.35
30 287.20 361.09  12.29  18.96 206.61 237.16
31 439.02 492.83 101.71 156.10 341.41 371.48
32 358.16 408.89  20.23  75.94 259.61 289.35
33 467.74 532.22 141.84 184.10 374.36 405.42
34 273.83 321.48   0.02   5.09 174.33 203.74
35 358.16 408.89  20.23  75.94 259.61 289.35
36   0.51  45.85 193.46 134.45  10.20  11.17
37 316.15 365.34   0.18  34.28 217.12 246.70
38 280.67 338.23   7.06   0.33 185.17 215.50
39 302.27 360.61   7.57  20.03 207.01 237.42
40 259.11 315.90   6.58   0.18 163.37 193.61
41 358.16 408.89  20.23  75.94 259.61 289.35
42 323.84 382.94   8.13  41.42 228.81 259.30
43 439.02 492.83 101.71 156.10 341.41 371.48
44   9.41  15.61 196.03 142.97   0.18   0.02
45 378.88 430.37  41.10  96.48 280.55 310.38
46 259.11 315.90   6.58   0.18 163.37 193.61
47 378.88 430.37  41.10  96.48 280.55 310.38
             1         2            3            4           5            6
27 237.6551536 293.68070   6.14895280   0.08204232 141.6866187 171.84554187
28 115.3635567 166.77795  50.06077475   0.32816928  17.9835210  47.65019022
29 358.1646162 408.89343  20.23035687  75.94463291 259.6074752 289.34976423
30 287.1996178 361.09173  12.29407928  18.95822937 206.6144194 237.15537804
31 439.0223767 492.82529 101.71379365 156.10254755 341.4062362 371.47669455
32 358.1646162 408.89343  20.23035687  75.94463291 259.6074752 289.34976423
33 467.7369256 532.21625 141.84286883 184.09780178 374.3561422 405.41753090
34 273.8275549 321.48227   0.02051058   5.09327945 174.3294133 203.74353308
35 358.1646162 408.89343  20.23035687  75.94463291 259.6074752 289.34976423
36   0.5127645  45.85359 193.46300518 134.45437259  10.2044862  11.16724044
37 316.1468237 365.33859   0.18459522  34.27676332 217.1191824 246.69738679
38 280.6718198 338.23441   7.05784083   0.32816928 185.1737852 215.49679300
39 302.2741113 360.60523   7.57381658  20.03467917 207.0113268 237.41637690
40 259.1066347 315.90070   6.58288623   0.18459522 163.3733499 193.61431541
41 358.1646162 408.89343  20.23035687  75.94463291 259.6074752 289.34976423
42 323.8369677 382.93661   8.13081350  41.42257413 228.8094333 259.29652578
43 439.0223767 492.82529 101.71379365 156.10254755 341.4062362 371.47669455
44   9.4058167  15.61183 196.02932375 142.96609199   0.1845952   0.02051058
45 378.8764115 430.37375  41.09857119  96.48146670 280.5545205 310.37885194
46 259.1066347 315.90070   6.58288623   0.18459522 163.3733499 193.61431541
47 378.8764115 430.37375  41.09857119  96.48146670 280.5545205 310.37885194
48 532.2162468   0.00000   0.00000000   0.00000000   0.0000000   0.00000000
49 532.2162468   0.00000   0.00000000   0.00000000   0.0000000   0.00000000
50   0.0000000 532.21625 532.21624684 532.21624684 532.2162468 532.21624684
51   0.0000000 532.21625 532.21624684 532.21624684 532.2162468 532.21624684
52   0.0000000 532.21625 532.21624684 532.21624684 532.2162468 532.21624684
[1] 26 26
  27   28   29   30   31   32   33   34   35   36   37   38   39   40   41   42 
 1.1  1.2  1.3  1.4  1.5  1.6  1.7  1.8  1.9 1.10 1.11 1.12 1.13 1.14 1.15 1.16 
  43   44   45   46   47   48   49   50   51   52    1    2    3    4    5    6 
1.17 1.18 1.19 1.20 1.21 1.22 1.23 1.24 1.25 1.26 1.24 1.23 1.11 1.13 1.10 1.18 
   7    8    9   10   11   12   13   14   15   16   17   18   19   20   21   22 
1.22  1.2 1.16 1.19 1.14  1.6 1.12  1.4 1.21 1.25 1.26 1.15  1.1  1.9 1.17  1.3 
  23   24   25   26 
 1.8 1.20  1.7  1.5 
Warning message:
In fullmatch(x = x, min.controls = controls, max.controls = controls,  :
  Without 'data' argument the order of the match is not guaranteed
    to be the same as your original data.
      1     2     3     4     5     6
27  Inf   Inf  6.15  0.08   Inf   Inf
28  Inf   Inf   Inf  0.33 17.98   Inf
29  Inf   Inf   Inf   Inf   Inf   Inf
30  Inf   Inf 12.29 18.96   Inf   Inf
31  Inf   Inf   Inf   Inf   Inf   Inf
32  Inf   Inf   Inf   Inf   Inf   Inf
33  Inf   Inf   Inf   Inf   Inf   Inf
34  Inf   Inf  0.02  5.09   Inf   Inf
35  Inf   Inf   Inf   Inf   Inf   Inf
36 0.51   Inf   Inf   Inf 10.20 11.17
37  Inf   Inf  0.18   Inf   Inf   Inf
38  Inf   Inf  7.06  0.33   Inf   Inf
39  Inf   Inf  7.57   Inf   Inf   Inf
40  Inf   Inf  6.58  0.18   Inf   Inf
41  Inf   Inf   Inf   Inf   Inf   Inf
42  Inf   Inf  8.13   Inf   Inf   Inf
43  Inf   Inf   Inf   Inf   Inf   Inf
44 9.41 15.61   Inf   Inf  0.18  0.02
45  Inf   Inf   Inf   Inf   Inf   Inf
46  Inf   Inf  6.58  0.18   Inf   Inf
47  Inf   Inf   Inf   Inf   Inf   Inf
48  Inf  0.00  0.00  0.00  0.00  0.00
49  Inf  0.00  0.00  0.00  0.00  0.00
50 0.00   Inf   Inf   Inf   Inf   Inf
51 0.00   Inf   Inf   Inf   Inf   Inf
52 0.00   Inf   Inf   Inf   Inf   Inf

DOS documentation built on May 1, 2019, 10:32 p.m.
