# GPSCDF: Generalized Propensity Score Cumulative Distribution Function

## Description

`GPSCDF` takes a generalized propensity score (GPS) object of length greater than 2 and returns the GPS-CDF balancing score.

## Usage

```
GPSCDF(pscores = NULL, data = NULL, trt = NULL, stratify = FALSE,
       nstrat = 5, optimal = FALSE, greedy = FALSE, ordinal = FALSE,
       multinomial = FALSE, caliper = NULL)
```

## Arguments

- `pscores`: The object containing the treatment-ordered generalized propensity scores for each subject.
- `data`: An optional data frame to which the calculated balancing score is attached. The data frame is also used in stratification and matching.
- `trt`: An optional object containing the treatment variable.
- `stratify`: Option to produce strata based on the power parameter (`ppar`). Default is `FALSE`.
- `nstrat`: An optional parameter for the number of strata to be created when `stratify` is set to `TRUE`. Default is `5` strata.
- `optimal`: Option to perform optimal matching of subjects based on the power parameter (`ppar`). Default is `FALSE`.
- `greedy`: Option to perform greedy matching of subjects based on the power parameter (`ppar`). Default is `FALSE`.
- `ordinal`: Specifies ordinal treatment groups for matching. Subjects are matched based on the ratio of the squared difference of power parameters for two subjects, `ppar_i` and `ppar_j`, in the numerator and the squared difference in observed treatment received, `trt_i` and `trt_j`, in the denominator: `(ppar_i-ppar_j)^2/(trt_i-trt_j)^2`. Default is `FALSE`.
- `multinomial`: Specifies multinomial treatment groups for matching. Subjects are matched based on the absolute difference of power parameters for two subjects, `ppar_i` and `ppar_j`, who received different treatments: `|ppar_i - ppar_j|`. Default is `FALSE`.
- `caliper`: An optional parameter for the caliper value used when performing greedy matching. Used when `greedy` is set to `TRUE`. Default is `.25*sd(ppar)`.
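The two matching distances described for `ordinal` and `multinomial` can be written out directly. The base R sketch below is an illustration only, not the package's internal code: `pair_distance()` is a hypothetical helper, and treating same-treatment pairs as ineligible (distance `Inf`) is an assumption made here so that the ordinal ratio is always defined.

```r
# Hypothetical helper (not part of GPSCDF): pairwise matching distances
# following the formulas given for the `ordinal` and `multinomial` arguments.
pair_distance <- function(ppar, trt, type = c("ordinal", "multinomial")) {
  type <- match.arg(type)
  dp <- outer(ppar, ppar, "-")      # ppar_i - ppar_j
  same <- outer(trt, trt, "==")     # TRUE where both subjects share a treatment
  d <- if (type == "ordinal") {
    dp^2 / outer(trt, trt, "-")^2   # requires numeric treatment levels
  } else {
    abs(dp)
  }
  # Assumption: pairs with the same treatment are not eligible matches.
  d[same] <- Inf
  d
}

ppar <- c(0.5, 1.0, 2.0)  # toy power parameters
trt  <- c(1, 2, 3)        # toy treatment levels

d_ord <- pair_distance(ppar, trt, "ordinal")
d_mul <- pair_distance(ppar, trt, "multinomial")

# Default caliper described for greedy matching: .25 * sd(ppar)
caliper <- 0.25 * sd(ppar)
```

In the ordinal distance, pairs whose treatments are far apart are penalized less for a given gap in `ppar`, whereas the multinomial distance ignores how far apart the treatment labels are.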

## Details

The `GPSCDF` method is used to conduct propensity score matching and stratification for both ordinal and multinomial treatments. The method maps any GPS vector (with length >2) directly to a single scalar value that can be used to produce either average treatment effect (ATE) or average treatment effect among the treated (ATT) estimates. In the multinomial setting with `K` treatments, the balance achieved under each of the `K!` orderings of the GPS vector should be assessed to find the optimal ordering (see Examples for more details).
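For larger `K`, writing out every ordering by hand becomes tedious. The following base R sketch enumerates all `K!` column orderings of a GPS matrix; `permutations()` and the toy `gps_demo` matrix are hypothetical names introduced for illustration and are not part of the package.

```r
# Hypothetical helper (not part of GPSCDF): all permutations of a vector,
# returned as the rows of a matrix.
permutations <- function(x) {
  if (length(x) <= 1) return(matrix(x, nrow = 1))
  do.call(rbind, lapply(seq_along(x), function(i) {
    cbind(x[i], permutations(x[-i]))
  }))
}

K <- 3
orderings <- permutations(seq_len(K))  # K! rows, one ordering per row

# Toy GPS matrix: each row is one subject's GPS vector and sums to 1.
gps_demo <- rbind(c(0.2, 0.3, 0.5),
                  c(0.6, 0.1, 0.3))

# One column-reordered GPS matrix per ordering.
gps_by_ordering <- lapply(seq_len(nrow(orderings)), function(r) {
  gps_demo[, orderings[r, ], drop = FALSE]
})
```

Each element of `gps_by_ordering` could then be supplied as `pscores`, with covariate balance compared across the resulting fits.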

## Value

- `ppar`: The power parameter scalar balancing score to be used in outcome analyses through stratification or matching.
- `data`: The user-defined dataset with power parameter (`ppar`), strata, and/or optimal matching variables attached.
- `nstrat`: The number of strata used for stratification.
- `strata`: The strata produced based on the calculated power parameter (`ppar`).
- `optmatch`: The optimal matches produced based on the calculated power parameter (`ppar`).
- `optdistance`: The average absolute total distance of power parameters (`ppars`) for optimally matched pairs.
- `caliper`: The caliper value used for greedy matching.
- `grddata`: The user-defined dataset with greedy matching variable attached.
- `grdmatch`: The greedy matches produced based on the calculated power parameter (`ppar`).
- `grdydistance`: The average absolute total distance of power parameters (`ppars`) for greedy matched pairs.
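As an illustration of how a scalar balancing score can be turned into strata, the sketch below cuts a simulated score at its empirical quantiles. This is an assumption made for illustration: the exact rule `GPSCDF` uses to form `strata` is not stated here, and `ppar_demo` and `strata_demo` are hypothetical names.

```r
set.seed(1)
ppar_demo <- rexp(100)  # simulated stand-in for a ppar vector
nstrat <- 5

# Assumed rule: equal-probability cut points at the empirical quantiles.
breaks <- quantile(ppar_demo, probs = seq(0, 1, length.out = nstrat + 1))
strata_demo <- cut(ppar_demo, breaks = breaks,
                   include.lowest = TRUE, labels = FALSE)

table(strata_demo)  # number of subjects per stratum
```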

## Author(s)

Derek W. Brown, Thomas J. Greene, Stacia M. DeSantis

## References

Greene, TJ. (2017). Utilizing Propensity Score Methods for Ordinal Treatments and Prehospital Trauma Studies. Texas Medical Center Dissertations (via ProQuest).

## Examples

```
### Example: Create data example

library(GPSCDF)

N <- 100

set.seed(18201) # make sure data is repeatable
Sigma <- matrix(.2, 4, 4)
diag(Sigma) <- 1
data <- matrix(0, nrow = N, ncol = 6,
               dimnames = list(c(1:N), c("Y", "trt", paste("X", c(1:4), sep = ""))))
data[, 3:6] <- matrix(MASS::mvrnorm(N, mu = rep(0, 4), Sigma, empirical = FALSE),
                      nrow = N, ncol = 4)

dat <- as.data.frame(data)

# Create Treatment Variable
tlogits <- matrix(0, nrow = N, ncol = 2)
tprobs <- matrix(0, nrow = N, ncol = 3)

alphas <- c(0.25, 0.3)
strongbetas <- c(0.7, 0.4)
modbetas <- c(0.2, 0.3)

for (j in 1:2) {
  tlogits[, j] <- alphas[j] + strongbetas[j] * dat$X1 + strongbetas[j] * dat$X2 +
    modbetas[j] * dat$X3 + modbetas[j] * dat$X4
}

for (j in 1:2) {
  tprobs[, j] <- exp(tlogits[, j]) / (1 + exp(tlogits[, 1]) + exp(tlogits[, 2]))
  tprobs[, 3] <- 1 / (1 + exp(tlogits[, 1]) + exp(tlogits[, 2]))
}

set.seed(91187)
for (j in 1:N) {
  data[j, 2] <- sample(c(1:3), size = 1, prob = tprobs[j, ])
}

# Create Outcome Variable
ylogits <- matrix(0, nrow = N, ncol = 1,
                  dimnames = list(c(1:N), c("Logit(P(Y=1))")))
yprobs <- matrix(0, nrow = N, ncol = 2,
                 dimnames = list(c(1:N), c("P(Y=0)", "P(Y=1)")))
for (j in 1:N) {
  ylogits[j, 1] <- -1.1 + 0.7 * data[j, 2] + 0.6 * dat$X1[j] + 0.6 * dat$X2[j] +
    0.4 * dat$X3[j] + 0.4 * dat$X4[j]
  yprobs[j, 2] <- 1 / (1 + exp(-ylogits[j, 1]))
  yprobs[j, 1] <- 1 - yprobs[j, 2]
}

set.seed(91187)
for (j in 1:N) {
  data[j, 1] <- sample(c(0, 1), size = 1, prob = yprobs[j, ])
}

dat <- as.data.frame(data)

### Example: Using GPSCDF

# Create the generalized propensity score (GPS) vector using any parametric or
# nonparametric model
glm <- nnet::multinom(as.factor(trt) ~ X1 + X2 + X3 + X4, data = dat)
probab <- round(predict(glm, newdata = dat, type = "probs"), digits = 8)
gps <- cbind(probab[, 1], probab[, 2], 1 - probab[, 1] - probab[, 2])

# Create scalar balancing power parameter
fit <- GPSCDF(pscores = gps)

## Not run:
fit$ppar
## End(Not run)

# Attach scalar balancing power parameter to user defined data set
fit2 <- GPSCDF(pscores = gps, data = dat)

## Not run:
fit2$ppar
fit2$data
## End(Not run)

### Example: Ordinal Treatment

# Stratification
fit3 <- GPSCDF(pscores = gps, data = dat, stratify = TRUE, nstrat = 5)

## Not run:
fit3$ppar
fit3$data
fit3$nstrat
fit3$strata

library(survival)
model1 <- survival::clogit(Y ~ as.factor(trt) + X1 + X2 + X3 + X4 + strata(strata),
                           data = fit3$data)
summary(model1)
## End(Not run)

# Optimal Matching
fit4 <- GPSCDF(pscores = gps, data = dat, trt = dat$trt, optimal = TRUE,
               ordinal = TRUE)

## Not run:
fit4$ppar
fit4$data
fit4$optmatch
fit4$optdistance

library(survival)
model2 <- survival::clogit(Y ~ as.factor(trt) + X1 + X2 + X3 + X4 + strata(optmatch),
                           data = fit4$data)
summary(model2)
## End(Not run)

# Greedy Matching
fit5 <- GPSCDF(pscores = gps, data = dat, trt = dat$trt, greedy = TRUE,
               ordinal = TRUE)

## Not run:
fit5$ppar
fit5$data
fit5$caliper
fit5$grddata
fit5$grdmatch
fit5$grdydistance

library(survival)
model3 <- survival::clogit(Y ~ as.factor(trt) + X1 + X2 + X3 + X4 + strata(grdmatch),
                           data = fit5$grddata)
summary(model3)
## End(Not run)

### Example: Multinomial Treatment

# Create all K! orderings of the GPS vector
gps1 <- cbind(gps[, 1], gps[, 2], gps[, 3])
gps2 <- cbind(gps[, 1], gps[, 3], gps[, 2])
gps3 <- cbind(gps[, 2], gps[, 1], gps[, 3])
gps4 <- cbind(gps[, 2], gps[, 3], gps[, 1])
gps5 <- cbind(gps[, 3], gps[, 1], gps[, 2])
gps6 <- cbind(gps[, 3], gps[, 2], gps[, 1])

gpsarry <- array(c(gps1, gps2, gps3, gps4, gps5, gps6), dim = c(N, 3, 6))

# Create scalar balancing power parameters for each ordering of the GPS vector
fit6 <- matrix(0, nrow = N, ncol = 6,
               dimnames = list(c(1:N), c("ppar1", "ppar2", "ppar3",
                                         "ppar4", "ppar5", "ppar6")))

## Not run:
for (i in 1:6) {
  fit6[, i] <- GPSCDF(pscores = gpsarry[, , i])$ppar
}

fit6

# Perform analyses (similar to ordinal examples) using each K! ordering of the
# GPS vector. Select ordering which achieves optimal covariate balance
# (i.e. minimal standardized mean difference).
## End(Not run)
```
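The final comment in the examples recommends choosing the ordering with the smallest standardized mean difference (SMD) but leaves the computation to the reader. A minimal sketch follows, assuming the common pooled-variance definition of the SMD and summarizing balance by the largest pairwise SMD across treatment groups; `smd()` and `max_pairwise_smd()` are hypothetical helpers, not functions exported by the package, and the toy vectors stand in for a covariate and treatment within one stratum or matched sample.

```r
# Hypothetical helper: absolute SMD of covariate x between two groups,
# using the pooled-variance denominator sqrt((var1 + var2) / 2).
smd <- function(x, g) {
  g <- factor(g)
  stopifnot(nlevels(g) == 2)
  m <- tapply(x, g, mean)
  v <- tapply(x, g, var)
  as.numeric(abs(m[1] - m[2]) / sqrt((v[1] + v[2]) / 2))
}

# Hypothetical helper: largest SMD over all pairs of treatment groups.
max_pairwise_smd <- function(x, trt) {
  trt <- as.character(trt)
  pairs <- combn(sort(unique(trt)), 2)
  max(apply(pairs, 2, function(p) {
    keep <- trt %in% p
    smd(x[keep], trt[keep])
  }))
}

# Toy covariate and three-level treatment.
x_demo   <- c(1, 2, 3, 4, 5, 6)
trt_demo <- c(1, 1, 2, 2, 3, 3)
max_pairwise_smd(x_demo, trt_demo)
```

Repeating this for each covariate under each of the `K!` orderings gives one balance summary per ordering, and the ordering with the smallest value would be the one to carry forward.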

GPSCDF documentation built on May 2, 2019, 9:26 a.m.