In Ngendahimana/sensitivityR5: This package supports implementation of sensitivity analyisis using objects from Match, MatchIt and bmatch

knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)

library(tidyr);library(ggplot2);library(designmatch);library(MatchIt);library(Matching);library(sensitivitymv);library(sensitivitymw);library(knitr);library(cobalt);library(dplyr)

Introduction.

Over the past few decades, technology has made it easier for researchers to access variety of observational data. Such data include information collected from patient medical records, national surveys et al. Availability of this data provides opportunities for exploring important research questions, however to do so one has be careful in designing studies based on these data. A common use of observation study is in estimating treatment effect. This typically involves defining a variable to measured exposure and then using it to identify a treatment and control group. Since the gold standard for estimating a treatment effect is to use data collected from a randomized clinical trial, observational data has to be preprocessed using observational study design algorithms to ensure data used to make treatment effect is as similar as possible to data that would have been obtained from a clinical trial.

One of the assumptions made when estimating treatment effect from observational studies is what is called the the ignorabiltiy assumption,that is, only the observed covariates are sufficient to explain any meaningful differences between the treatment and control groups. The consequence of this assumption is that if the groups being compared are similar on average, on these observed covariates , then any differences in the outcome being studied is due to treatment- the only distinguishing factor between the groups being compared. The design phase of an observational study is usually focused on achieving the best possible balance of observed covariates between the treated and control group. Several design algorithms have been proposed and R packages that implement them are widely available. In his 2008 paper , Stuart et al provides a detailed description of the major design algorithms and R packages that implement them.

When designing an observation study, its recommended that one try different matching algorithms (i.e.solutions) and pick one that gives the best balance. With various matching algorithms housed in different R packages , for sometime , it was hard to compare across design methods. To easy this process, Greifer et al developed the cobalt package that interacts seamlessly with objects from Matching and MacthIt packages and generate key balance summary statistics. However, it is often not trivial to identity the a matching algorthm to use especially when more than one of those algorithm differ marginally on key balance statistics.

The main critique of observational studies that use matched sampling design methods is that treatment effect estimates obtained are premised on the assumption that observed variables are sufficient to explain the differences between treated and control group. To deal with this criticism , Rosenbaum developed a sensitivity analysis framework that , if a significant treatment effect exists, computes a parameter to quantify a bias magnitude due to unmeasured confounders that would be required to nullify significant results. Presenting results of these sensitivity analysis allows the reader to make an informed decision on how much faith to place in your findings. A significant portion of sensitivity methods developed by Rosenbaum has been implemented in the rboundspackage (Keele,2008). However rbounds functions only interact with objects from the Matching package. The package does not also implement a sensitivity analysis method for time to event outcomes. To perform sensitivity analysis involving matched sets, Rosenbaum developed two packages sensitivitymv and sensitivitymw(Rosenbaum,2013). These two packages also provide more robust test statistic that increase the power of sensitivity. However, to perform sensitivity analysis using these packages require reshaping your data set to conform to a certain format.

Why use this package ?

Select a matched sampling design most robust to unmeasured selection bias from a set of competitive designs.

Recent advancements in computer technology has led to new matched sampling algorithms currently implemented in R. Among the most recent include cardinality and genetic matching. Others that interact with our package such as optimal and nearest neighbor matching are commonly used. When used to to produced matched samples for the same study, the quality of matches produced by each of this algorithms differ primarirly due to the fact that when forming matches each of this algorithms seeks to optimize different objective functions. This difference in quality of matches produced impacts the design of a study in terms of its overall robustness to unmeasured selection bais. Specifically, designs produced by these algorithms differ in terms of their robustness to hidden bias.The workhorse function of this package automates a selection strategy to guide selection of the most robust design from a set of designs based on various competitive algorithms.

Easy sensitivity analyisis when matched sampling is not necessary done with `Matching` package.

Although there are several R packages that implement different matched sampling algorithms, there is no packages that interacts universally with these matching package to implement sensitivity to hidden bias analysis. Currently to perform sensitivity analysis to hidden bias using objects from some matching packages such MatchiIt or designmatch can be time consuming and complicated depending on ones confort with R programming. The rbounds package which is commonly used to perfom sensitivity analysis currently accepts objects from Matching package only. This package gives the user the freedom to perform sensitivity analysis using output from any matching package by simply passing the object to a dedicated function of this package.

Sensitivity analysis for matched sets from any package.

Unlike rbounds which implements sensitivity analysis for matched pairs only, this package contains functions which accept objects from various matching packages in case matched sets rather than matched pair are desirable for a given design. Specific functions in this package transforms and performs sensitivity analysis using these objects from different matching packages. Using this package can save time required to transform objects from matching packages into valid inputs for sensitivitymv or sensitivitymw, the two packages that perform sensitivity analysis for matched sets.

Balance and Sensitivity Analysis Plots.

Most matching packages contain plots and tables useful in judging quality of matches obtained after matching, the most common being the love plot (Thomas Love). This package adds to these tools by providing sensitivity analysis plots(Rosenbaum ,2013) and plots to assess balance recommendations advocated by Rubin. These plots are meant to help visualize and interpret findings.

Using the package.

This package contains four main functions, ds_function , pens2(), binarysens2, t2event(),rubinrules(). Other auxilliary functions in the package include ampPlot() and eda(). In the sections that follow , a detailed description of how each fucntion can be used is given. It is important to note that this package does not replace the highly sophisticated matching and sensitivity tools in the packages described above. What it seeks to accomplish is to bridge the most common matching packages to various sensitivity packages where such connections do not exist and would require more work on the part of the analyst. Currently this package is housed on github therefore to install it use the following code:

library(devtools)
devtools::install_github("Ngendahimana/SensitivityR5")
library(SensitivityR5)

`ds_function`

This function accepts matching objects from MatchIt package which implements various matching aglorithms such optimal, full and nearest neighbour matching. It also accepts objects from the designmatch that implements cardinality matching. Using a list of objects passed to it ds_function estimates the design sensitivity parameter to appraise matching algorithms with a portion of the matched samples produced by algorithms. Rosenbaum details the computation of design sensitivity which are implemented by this function.

To illustrate the use of this function, we simulate a dataset

rubinRules

rubinRules() is a function that can be use to evaluate the three balance meusures suggested by Rubin (2001) and are based on the theory published in another paper by Rubin and Thomas (2006). Rubin Provides three guidelines to ensure that the results of regression adjustment - often conducted in the analysis phase of observational studies to clear any residual imbalances - are valid. The three guidelines are:

Rule 1: The absolute standardized difference of means of the propensity score should be less than 0.25.
Rule 2: The ratio of the variances of the propensity score in the treated and control groups should be between 0.5 and 2
Rule 3: For each covariate, the ratio of the variance of the residuals orthogonal to the propensity score in the treated and control groups should be between 0.5 and 2.

An example of how to use the rubinRules function to assess these guidelines is illustrated below

data("toy",package = "SensitivityR5")
psmodel <- glm(treated ~ covA + covB + covC + covD + covE + covF + Asqr + BC + BD, family=binomial(), data=toy)
toy$ps <- psmodel$fitted
toy$linps <- psmodel$linear.predictors
covlist1=c('covA', 'covB', 'covC', 'covD', 'covE', 'covF.Middle', 'covF.High', 'Asqr','BC', 'BD')
k =rubinRules2(data=toy,Treatment='treated',covlist=covlist1)
k$plot

The first two Rubin balance meausures are displayed at the top of the plot while the third balance measure which consists of series of residual variance ratios for variable are displayed inside the dot plot. The blue lines indicate acceptable range for the residual variance ratios. Ratio outside region bounded by the red lines are considered unacceptable. This plot is a ggplot object and can be modified like any ggplot graph. For example to modify the plot title and background, the following code could be executed:

k$plot+theme_bw()+labs(title = "Assessing Rubin (2010) Balance Measures")

Various items can be called off of the list (k) using conventional R methods. For example to obtain a dataset of residual variance ratio , you can do the following.

k$RUBIN3

Love Plot

Although Love plot for assessing standardized balance measure for objects produced by Matching and MatchIt has already been implemented in Cobalt package through the Love.plot function, this function cannot interact with designmatch object directly. On the other hand the love plot object currently implemented in designmatch package is not a ggplot objects and therefore makes it hard to customize as desired. The Love plot function in this package allows you to generate ggplot Love figures using objects from any of the three matching packages including designmatch package.

The code chunk below shows how to accomplish this.

Using `designmatch` objects

This example uses to design match which implements cardinality matching which balances variables directly rather than using a summary measure to quantify the distance between observations. This matching optimizes on the sample size rather than simply a distance measure like propensity score. designmatchingpackage also enables other more sophisticated matching algorithms such as fine, near-fine balancing, exact matching and near exact matching. These algoritms can be appealing in various circumstances, especially when the researcher wishes to incorporate subject matter understanding of the question at hand.

data("lalonde",package = "designmatch")
attach(lalonde)
## Treatment indicator
t_ind =lalonde$treat

## Distance matrix
dist_mat = NULL

## Subset matching weight.
subset_weight = 1 

# Moment balance: constrain differences in means to be at most .05 standard deviations apart
mom_covs = cbind(age, education, black, hispanic, married, nodegree, re74, re75)
mom_tols = round(absstddif(mom_covs, t_ind, .05), 2)
mom = list(covs = mom_covs, tols = mom_tols)

## Fine balance
#fine_covs = cbind(black, hispan, married, nodegree)
#fine = list(covs = fine_covs)

## Exact matching
#exact_covs = cbind(black)
#exact = list(covs = exact_covs)

## Solver options
t_max = 60*5
solver = "glpk"
approximate = 1
solver = list(name = solver, t_max = t_max, approximate = approximate,round_cplex = 0, trace = 0)

## Cardinality matching
out = bmatch(t_ind = t_ind, dist_mat = dist_mat, subset_weight = subset_weight, mom = mom,  solver = solver)

# Indices of the treated units and matched controls
t_id = out$t_id
c_id = out$c_id
#detach(lalonde)

detach(lalonde)

To assess balance with love_plot function in this package after cardinality matching you would need to execute the following code

p = love_plot(X =out, data = lalonde , covList=c("age", "education", "black", "hispanic", "married", "nodegree", "re74", "re75"))

p

Note that out is an object from the designmatch package that implements cardinality matching.

unlike the existing love plot in designmatch which does not allow one to modify its appearence, this plot can be enhanced using ggplot2 features. For example we can add desired standardized difference threshhold lines , say, from -0.1 to 0.1 and remove the legend title with the following code:

p+theme_bw()+geom_vline(xintercept = -0.1)+ geom_vline(xintercept = 0.1) +theme(legend.title = element_blank())

Object from MatchIt and Matching packages can also be directly used with the love_plot function. You can also use the love plot function to extrac other important balance measures. We illustrate this using optimal matching functionality within the MatchiItpackage...

Sensitivity Analysis.

As mentioned earlier, one of the aims of this package is to facilitate seamless transition between analysis of an observational study to perfoming sensitivity analysis. In the section that follow we illustrate how to use functions within this package to accomplish this task.

`pens2` Function.

This function is meant to facilitate sensitivity analysis for continous outcomes. There is a similar function in rbounds package that implements this method howevever rbounds interacts only with Matching package objects which implements limited methods of matched sampling.

Using lalonde dataset , the code below illustrates how to do sensitivity analysis after implementing an optimal matching algorithm.

library(Matching);library(MatchIt)
data("lalonde",package = "Matching")

## Sensitivity analysis with a matchit object
m.out = matchit(treat ~ age  + educ +  black + hisp +married + nodegr + re74  + re75  +
u74 + u75, family=binomial, data = lalonde, method = "optimal")

## Estimating treatment effect. Ideally, balance assessement should be done prior to estimating treatment effect
mod = lm(re78~age  + educ +  black + hisp +married + nodegr + re74  + re75  +u74 + u75,data = match.data(m.out))

## Sensitivity analysis
sens.out =pens2(x = m.out, y="re78",Gamma = 2, GammaInc = 0.1,est = 629.7)

Note that when using MatchIt objects you have to specify the effect estimate or put -1 if there a negative effect estimate and 1 otherwise. After executing this code, you can access a table of sensitivity parameters and their respective Rosenbaum bounds by running the code below:.

kable(sens.out$bounds)

The other advantage of using this package is that it allows you to used objects processed by designmatch package to run sensitivity analysis. This especially important when one wants to implemented more sophisticated matching methods that are not available with either MatchtIt or Matching package such allowing for near fine matching or fine matching. For example to run a sensitivity analysis using the design match object out created earlier on and obtain Rosenbaum sensitivity bounds, the following code would be used:

data("lalonde",package = "designmatch")
attach(lalonde)
## Treatment indicator
t_ind = treatment
## Distance matrix
dist_mat = NULL
## Subset matching weight
subset_weight = 1
## Moment balance: constrain differences in means to be at most .05 standard deviations apart
mom_covs = cbind(age, education, black, hispanic, married, nodegree, re74, re75)
mom_tols = round(absstddif(mom_covs, t_ind, .05), 2)
mom = list(covs = mom_covs, tols = mom_tols)
## Fine balance
fine_covs = cbind(black, hispanic, married, nodegree)
fine = list(covs = fine_covs)
## Exact matching
exact_covs = cbind(black) 
exact = list(covs = exact_covs)
## Solver options
t_max = 60*5
solver = "glpk"
approximate = 1
solver = list(name = solver, t_max = t_max, approximate = approximate,round_cplex = 0, trace = 0)
## Match
out = bmatch(t_ind = t_ind, dist_mat = dist_mat, subset_weight = subset_weight,mom = mom, fine = fine, exact = exact, solver = solver)

sens2 = pens2(x = out, y="re78",Gamma = 2, GammaInc = 0.1,est = 234,treat = "treatment",data = lalonde)
kable(sens2$bounds)

detach(lalonde)

`Survsens` function.

At the time of writing this vignette there are no sensitivity package with a function that implements sensitivity analysis for time to event outcomes. Survsens performs this analysis and is largerly based on lecture notes provided by Prof.Thomas Love in his travelling course on design of observational studies. We illustrate how to use this function using made up data called toy that is part of this package. To run sensitivity analysis with this function, you first have to perfom matching using a matching package of your choice and assess quality of your match. For example we can first match using MatchIt package with the following code

data("toy",package = "SensitivityR5")
psmodel <- glm(treated ~ covA + covB + covC + covD + covE + covF + Asqr + BC + BD, family=binomial(), data=toy)
toy$ps <- psmodel$fitted
toy$linps <- psmodel$linear.predictors
X <- toy$linps ## matching on the linear propensity score
Tr <- as.logical(toy$treated)
Y <- toy$out3.time
match1 <- Match(Y=Y, Tr=Tr, X=X, M = 1, replace=FALSE, ties=FALSE)

match.it <- matchit(treated ~ covA + covB + covC + covD + covE + covF + Asqr + BC + BD, data = toy, method='nearest', ratio=1)

After assessing the quality of matches and estimating treatment effect, you can then do a sensitivity analysis with SurvSens function using the code similar to the one below to extract Rosenbaum bounds table.

res_Surv =Survsens(x= match.it,data =toy,exp='treated',outcome = 'out2',failtime = 'out3.time',Gamma=1.2,Gammainterval = 0.01,alpha ,plot_title = 'Time To Event Outcome Sensitivity Plot')

kable(res_Surv$bounds)

`MultiControlSens` function

Currently sensitivitymw is the only package that implements sensitivity analysis for in matched sampling designs that allow matched sets rather than simply matched pair. The MultiControlSens complements the sensitivitymw package by processing objects from various matching packages to allow seamless interactions during sensitivity analysis. For example, using the lalonde data , if we match 2 controls to 1 treated using the optimal option in matchit function , we can perform sensititivity analysis using the following code:

data("lalonde",package ="designmatch")
covs0 = c("age", "education", "black", "hispanic", "married", "nodegree", "re74", "re75")
m.out <- matchit(f.build("treatment", covs0), data = lalonde, method = "nearest", replace = TRUE,ratio = 2)
res1 = multiControlSens(X =m.out,outcomeName = "re78",Gamma = 2,GammaInc = 0.1,n_contrl = 2)

This function returns Rosenbaum sensitivity bounds and a reformated dataset that can be used for other analyses. The first few rows of the data returned by the function are shown below

kable(head(res1$data,5))

Rosenbaum bounds can also be exctracted

kable(res1$pvalues)

From the table above we observe designing an observational study such that 2 control can be matched to a single case then the odds of occurence of an unobserved variable in one exposure group would have to 1.3 the odds of occurence in the other exposure group to nullify any qualitative findings of significance treatment difference between these exposure groups.

`binarysens2` function.

The rbounds package has implimented method developed by Rosenbaum for binary outcomes sensitivity analyis using the binarysens function. In fact , binarysens2 gets most of its coding inspiration from binarysens2. What this function adds on is the ability to do sensitivity analysis using other algorithms that binarysensis currently unable to intaract with. In the sections that follow we illustrate how to use binarysens2 with objects from various packages.

Using `matchIt` objects.

data("GerberGreenImai",package = "Matching")
## Sensitivity analysis with a Match object
m.out = matchit(PHN.C1 ~ PERSONS + VOTE96.1 + NEW +MAJORPTY + AGE + WARD , family=binomial, data = GerberGreenImai, method = "nearest")
mod = lm(VOTED98 ~ PHN.C1+PERSONS + VOTE96.1 + NEW +MAJORPTY + AGE + WARD,data = match.data(m.out))
binarysens2(x=m.out,y ="VOTED98", Gamma=2, GammaInc=.1)

The graph above plots the p-value upper bounds corresponding to various $\Gamma$ parameters. The cordinates displayed on the plot refers tot the maximum bias allowable under a specified significance level ($\alpha$) beyond which the significant findings of true treatment effect can be attributed to unmeasured biases.

`pens2` function

This function is meant to facilitate sensitivity analysis when you have a continous outcome. Again, unlike the current function in rbounds, this package is meant to handle objects from multiple matching algorithms including cardinality matching. In the sections that follow we illustrate how to use it to perform sensitivity analysis with objects from MatchIt and designMatch packages.

with `MatchIt` objects

We use the lalonde data in this illustration. Suppose you perfom an optimal match on observed covariates on, then you would execute the following code.

data("lalonde",package = "Matching")
m.out = matchit(treat ~ age  + educ +  black + hisp +married + nodegr + re74  + re75  +
u74 + u75, family=binomial, data = lalonde, method = "nearest")

Ideally, the next step, which we skip here, would be to assess balance of baseline covariates using tools such as the rubins plot and love plots and finally estimate an effect. To perfom sensitivity analysis with pens2, you then need to pass a matchIt object, the estimated treatment effect, name of the outcome variable and the maximum value of the Rosenbaum sensitivity parameter.

mod = lm(re78~age  + educ +  black + hisp +married + nodegr + re74  + re75  +u74 + u75,data = match.data(m.out))

sens_object = pens2(x = m.out, y="re78",Gamma = 2, GammaInc = 0.1,est = 629.7)

The pens2 function returns a list of items that include a table of bounds for the range of sensitivity parameters passed to the function. To object the table, you can call it from the list as follows

kable(sens_object$bounds)

with `designMatch` objects.

If on the other hand you have chosen to do cardinality matching, which optimizes on sample size under constraints of covariate balance, then you would follow the same steps as in section above. First perfom matching with the designMatch package;

data("lalonde",package = "designmatch")
library(designmatch)
attach(lalonde)
## Treatment indicator
t_ind = treatment
## Distance matrix
dist_mat = NULL
## Subset matching weight
subset_weight = 1
## Moment balance: constrain differences in means to be at most .05 standard deviations apart
mom_covs = cbind(age, education, black, hispanic, married, nodegree, re74, re75)
mom_tols = round(absstddif(mom_covs, t_ind, .05), 2)
mom = list(covs = mom_covs, tols = mom_tols)
## Fine balance
fine_covs = cbind(black, hispanic, married, nodegree)
fine = list(covs = fine_covs)
## Exact matching
exact_covs = cbind(black)
exact = list(covs = exact_covs)
## Solver options
t_max = 60*5
solver = "glpk"
approximate = 1
solver = list(name = solver, t_max = t_max, approximate = approximate,round_cplex = 0, trace = 0)
## Match
out = bmatch(t_ind = t_ind, dist_mat = dist_mat, subset_weight = subset_weight,mom = mom, fine = fine, exact = exact, solver = solver)

Note in this matching scheme, we have specified that we are fine matching on marital status,race e.t.c. These specifications are all optional. After matchin , you can then call the pens2to perform sensitivity analysis.

cardMatch = pens2(x = out, y="re78",Gamma = 2, GammaInc = 0.1,est = 234,treat = "treatment",data = lalonde)

kable(head(cardMatch$bounds,5))
detach(lalonde)

Ngendahimana/sensitivityR5 documentation built on June 24, 2020, 4:09 a.m.

rdrr.io home R language documentation Run R code online

CRAN packages Bioconductor packages R-Forge packages GitHub packages

Note that we can't provide technical support on individual packages. You should contact the package authors for that.

Ngendahimana/sensitivityR5
This package supports implementation of sensitivity analyisis using objects from Match, MatchIt and bmatch

In Ngendahimana/sensitivityR5: This package supports implementation of sensitivity analyisis using objects from Match, MatchIt and bmatch

Introduction.

Why use this package ?

Select a matched sampling design most robust to unmeasured selection bias from a set of competitive designs.

Easy sensitivity analyisis when matched sampling is not necessary done with `Matching` package.

Sensitivity analysis for matched sets from any package.

Balance and Sensitivity Analysis Plots.

Using the package.

`ds_function`

rubinRules

Love Plot

Using `designmatch` objects

Sensitivity Analysis.

`pens2` Function.

`Survsens` function.

`MultiControlSens` function

`binarysens2` function.

Using `matchIt` objects.

`pens2` function

with `MatchIt` objects

with `designMatch` objects.

R Package Documentation

Browse R Packages

We want your feedback!

Ngendahimana/sensitivityR5 This package supports implementation of sensitivity analyisis using objects from Match, MatchIt and bmatch

In Ngendahimana/sensitivityR5: This package supports implementation of sensitivity analyisis using objects from Match, MatchIt and bmatch

Introduction.

Why use this package ?

Select a matched sampling design most robust to unmeasured selection bias from a set of competitive designs.

Easy sensitivity analyisis when matched sampling is not necessary done with Matching package.

Sensitivity analysis for matched sets from any package.

Balance and Sensitivity Analysis Plots.

Using the package.

ds_function

rubinRules

Love Plot

Using designmatch objects

Sensitivity Analysis.

pens2 Function.

Survsens function.

MultiControlSens function

binarysens2 function.

Using matchIt objects.

pens2 function

with MatchIt objects

with designMatch objects.

R Package Documentation

Browse R Packages

We want your feedback!

Ngendahimana/sensitivityR5
This package supports implementation of sensitivity analyisis using objects from Match, MatchIt and bmatch

Easy sensitivity analyisis when matched sampling is not necessary done with `Matching` package.

`ds_function`

Using `designmatch` objects

`pens2` Function.

`Survsens` function.

`MultiControlSens` function

`binarysens2` function.

Using `matchIt` objects.

`pens2` function

with `MatchIt` objects

with `designMatch` objects.