knitr::opts_chunk$set( collapse = TRUE, comment = "#>" )
To install the package, please use:
library(devtools) install_github("doomlab/ViSe")
This installation will give you the latest version of the package, regardless of status on CRAN.
library(ViSe) library(cowplot) library(ggplot2) library(plotly)
To demonstrate the use of visual sensitivity analysis, we use the effect of child maltreatment on the extent of mental health problems in terms of internalising and externalising behaviour. The study of Kisely and colleagues (Kisely et al., 2018) was based on a general population sample in Brisbane, Australia, and compared 3554 mother-child pairs without 'substantiated child maltreatment' to, for example, 73 pairs with child neglect. Note that the results vary across different types of maltreatment assessed, we choose child neglect because its results (a smaller but still considerable l after adjustment) are particularly suited to illustrating sensitivity analysis. Maltreatment was assessed 'by linkage to state child protection agency data'. Internalising and externalising behaviours were measured using the Youth Self-Report (YSR) scale (Achenbach & Rescorla, 2001) at around the age of 21. The study reports unadjusted mean differences and mean differences adjusted for ‘gender, parental ethnicity, maternal age, family income, maternal relationship status, maternal education, youth income level, youth education level, youth marital status’ (e.g., likely based on ordinary least squares regression, but the paper does not specify).
See supplemental document at https://static.cambridge.org/content/id/urn:cambridge.org:id:article:S0007125018002076/resource/name/S0007125018002076sup001.docx
To obtain the estimates and two-tailed 95% confidence intervals on the effect size d scale (via d = $M_{difference}$ / $SD_{Total}$, similar to Glass' $\Delta$), we divided the reported unadjusted and adjusted mean differences for internalising and externalising by the respective standard deviations of the total sample.
internal_unadjusted <- list( mean_diff = 3.68, lower_diff = 1.73, upper_diff = 5.62, sd = 8.29 ) internal_adjusted <- list( mean_diff = 2.73, lower_diff = 0.77, upper_diff = 4.69, sd = 8.28 ) external_unadjusted <- list( mean_diff = 3.72, lower_diff = 2.13, upper_diff = 5.32, sd = 6.81 ) external_adjusted <- list( mean_diff = 3.10, lower_diff = 1.49, upper_diff = 4.71, sd = 6.81 ) list_values <- list(internal_unadjusted, internal_adjusted, external_unadjusted, external_adjusted) list_names <- c("internal_unadjusted", "internal_adjusted", "external_unadjusted", "external_adjusted") names(list_values) <- list_names for (i in list_names){ list_values[[i]][["d"]] <- list_values[[i]][["mean_diff"]] / list_values[[i]][["sd"]] list_values[[i]][["lower_d"]] <- list_values[[i]][["lower_diff"]] / list_values[[i]][["sd"]] list_values[[i]][["upper_d"]] <- list_values[[i]][["upper_diff"]] / list_values[[i]][["sd"]] } unlist(list_values)
To visualise the lower level confidence interval l, we must obtain the lower bound of the confidence interval for our proposed effect size. ViSe includes a function calculate_d()
that calculates from t.test()
output, dataframes, individual vectors of data, sample statistics (M, SD, n for group), t-test values, or a pre-calculated d-value.
In this example, we have our d value from the original research, along with sample sizes from the study which can be used to calculate the two-tailed dlow_central
or one-tailed done_low_central
lower confidence interval. The package vignette shows all possible ways to calculate values from data and includes the non-centralized confidence intervals for effect size d as well (below).
internal_unadj_output <- calculate_d( d = list_values$internal_adjusted$d, # d value a = .05, # alpha for confidence interval lower = TRUE, # you expect d to be positive n1 = 71, # sample size group 1 n2 = 3653 # sample size group 2 ) internal_unadj_output$dlow_central internal_unadj_output$done_low_central # note, the program also provide noncentral t confidence intervals # in this case, they are unusable because d has been calculated from # mean difference / control rather than mean difference / spooled # therefore the approximation of t and the noncentral # limits is not appropriate
calculate_d
# from dataframe calculate_d( df = mtcars, x_col = "am", y_col = "hp", a = .05, lower = TRUE ) # from two columns x <- mtcars$am y <- mtcars$hp calculate_d( x_col = x, y_col = y, a = .05, lower = TRUE ) # from summary statistics calculate_d(m1 = 14.37, # neglect mean sd1 = 10.716, # neglect sd n1 = 71, # neglect n m2 = 10.69, # none mean sd2 = 8.219, # none sd n2 = 3653, # none n a = .05, # alpha/confidence interval lower = TRUE) # lower or upper bound # from t-test model output <- t.test(mtcars$hp ~ mtcars$am, var.equal = TRUE) n_values <- tapply(mtcars$hp, mtcars$am, length) calculate_d( model = output, n1 = unname(n_values[1]), n2 = unname(n_values[2]), a = .05, lower = TRUE ) # from t-values calculate_d( t = 1.37, n1 = unname(n_values[1]), n2 = unname(n_values[2]), a = .05, lower = TRUE )
Not all research studies use d as the effect size of interest, and ViSe
provides functionality to convert between effect sizes. The other_to_d()
function can be used to convert effect sizes f, $f^2$, NNT, r, probability of superiority, U1, U2, U3, and proportional overlap into d.
other_to_d(nnt = 35)
All options of other_to_d()
:
f = NULL, f2 = NULL, nnt = NULL, r = NULL, prob = NULL, prop_u1 = NULL, prop_u2 = NULL, prop_u3 = NULL, prop_overlap = NULL
The effect size d value can then be used to visualize all effects and their conversions at once in the visualize_effects()
function. You can use the percent, color, and font family options to adjust the resulting graph for readability and color scheme.
visualize_effects(d = list_values$internal_adjusted$d, circle_color = "lightblue", circle_fill = "gray", percent_color = "darkblue", percent_size = 10, text_color = "black", font_family = "Times") # note graphs look better scaled, try saving them # ggsave(filename = "visualize_effects.png") # you can make very ugly graphs if you want visualize_effects(d = .2, circle_color = "green", circle_fill = "orange", percent_color = "pink", percent_size = 20, text_color = "purple", font_family = "Arial")
This package uses the following conversion functions that can be implemented separately as well:
d_to_r(d = list_values$internal_adjusted$d) d_to_f2(d = list_values$internal_adjusted$d) d_to_nnt(d = list_values$internal_adjusted$d) probability_superiority(d = list_values$internal_adjusted$d) proportion_overlap(d = list_values$internal_adjusted$d)
Now that we have an idea of our d value, the lower confidence limit l, and other potential effect sizes, we can create a sensitivity plot of the effect size to bias. To visualize different d values and their representation of the distribution overlap, use estimate_d()
to examine different effect sizes:
estimate_d(d = .09, fill_1 = "red", fill_2 = "blue", text_color = "black")$graph
Similarly, estimate_r()
shows a scatterplot of the entered correlation coefficient to visualize the relation between two variables:
estimate_r(r = .30)$graph
You can then use the two estimation functions to create a graph that shows you which combinations of effect size and r would indicate a causal effect. You can enter any effect size from our conversion options, and these will be converted to d with labels for inspection. The shaded area represents an area that would be considered the effect, and the points represent the entered combination of r and effect size. Points in the shaded area would be considered sensitive to bias.
You can use the plotly
library to hover over those dots to see their values (also embedded in our shiny app). Note: Not run to make this package CRAN compatible (plotly makes html files large).
visual_c_mapped <- # your lower confidence limit required visualize_c_map(dlow = list_values$internal_adjusted$lower_d, # correlation values required r_values = c(.1, .4, .3), # other effect sizes you want to plot d_values = c(.2, .8), nnt_values = c(60), # if you think d will be positive lower = TRUE, # as many values as the max number effects point_colors = c("red", "green", "blue"), # a size for the shapes size = 2, # shape 1 shape_1 = 2, # shape 2, make these the same number if you # want the shapes overlapping # we think two different ones helps readability shape_2 = 3, # color of the background highlighted area ribbon_color = "lightblue" ) visual_c_mapped$graph ggsave(filename = "visualize_c_map.png", width = 8, height = 6, dpi = 300) # ggplotly(visual_c_mapped$graph)
All options for the graph, including their defaults:
visualize_c_map( dlow, r_values, d_values = NULL, f_values = NULL, f2_values = NULL, nnt_values = NULL, prob_values = NULL, prop_u1_values = NULL, prop_u2_values = NULL, prop_u3_values = NULL, prop_overlap_values = NULL, lower = TRUE, point_colors = c("red", "green", "blue"), size = 2, shape_1 = 2, shape_2 = 3, ribbon_color = "lightblue" )
Any scripts or data that you put into this service are public.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.