knitr::opts_chunk$set(echo = TRUE) library(psychmeta)
psychmeta is a comprehensive and open-source R package for computing psychometric meta-analyses, conducting follow-up analyses of meta-analytic results (e.g., cumulative meta-analyses, leave-one-out meta-analyses, bootstrapped meta-analyses, and meta-regressions [with help from
metafor]), simulating data impacted by psychometric artifacts, and much more! This tutorial is meant to introduce you to some of the most critical features of
psychmeta and familiarize you with workflows within the package. We'll start with the basics (installation and data import) and work our way up to the more techncial topics.
Install the official CRAN version of the package:
Or install the GitHub verson (with incremental updates between CRAN releases):
Load the package library:
Recommended packages for data import:
rio(convenient import of many formats; relies on
data.tableand can have isues with date variables)
readr(many efficient functions for loading delimited datafiles; e.g., CSV)
haven(imports and exports SPSS, SAS, and Stata files)
readxl(for Excel files)
psychmeta requires data to be in a "long" format. This means that each effect size is coded in its own row of a matrix. For example, a long format looks like this:
# First 10 observations from the the database of Gonzalez-Mulé, Mount, and Oh (2014; JAP) head(data_r_gonzalezmule_2014[,c("Study", "n", "rxyi", "Rating source", "rxxi", "ryyi", "ux")])
Not all data come in a long format, so data manipulation tools can be helpful to convert it to a long format.
Recommended packages for data manipulation:
dplyr(flexible package for fast, intuitive data manipulation)
tidyr(functions to convert between wide and long data)
psychmeta can also help with reshaping data!
Sometimes it can be quick to code data in a journal table-like format:
dat_matrix <- data.frame(var_names = c("X", "Y", "Z"), n = c(100, 100, 100), mean = c(4, 5, 3), sd = c(2.4, 2.6, 2), rel = c(.8, .7, .85), reshape_vec2mat(cov = c(.3, .4, .5), var_names = c("X", "Y", "Z"))) rownames(dat_matrix) <- NULL
This matrix-style format can be easily converted to the long-format data
psychmeta requires with the
reshape_mat2dat( var_names = "var_names", # Column of variable names cor_data = c("X", "Y", "Z"), # Names of correlation columns common_data = "n", # Names of columns shared among relationships unique_data = c("mean", "sd", "rel"), # Names of columns unique to relationships data = dat_matrix)
Sometimes it can also be quick to code data with multiple correlations in one row (i.e., a wide format):
dat_wide <- data.frame(sample_id = c(1, 2, 3), ni = c(66, 74, 93), rxyi_X_Y = c(-.29, -.16, -.34), rxyi_X_Z = c(.18, .18, .02), rxyi_Y_Z = c(.15, .00, .00), rel_X = c(.95, .93, .97), rel_Y = c(.85, .86, .82), rel_Z = c(.91, .90, .89))
This wide format can be easily converted to the long format
common_vars <- c("sample_id") # Column names for variables common to all # relationships var_names <- c("X", "Y", "Z") es_design = matrix(NA, 3, 3) # Matrix containing the column names es_design[lower.tri(es_design)] <- # for the intercorrelations among variables c("rxyi_X_Y", "rxyi_X_Z", "rxyi_Y_Z") # in the lower triangle of the matrix rownames(es_design) <- colnames(es_design) <- var_names n_design <- "ni" # Sample size column name or es_design-like # matrix other_design <- # Matrix with variable names as row names, cbind(rel = c("rel_X", # names of long-format variables as column names, "rel_Y", # and column names of dat_wide as elements "rel_Z")) rownames(other_design) <- var_names reshape_wide2long(common_vars = common_vars, es_design = es_design, n_design = n_design, other_design = other_design, es_name = "rxyi", # Type of effect size in dat_wide data = dat_wide)
convert_es() function can be used to convert a wide variety of statistics to the r or d metric.
convert_es(es = 1, input_es = "d", output_es = "r", n1 = 50, n2 = 50) convert_es(es = -1.3, input_es = "t", output_es = "r", n1 = 100, n2 = 140) convert_es(es = 10.3, input_es = "F", output_es = "r", n1 = 100, n2 = 150) convert_es(es = 1.3, input_es = "chisq", output_es = "r", n1 = 100, n2 = 100) convert_es(es = .021, input_es = "p.chisq", output_es = "r", n1 = 100, n2 = 100) convert_es(es = 4.37, input_es = "or", output_es = "r", n1 = 100, n2 = 100) convert_es(es = 1.47, input_es = "lor", output_es = "r", n1 = 100, n2 = 100) convert_es(es = .2, input_es = "r", output_es = "d", n1 = 50, n2 = 50) convert_es(es = -1.3, input_es = "t", output_es = "d", n1 = 100, n2 = 140) convert_es(es = 10.3, input_es = "F", output_es = "d", n1 = 100, n2 = 150) convert_es(es = 1.3, input_es = "chisq", output_es = "d", n1 = 100, n2 = 100) convert_es(es = .021, input_es = "p.chisq", output_es = "d", n1 = 100, n2 = 100) convert_es(es = 4.37, input_es = "or", output_es = "d", n1 = 100, n2 = 100) convert_es(es = 1.47, input_es = "lor", output_es = "d", n1 = 100, n2 = 100)
To include converted effect sizes in a meta-analysis, merge output of
convert_es() into your data frame, at least the the columns for the converted
effect sizes (
AUC), sample size (
n), effective sample size (
convert_es(es = c(.4, .3, .25), input_es = "r", output_es = "d", n1 = c(50, 110, 65), n2 = c(50, 70, 65) )
Corrections for measurement error and range restriction can be applied automatically within meta-analyses.
Other corrections can be made, but they need to be applied sporadically (i.e., applied to individual effect sizes before including them in a meta-analysis).
correct_r_dich(r = c(.3, .5), px = .5, py = .5, n = 100) correct_r_split(r = c(.3, .5), pi = .2, n = 100)
ma_r()is the general-purpose meta-analysis function for correlations
The arguments to this function are:
ma_r( #### Commonly used arguments: #### # Which data set would you like to use? data = NULL, # Specify essential effect-size information. rxyi, n, n_adj = NULL, # Differentiate sources of data. sample_id = NULL, citekey = NULL, # Specify methodological parameters. ma_method = "bb", ad_type = "tsa", correction_method = "auto", # Specify constructs and measures of constructs. construct_x = NULL, construct_y = NULL, measure_x = NULL, measure_y = NULL, # Weighting method wt_type = "sample_size", # Correct for small-sample bias? correct_bias = TRUE, # Correct for measurement error? correct_rxx = TRUE, correct_ryy = TRUE, # Correct for range restriction? correct_rr_x = TRUE, correct_rr_y = TRUE, # Is the range restriction indirect (vs. direct) in nature? indirect_rr_x = TRUE, indirect_rr_y = TRUE, # What are your reliability coefficients? rxx = NULL, ryy = NULL, # Are your reliability coefficients range restricted? rxx_restricted = TRUE, ryy_restricted = TRUE, # What types of reliability estimates are you using? rxx_type = "alpha", ryy_type = "alpha", # What are your range-restriction u (SD) ratios? ux = NULL, uy = NULL, # Are your u ratios computed from observed (vs. true-score) SDs? ux_observed = TRUE, uy_observed = TRUE, # What are your moderators and which ones are categorical? moderators = NULL, cat_moderators = TRUE, # What type of moderator analysis do you want (simple vs. hierarchical)? moderator_type = "simple", # If correcting for bivariate indirect RR, how do constructs correlate # with selection mechanisms? (only need to specify the +1 or -1 sign of the relationships) sign_rxz = 1, sign_ryz = 1, # Do you have any artifact distributions to include that are not in your database? supplemental_ads = NULL, #### Other arguments to know about: #### # Specify the order in which constructs should be displayed in output. construct_order = NULL, # If analyzing multiple relationships, how should each constructs's measurement # error be handled? correct_rel = NULL, # If analyzing multiple relationships, how should each construct's # range restriction be handled? correct_rr = NULL, # If analyzing multiple relationships, which constructs are affected by indirect # range restriction? indirect_rr = NULL, # If analyzing multiple relationships, how does each construct correlate with # the selection mechanism? sign_rz = NULL, # Additional methological parameters can be modified using the "control" argument. control = control_psychmeta() )
In addition to the arguments used with
ma_r() directly, the
control argument allows users to modify methological parameters of their analyses and interact with some of the more specialized features of
psychmeta's meta-analysis functions. The
control argument in meta-analysis functions accepts the output of the
control_psychmeta( # Should the mean or the sample-specific effect sizes be used to estimate error variance? error_type = c("mean", "sample"), # What proportion of the distribution of means should be included in confidence intervals? conf_level = .95, # What proportion of the residual distribution of observations should be included in credibility intervals? cred_level = .8, # How should confidence and credibility intervals be computed? With the t or normal distribution? conf_method = c("t", "norm"), cred_method = c("t", "norm"), # Should weighted variance estimates be computed as unbiased (i.e., multiplied by k / [k-1])? var_unbiased = TRUE, # Should overall artifaction distributions be computed for each construct (pairwise_ads == FALSE) or should # artifact distributions be computed separately for each construct pair (pairwise_ads == TRUE)? pairwise_ads = FALSE, # Should artifact distributions be computed by collapsing across moderator # levels (moderated_ads == FALSE) or should artifact distributions be computed # separately for each moderator combination (moderated_ads == TRUE)? moderated_ads = FALSE, # Should artifact-distribution corrections be computed using artifact distributions # that have had sampling error removed (residual_ads == TRUE) or should the observed # distributions be used (residual_ads == TRUE)? residual_ads = TRUE, # Should dependent observations (i.e., multiple observations of the same relationship in a single sample) be # consolidated so that there is just one independent observation per sample? check_dependence = TRUE, # If check_dependence is TRUE, how should dependency be resolved? Options are to compute a composite # effect size (default), compute the average of the effect sizes, or to stop with an error message. collapse_method = c("composite", "average", "stop"), # The intercor argument uses the control_intercor() function to control how the intercorrelations # among variables are handled with collapse_method == "composite" intercor = control_intercor(), # Should artifact information be cleaned to resolve discrepancies among values recorded for multiple # relationsips involving a given construct in a given sample? clean_artifacts = TRUE, # Should missing artifact information be imputed? (For use with individual-correction method only) impute_artifacts = TRUE, # If impute_artifacts is TRUE, how should imputation be performed? See the documentation for the # control_psychmeta() function for descriptions of the available options. impute_method = c("bootstrap_mod", "bootstrap_full", "simulate_mod", "simulate_full", "wt_mean_mod", "wt_mean_full", "unwt_mean_mod", "unwt_mean_full", "replace_unity", "stop"), # What seed value should be set for the imputation process? (This makes the imputation reproducible) seed = 42, # Should artifact information from observations not included in meta-analyses be harvested from "data" # and included in artifact distributions? use_all_arts = TRUE, # For meta-analyses of d values, should the proportionality of membership in the unrestricted sample # be estimated from the range-restricted proportions and the range-restriction correction factor? estimate_pa = FALSE, # To what number of decimal places should interactive artifact distributions be rounded prior to use? # Rounding reduces the computational burden of creating multi-dimensional arrays of artifact information. decimals = 2, # Should the "Hunter-Schmidt" override settings be used? When TRUE, this will override settings for: # - wt_type will set to "sample_size" # - error_type will set to "mean" # - correct_bias will set to TRUE # - conf_method will set to "norm" # - cred_method will set to "norm" # - var_unbiased will set to FALSE # - residual_ads will be set to FALSE # - use_all_arts will set to FALSE hs_override = FALSE )
At a minimum, simply specify correlations and sample sizes. You can specify vectors containing correlations, sample sizes, and other input data for each study in three different ways (and any combination of the three):
ma_r(rxyi = data_r_meas_multi$rxyi, n = data_r_meas_multi$n)
ma_r(rxyi = "rxyi", n = "n", data = data_r_meas_multi)
ma_r(rxyi = rxyi, n = n, data = data_r_meas_multi)
Moderator analyses are easy—just identify a column or matrix of moderator information.
ma_r(rxyi = rxyi, n = n, moderators = moderator, data = data_r_meas_multi)
To use multiple moderators, supply the column names as a string vector.
ma_r(rxyi = rxyi, n = n, moderators = c("Rating source", "Published", "Complexity"), data = data_r_gonzalezmule_2014)
On top of moderators,
psychmeta can also meta-analyze multiple construct relationships in one function call.
ma_r(rxyi = rxyi, n = n, construct_x = x_name, construct_y = y_name, data = data_r_meas_multi)
ma_r(rxyi = rxyi, n = n, construct_x = x_name, construct_y = y_name, moderators = moderator, data = data_r_meas_multi)
psychmeta can automatically make corrections for reliability and range restriction as it computes meta-analyses. To do so, specify vectors of data for each artifact your want to correct. To apply corrections individually to each study, specify
ma_method = "ic".
impute_artifacts arguments are used to impute missing artifact data and check for various errors or inconsistencies in the artifacts. You should generally leave these set to
TRUE; they are only
FALSE here to speed up computations during this tutorial, as our data are clean.
ma_obj <- ma_r(ma_method = "ic", rxyi = rxyi, n = n, construct_x = x_name, construct_y = y_name, rxx = rxxi, ryy = ryyi, moderators = moderator, clean_artifacts = FALSE, impute_artifacts = FALSE, data = data_r_meas_multi)
To apply corrections to meta-analysis results overall, rather than to individual studies, specify
ma_method = "ad". Artifact distribution methods are useful when you have missing data (they make imputating missing values unnecessary). They can also be used to supplement observed artifacts with values from other sources (reducing artifact sampling error) and for applying corrections to artifact values to yield more accurate meta-analysis results [e.g., @DahlkeNotrestrictedselection2019].
psychmeta can apply a variety of artifact correction models. You can specify which method to use with the
correction_method argument. The default is "auto", and
psychmeta will choose the best method based on available artifact data and settings for these arguments:
ma_r(ma_method = "ad", rxyi = rxyi, n = n, construct_x = x_name, construct_y = y_name, rxx = rxxi, ryy = ryyi, correct_rr_x = FALSE, correct_rr_y = FALSE, moderators = moderator, data = data_r_meas_multi)
You can also convert a meta-analysis previously calculated using barebones or individual-correction methods to an artifact-distribution meta-analysis:
ma_r_ad(ma_obj, correct_rr_x = FALSE, correct_rr_y = FALSE)
Psychometric meta-analysis assumes that each effect size for a particular construct relationship comes from an independent sample. If one sample provides multiple effect sizes for the same relationship,
psychmeta can automatically consolidate them as either composites (recommended) or averages.
ma_r(rxyi = rxyi, n = n, sample_id = sample_id, collapse_method = "composite", data = data_r_meas_multi) ma_r(rxyi = rxyi, n = n, sample_id = sample_id, collapse_method = "average", data = data_r_meas_multi)
The results of
psychmeta's meta-analysis functions take the form of nested tibbles. A tibble is a type of tabular array that is a modernized version on the data frame (if you're new to tibbles, a vignette on the topic is available from the
tibble package's CRAN documentation). The "nesting" of results means that the some cells of the tibble contain complex objects, such as data frames and lists (i.e., objects that have their own indexing structures, hence the nesting). This nested structure allows a lot of information to be organized in a very concise way so that an overview of even a very complicated meta-analysis can fit in your console window.
As an example, below are some results computed using the
data_r_gonzalezmule_2014 data set included with
psychmeta, which is an updated version of the database reported by @Gonzalez-Mulemetaanalysisrelationshipgeneral2014 in their meta-analysis of the relationship between general mental ability (GMA) and organizational citizenship behaviors (i.e., prosocial behaviors exhibited in the workplace; OCBs). Gonzalez-Mulé et al. explored a number of moderators, but for this analysis we have chosen to use only two:
Rating source (OCBs rated by self vs. supervisor) and
Type (the target of the OCBs; OCB-I = targeted toward individuals, OCB-O = targeted toward the organization, OCB-CH = targeted toward change). To demonstrate the structure of meta-analysis objects, we have chosen to run this example using hierarchical moderator analyses (i.e.,
moderator_type = "hierarchical"), which means we will be able to see the main effects of each moderator as well as the results from individual combinations of the levels of the moderators.
Note: This table of results is larger than can fit in most HTML viewers. Use the arrow buttons in the header to scroll horizonally through the content and use the page numbers in the lower-right corner of the object to scroll vertically.
(gonzalezmule <- ma_r(ma_method = "ic", rxyi = rxyi, n = n, construct_x = "GMA", construct_y = "OCB", rxx = rxxi, ryy = ryyi, ux = ux, indirect_rr_x = TRUE, moderators = c("Rating source", "Type", "Published"), moderator_type = "hierarchical", control = control_psychmeta(hs_override = TRUE), data = data_r_gonzalezmule_2014))
From this tibble, we see that there are a total of 13 meta-analyses (each row of the tibble is a distinct analysis representing a unique set of construct pairings and/or combinations of moderator levels) and these meta-analyses are indexed by a unique number in the
analysis_id column. We also see that the
pair_id column indexes the individual construct pairs (in this example, there is only one contstruct pair: GMA and OCB), with the names of the constructs displayed in the
In terms of moderator analyses, the
analysis_type column tells us how moderator information was treated in each meta-analysis. Analyses labeled as "Overall" represent meta-analyses of all information, with all observations collapsed across moderator levels (when moderator levels are collapsed, it is indicated by "All Levels" in the columns bearing the moderator names). Analyses labeled as "Simple Moderator" represent meta-analyses of a particular level of a single moderator, collapsing across all of the other moderators (the moderator levels used in these analyses are indicated in the columns bearing the moderator names). Finally, analyses labled as "Fully Hierarchical Moderator" represent intersections of the categories of the moderators. Note that this database is smaller than what would be needed to do a good hierarchical moderator analysis and we have run this analysis for purposes of illustration only. When hierarchical moderator analyses are run with more than two moderators, results will also be diplayed for "Partial Hierarchical Moderator" analyses, which means that the results represent intersections among only a subset of the moderators while levels of the other moderators are collapsed.
To the right of the moderator columns are columns that contain data and results from the meta-analyses. The
meta_tables column contains the meta-analysis tables for each analysis, the
escalc column contains the data from each meta-analysis (these data are in the format of "escalc" class objects that can be used with the metafor package's functions), and the
ad column contains artifact distributions from psychometric meta-analyses (each cell contains artifact distributions from individual-correction analyses, if present, and from artifact-distribution analyses, if present).
To see a summary view of the statistical results of the meta-analysis, use the
To retrieve tables from all meta-analyses in the object, you can use the
get_metatab() function (this information is also included in the output object from the
summary() function). For example, below is the true-score correlation table from our meta-analysis of Gonzalez-Mulé et al.'s (2014) data. This table communicates the bare-bones and fully corrected results for all of the relationships in the meta-analysis object.
metatab_list <- get_metatab(gonzalezmule) metatab_list$individual_correction$true_score
Of course, it is also possible to directly pull the results of a particular analysis out of the meta-analysis object, but the tables retrieved with this method will not contain the
pair_id, construct, or moderator information included in the tables from
ma_r() will automatically compute the primary results that should be reported for meta-analyses, including total sample size (N), number of studies (k), average effect sizes (mean_r and mean_rho), observed (sd_r and sd_r_c) and residual (sd_res and sd_rho) standard deviations, and confidence intervals and credibility intervals. These results are sufficient for evaluating average effect sizes, heterogeneity of effects, and precision or uncertainty in the results.
psychmeta also provides a variety of follow-up analyses to supplement the primary meta-analysis results.
sensitivity() computes several different types of sensitivity analyses for meta-analytic results:
bootstrap(calculate bootstrap confidence intervals on all meta-analysis results)
cumulative(calculate cumulative meta-analyses—a form of publication bias analysis)
leave1out(compute meta-anlayses leaving out 1 sample at a time—for outlier/influential point detection)
You can either call all sensitivity analyses at once (
sensitivity()) or call each type of analysis individually (
ma_obj <- sensitivity(ma_obj, leave1out = TRUE, bootstrap = TRUE, cumulative = TRUE, sort_method = "weight", boot_iter = 100, boot_conf_level = 0.95, boot_ci_type = "norm")
metareg() computes meta-regression results using both categorical and continuous moderators via the
rma() function from the
ma_obj_gm <- ma_r(ma_method = "ic", rxyi = rxyi, n = n, rxx = rxxi, ryy = ryyi, moderators = c("Rating source", "Complexity"), data = data_r_gonzalezmule_2014) ma_obj_gm <- metareg(ma_obj_gm) get_metareg(ma_obj = ma_obj_gm)[]$individual_correction$true_score ma_obj_mc <- ma_r(ma_method = "ic", rxyi = r, n = N, moderators = c("Parent", "Age"), cat_moderators = c(TRUE, FALSE), data = data_r_mcleod_2007) ma_obj_mc <- metareg(ma_obj_mc, formula_list = list("Parent*Age" = yi ~ Parent*Age)) get_metareg(ma_obj_mc)[]$individual_correction$true_score
Note that the relevant sample size for meta-regression is the number of studies (k). Meta-regression results should be interpreted cautiously when k is small or when there is low variability in moderator levels across studies [@SchmidtMetaRegression2017]. In these and other cases, hierarhical subgroup moderator analyses can yield more reliable results:
ma_obj_gm <- ma_r(ma_method = "ic", rxyi = rxyi, n = n, rxx = rxxi, ryy = ryyi, moderators = c("Rating source", "Complexity"), moderator_type = "hierarchical", data = data_r_gonzalezmule_2014) ma_obj_gm
psychmeta automatically estimates residual standard deviations (SD~res~ and SD~$\rho$~) and credibility intervals. These statistics are the best way to examine heterogeneity in meta-anlayses [@WiernikEmpiricalBenchmarks2017]. The
heterogeneity() function computes additional heterogeneity statistics (e.g., % Variance Accounted For, Q, I^2^) that may be requested by some journals or reviewers; we don't recommend these statistics for regular use.
ma_obj <- heterogeneity(ma_obj)
psychmeta has several functions to access meta-analysis results and quickly prepare them for publication. The most basic way to access results is to print the
ma_obj results object:
ma_obj <- ma_r(ma_method = "ic", sample_id = sample_id, rxyi = rxyi, n = n, construct_x = x_name, construct_y = y_name, rxx = rxxi, ryy = ryyi, moderators = moderator, clean_artifacts = FALSE, impute_artifacts = FALSE, data = data_r_meas_multi) ma_obj
You can limit printing to specific constructs and/or construct pairs using the
filter_ma(ma_obj, analyses = list(construct = "Y") ) filter_ma(ma_obj, analyses = list(construct_pair = list(c("X", "Z"))) )
Filtering on other analysis features, such as moderator levels or minimum N and/or k, is also possible.
psychmeta also includes a variety of
get_stuff() functions to retrive specific results from an
ma_obj object. For example:
# Heterogeneity statistics ma_obj <- heterogeneity(ma_obj) out_heterogeneity <- get_heterogeneity(ma_obj) out_heterogeneity$`analysis id: 1`$barebones # Sensitivity analyses ma_obj <- sensitivity(ma_obj, bootstrap = FALSE) out_leave1out <- get_leave1out(ma_obj) out_leave1out$`analysis id: 1`$barebones out_cumulative <- get_cumulative(ma_obj) out_cumulative$`analysis id: 1`$barebones # Meta-regression analyses ma_obj <- metareg(ma_obj) out_metareg <- get_metareg(ma_obj) out_metareg$`analysis id: 1`$barebones # A summary of artifact distributions get_ad(ma_obj, ma_method = "ic")
psychmeta can automatically produce publication-ready tables of meta-analysis
results that can be copied into your word processor or inserted into an RMarkdown
# Write meta-analysis tables to a Word file called "Meta-analysis table.docx" metabulate(ma_obj, file = "Meta-analysis table.docx", output_format = "word", output_dir = tempdir())
psychmeta can produce both funnel and forest plots for visually presenting meta-analytic results.
# Create funnel and forest plots ma_obj <- plot_funnel(ma_obj = ma_obj) ma_obj <- plot_forest(ma_obj = ma_obj, ma_facetname = "MA") # Extract plots for viewing out_plots <- get_plots(ma_obj) out_plots$funnel$`analysis id: 1`$barebones out_plots$forest$`analysis id: 1`$moderated$barebones # The sensitivity_cumulative() and sensitivity_leave1out() functions also produce plots. out_plots$leave1out$`analysis id: 1`$barebones$plots out_plots$cumulative$`analysis id: 1`$barebones$plots
ma_obj_print <- ma_r(ma_method = "ic", sample_id = sample_id, rxyi = rxyi, n = n, construct_x = x_name, construct_y = y_name, rxx = rxxi, ryy = ryyi, moderators = moderator, clean_artifacts = FALSE, impute_artifacts = FALSE, data = data_r_meas_multi) ma_obj_print <- plot_funnel(ma_obj = ma_obj_print) ma_obj_print$funnel$`analysis id: 1`$barebones
psychmeta can automatically generate a formatted reference list using a BibTeX or Zotero library.
First, specify the
citekey argument when running
ma_r(). Then use the
ma_obj <- ma_r(ma_method = "ic", rxyi = rxyi, n = n, construct_x = x_name, construct_y = y_name, rxx = rxxi, ryy = ryyi, moderators = moderator, clean_artifacts = FALSE, impute_artifacts = FALSE, citekey = citekey, data = data_r_meas_multi) generate_bib(ma_obj, # The location of your .bib file (normally should be in the same directory # as the analysis script). bib = system.file("templates/sample_bibliography.bib", package="psychmeta"), # Your citation style. Must be the style ID for a style hosted at # http://zotero.org/styles/ style = "apa", # What format to write output as (options are: "word", "html", "pdf", "text", # "Rmd", "biblatex", "citekeys")? output_format = "word", # What filename to output to (use "console" to print to the R console). file = "sources.docx" )
You can also generate a bibliography containing only sources contributing to a specific subset of the meta-analyses:
generate_bib(ma_obj, bib = system.file("templates/sample_bibliography.bib", package="psychmeta"), analyses = list(construct_pair = list(c("X", "Y"))) )
Even if no studies in your meta-analytic database provide artifact information, pre-specified artifact distributions from previous meta-analyses can still be used in
psychmeta! There are several ways to use pre-specified distributions, ranging from using them on their on their own to combining them with observed artifact information. Below are four potential methods, but
psychmeta allows many other ways to use artifact distributions.
create_ad()function to create artifact distribution objects, which can then be used with the
ma_d_ad()functions to correct bare-bones meta-analyses.
# Create artifact-distribution objects for X, Y, and Z. ad_x <- create_ad(mean_qxi = 0.8927818, var_qxi = 0.0008095520, k_qxi = 40, mean_n_qxi = 11927 / 40, qxi_dist_type = "alpha") ad_y <- create_ad(mean_qxi = 0.8927818, var_qxi = 0.0008095520, k_qxi = 40, mean_n_qxi = 11927 / 40, qxi_dist_type = "alpha") ad_z <- create_ad(mean_qxi = 0.8962108, var_qxi = 0.0007840593, k_qxi = 40, mean_n_qxi = 11927 / 40, qxi_dist_type = "alpha") # Compute a meta-analysis of X and Y. ma_bb_xy <- ma_r_bb(r = rxyi, n = n, data = filter(data_r_meas_multi, x_name == "X", y_name == "Y")) # Correct the meta-analysis for measurement error. ma_r_ad(ma_obj = ma_bb_xy, ad_obj_x = ad_x, ad_obj_y = ad_y, correct_rr_x = FALSE, correct_rr_y = FALSE)
ma_r()function to automate your workflow! Just supply the artifact distributions as a list with the
ma_r(ma_method = "ad", rxyi = rxyi, n = n, correct_rr_x = FALSE, correct_rr_y = FALSE, construct_x = x_name, construct_y = y_name, sample_id = sample_id, clean_artifacts = FALSE, impute_artifacts = FALSE, moderators = moderator, data = data_r_meas_multi, # The "supplemental_ads" argument can take a list of artifact-distribution objects. supplemental_ads = list(X = ad_x, Y = ad_y, Z = ad_z))
psychmetado it for you. Instead of running
create_ad(), you can create a list of artifact information to use as the
ma_d(). This list contains artifact information for each construct you want to supply information for. This is a "list of lists" - each construct for which you provide information gets its own list labeled with the construct name. Each construct's list should contain entries that are named like the arguments to the
ma_r(ma_method = "ad", rxyi = rxyi, n = n, correct_rr_x = FALSE, correct_rr_y = FALSE, construct_x = x_name, construct_y = y_name, sample_id = sample_id, clean_artifacts = FALSE, impute_artifacts = FALSE, moderators = moderator, data = data_r_meas_multi, supplemental_ads = # The "supplemental_ads" argument can also take raw artifact values. # (These values are unrounded so our examples all produce identical results.) list(X = list(mean_qxi = 0.8927818, var_qxi = 0.0008095520, k_qxi = 40, mean_n_qxi = 11927 / 40, qxi_dist_type = "alpha"), Y = list(mean_qxi = 0.8941266, var_qxi = 0.0009367234, k_qxi = 40, mean_n_qxi = 11927 / 40, qxi_dist_type = "alpha"), Z = list(mean_qxi = 0.8962108, var_qxi = 0.0007840593, k_qxi = 40, mean_n_qxi = 11927 / 40, qxi_dist_type = "alpha")))
create_ad_list()function. This list can then be passed to
ad_list <- create_ad_list(n = n, rxx = rxxi, ryy = ryyi, construct_x = x_name, construct_y = y_name, sample_id = sample_id, data = data_r_meas_multi) ma_r(ma_method = "ad", rxyi = rxyi, n = n, correct_rr_x = FALSE, correct_rr_y = FALSE, construct_x = x_name, construct_y = y_name, sample_id = sample_id, clean_artifacts = FALSE, impute_artifacts = FALSE, moderators = moderator, data = data_r_meas_multi, # The "supplemental_ads" argument can take the output of create_ad_list(). supplemental_ads = ad_list)
# For purposes of illustration, delete all reliability values not associated with "Z": dat_zarts <- data_r_meas_multi dat_zarts[,"rxxi"] <- dat_zarts[dat_zarts$y_name != "Z","ryyi"] <- NA # Compute a meta-analysis using three different types of artifact formats: ma_r(ma_method = "ad", rxyi = rxyi, n = n, # Observed artifacts can be provided along with the "supplemental_ads" argument. ryy = ryyi, correct_rr_x = FALSE, correct_rr_y = FALSE, construct_x = x_name, construct_y = y_name, sample_id = sample_id, clean_artifacts = FALSE, impute_artifacts = FALSE, moderators = moderator, data = dat_zarts, supplemental_ads = list(X = ad_x, Y = list(mean_qxi = 0.8941266, var_qxi = 0.0009367234, k_qxi = 40, mean_n_qxi = 11927 / 40, qxi_dist_type = "alpha")))
In addition to its collection of tools for computing meta-analyses,
psychmeta also boasts a robust suite of simulation functions. These functions are intended to support high-quality meta-analytic simulations by correctly introducing sampling error, measurement error, and range restriction into a set of parameter values. These functions can be valuable for both research and pedagogical applications.
You can generate data impacted by psychometric artifacts for a single sample by using the
simulate_psych() function. This function reports three datasets per sample: Observed scores, true scores, and error scores. It can also perform selection and indicate (1) which cases would be selected on the basis of observed scores (i.e., actual selection), (2) which cases would be selected on the basis of true scores (i.e., error-free selection), and (3) which cases would be selected on the basis of error scores (i.e., random selection).
sim_dat <- simulate_psych(n = 1000, rho_mat = reshape_vec2mat(.5), sigma_vec = c(1, 1), sr_vec = c(1, .5), rel_vec = c(.8, .8), var_names = c("Y", "X"))
sim_dat cor(sim_dat$observed[,1:2]) cor(sim_dat$true[,1:2]) cor(sim_dat$error[,1:2])
You can automate the process of analyzing a simulated sample when you are interested in effect sizes and artifact estimates. For example, you can generate correlations impacted by psychometric artifacts for a single sample of data using the
simulate_r_sample() function. This function generates sample statistics when the
n argument is a finite value, but it can also analyze parameters when
n = Inf!
simulate_r_sample(n = 1000, # Correlation parameter matrix rho_mat = reshape_vec2mat(.5, order = 5), # Reliability parameter vector rel_vec = rep(.8, 5), # Selection ratio vector sr_vec = c(1, 1, 1, 1, .5), # Number of items in each scale k_items_vec = 1:5, # Matrix of weights to use in creating composites wt_mat = cbind(c(0, 0, 0, .3, 1), c(1, .3, 0, 0, 0)), # Selection ratios for composites sr_composites = c(.7, .5)) simulate_r_sample(n = Inf, rho_mat = reshape_vec2mat(.5, order = 5), rel_vec = rep(.8, 5), sr_vec = c(1, 1, 1, 1, .5), k_items_vec = 1:5, wt_mat = cbind(c(0, 0, 0, .3, 1), c(1, .3, 0, 0, 0)), sr_composites = c(.7, .5))
One of the most powerful simulation functions in
simulate_r_database(), which can generate entire databases of correlations for use in meta-analysis simulations. This function is a wrapper for
simulate_r_sample() and it can sample parameter values from whichever distributions you specify.
# Note the varying methods for defining parameters: simulate_r_database(k = 10, # Sample-size parameters # Parameters can be defined as functions. n_params = function(n) rgamma(n, shape = 100), # Correlation parameters (one parameter distribution per correlation) # Parameters can also be defined as vectors... rho_params = list(c(.1, .3, .5), # ...as a mean + a standard deviation... c(mean = .3, sd = .05), # ...or as a matrix of values and weights. rbind(value = c(.1, .3, .5), weight = c(1, 2, 1))), # Reliability parameters rel_params = list(c(.7, .8, .9), c(mean = .8, sd = .05), rbind(value = c(.7, .8, .9), weight = c(1, 2, 1))), # Selection-ratio parameters sr_params = list(1, 1, c(.5, .7)), # Measure-length parameters k_items_params = list(5, 8, 10), # Composite weight parameters wt_params = list(list(1, 1, 0)), # Selection-ratio parameters for composites sr_composite_params = list(1), # Variable names var_names = c("X", "Y", "Z"))
psychmeta includes d-value simulation functions that parallel the correlation simulation functions described above. These functions automate the simulation of subgroup comparisons and can introduce sampling error, measurement error, and range-restriction artifacts into d-value parameters.
To simulate a single sample of results, use the
simulate_d_sample() function. This function can generate data for any number of variables with any number of groups and it can form composite variables and perform selection.
## Simulate statistics by providing integers as "n_vec": simulate_d_sample(n_vec = c(200, 100), # List of rho matrices - one per group rho_mat_list = list(reshape_vec2mat(.5), reshape_vec2mat(.4)), # Matrix of group means (groups on rows, variables on columns) mu_mat = rbind(c(1, .5), c(0, 0)), # Matrix of group SDs (groups on rows, variables on columns) sigma_mat = rbind(c(1, 1), c(1, 1)), # Matrix of group reliabilities (groups on rows, variables on columns) rel_mat = rbind(c(.8, .7), c(.7, .7)), # Vector of selection ratios sr_vec = c(1, .5), # Number of items in each scale k_items_vec = c(5, 10), # Group names group_names = c("A", "B"))
simulate_d_sample() can also analyze parameters. To get parameter estimates, simply define the sample sizes as proportions (with the
n_vec argument) rather than as integers.
simulate_d_sample(n_vec = c(2/3, 1/3), rho_mat_list = list(reshape_vec2mat(.5), reshape_vec2mat(.4)), mu_mat = rbind(c(1, .5), c(0, 0)), sigma_mat = rbind(c(1, 1), c(1, 1)), rel_mat = rbind(c(.8, .7), c(.7, .7)), sr_vec = c(1, .5), k_items_vec = c(5, 10), group_names = c("A", "B"))
simulate_d_database() function is the d-value counterpart to
simulate_r_database(). However, each parameter argument now requires a set of parameters for each group (e.g., a distribution of correlations, means, reliabilities, etc. must be specified for each group).
simulate_d_database(k = 5, # "Group1" and "Group2" labels are not required for parameter arguments: # They are shown only for enhanced interpretability. # Sample-size parameter distributions n_params = list(Group1 = c(mean = 200, sd = 20), Group2 = c(mean = 100, sd = 20)), # Correlation parameter distributions rho_params = list(Group1 = list(c(.3, .4, .5)), Group2 = list(c(.3, .4, .5))), # Mean parameter distributions mu_params = list(Group1 = list(c(mean = .5, sd = .5), c(-.5, 0, .5)), Group2 = list(c(mean = 0, sd = .5), c(-.2, 0, .2))), # Reliability parameter distributions rel_params = list(Group1 = list(.8, .8), Group2 = list(c(.7, .8, .8), c(.7, .8, .8))), # Group names group_names = c("Group1", "Group2"), # Variable names var_names = c("Y", "Z"))
In addition to its array of univariate and bivariate range-restriction corrections,
psychmeta includes functions to correct data for multivariate range restriction. The
correct_matrix_mvrr() corrects covariance matrices for range restriction and the
correct_means_mvrr() function corrects vectors of means for range restriction.
# Create example matrices with different variances and covariances: Sigma_i <- reshape_vec2mat(cov = .2, var = .8, order = 4) Sigma_xx_a <- reshape_vec2mat(cov = .5, order = 2) # Correct "Sigma_i" for range restriction in variables 1 & 2: correct_matrix_mvrr(Sigma_i = Sigma_i, Sigma_xx_a = Sigma_xx_a, x_col = 1:2) # Correct the means of variables 1 & 2 for range restriction: correct_means_mvrr(Sigma = Sigma_i, means_i = c(2, 2, 1, 1), means_x_a = c(1, 1), x_col = 1:2)
The Spearman-Brown formula can be used to estimate the reliability of a measure that has been lengthened (or shortened) k times. This formula is implemented in the
estimate_rel_sb() function and its inverse (which estimates the number of times a measure needs to be lengthened to achieve a desired reliability) is implemented in the
estimate_rel_sb(rel_initial = .5, k = 4) estimate_length_sb(rel_initial = .5, rel_desired = .8)
composite_r_scalar() functions can compute composite correlations using matrix and scalar inputs, respectively. Both functions are capable of computing correlations between a composite and a single variable or between two composites.
# Correlation between a variable and a composite composite_r_matrix(r_mat = reshape_vec2mat(.4, order = 5), x_col = 2:5, y_col = 1) composite_r_scalar(mean_rxy = .4, k_vars_x = 4, mean_intercor_x = .4) # Correlation between two composites composite_r_matrix(r_mat = reshape_vec2mat(.3, order = 5), x_col = 1:3, y_col = 4:5) composite_r_scalar(mean_rxy = .3, k_vars_x = 3, mean_intercor_x = .3, k_vars_y = 2, mean_intercor_y = .3)
composite_d_scalar() functions can compute composite d values using matrix and scalar inputs, respectively.
composite_d_matrix(d_vec = c(1, 1), r_mat = reshape_vec2mat(.7), wt_vec = c(1, 1), p = .5) composite_d_scalar(mean_d = 1, mean_intercor = .7, k_vars = 2, p = .5)
composite_rel_scalar() functions can compute composite reliabilities (i.e., Mosier reliabilities) using matrix and scalar inputs, respectively.
composite_rel_matrix(rel_vec = c(.8, .8), r_mat = reshape_vec2mat(.4), sd_vec = c(1, 1)) composite_rel_scalar(mean_rel = .8, mean_intercor = .4, k_vars = 2)
lm_mat() function mimics the structure and appearance of the
lm() function from the
stats package, but uses correlation/covariance matrices to compute regression models instead of raw data sets.
# For purposes of demonstration, generate a dataset from the following matrix: S <- reshape_vec2mat(cov = c(.3 * 2 * 3, .4 * 2 * 4, .5 * 3 * 4), var = c(2, 3, 4)^2, var_names = c("X", "Y", "Z")) mean_vec <- setNames(c(1, 2, 3), colnames(S)) dat <- data.frame(MASS::mvrnorm(n = 100, mu = mean_vec, Sigma = S, empirical = TRUE))
The same formulas can be used with
lm_mat() as with
lm(), you just have to provide
n arguments (and an optional
mean_vec argument) instead of
lm_mat() can only compute moderated regressions if the product term is already in
# Compute regression models using one predictor: lm_out1 <- lm(formula = Y ~ X, data = dat) lm_mat_out1 <- lm_mat(formula = Y ~ X, cov_mat = S, mean_vec = mean_vec, n = nrow(dat)) # Compute regression models using two predictors: lm_out2 <- lm(formula = Y ~ X + Z, data = dat) lm_mat_out2 <- lm_mat(formula = Y ~ X + Z, cov_mat = S, mean_vec = mean_vec, n = nrow(dat))
summary() method works for both
lm_mat() results and the printouts look identical.
lm_mat() results can also be used with functions such as
confint() to yield output comparable to what one would get with
anova(lm_out1, lm_out2) anova(lm_mat_out1, lm_mat_out2)
get_matrix() function can extract correlation matrices from meta-analysis objects for use in subsequent multivariate analyses.
mat_array <- get_matrix(ma_obj = ma_obj) R <- mat_array$individual_correction$`moderator_comb: 1`$true_score$mean_rho n <- mat_array$individual_correction$`moderator_comb: 1`$true_score$N[1,3]
lm_mat() function can be used to compute regressions from meta-analytic matrices, using the mean N or harmonic mean N as a sample size.
lm_x <- lm_mat(formula = Y ~ X, cov_mat = R, n = n) lm_xz <- lm_mat(formula = Y ~ X + Z, cov_mat = R, n = n)
Meta-analytic incremental validity analyses are simple with the
anova() method applied to
summary(lm_x) summary(lm_xz) anova(lm_x, lm_xz)
Rscript as an appendix or online supplement.
psychmeta, please reference our in-press software announcement in Applied Psychological Measurement [@psychmetaAPM, see the bibliography below for reference information] or run
Any scripts or data that you put into this service are public.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.