create_nlmr_summary: Creation of summarised Mendelian randomisation local...

View source: R/mr_summarise.R

create_nlmr_summary R Documentation

Creation of summarised Mendelian randomisation local estimates

Description

create_nlmr_summary takes individual-level data and creates a summarised dataset, ready to save and share for summarised nlmr.

Usage

create_nlmr_summary(
  y,
  x,
  g,
  covar = NULL,
  family = "gaussian",
  controlsonly = FALSE,
  q,
  prestrat = 1,
  strata_method = "ranked",
  strata_bound = c(0.2, 0.1, 0.8, 0.9),
  extra_statistics = FALSE,
  report_GR = FALSE,
  report_het = FALSE,
  seed = 1234
)

Arguments

y

vector of outcome values.

x

vector of exposure values.

g

the instrumental variable.

covar

a matrix of covariates.

family

a description of the error distribution and link function to be used in the model. This is a character string naming either the gaussian (i.e. "gaussian" for continuous outcome data) or binomial (i.e. "binomial" for binary outcome data) family function. "Coxph" can be used to fit survival data; in this case y must be a Surv object.
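As a minimal sketch of what "y must be a Surv object" means in practice, the outcome can be built with Surv() from the survival package. The follow-up times and event indicators below are made-up illustration values, and the sketch does not call create_nlmr_summary itself.

```r
# Illustrative values only: follow-up time and event indicator (1 = event, 0 = censored).
time  <- c(5, 8, 12, 3)
event <- c(1, 0, 1, 1)

# Guarded so the sketch still runs where the survival package is not installed.
if (requireNamespace("survival", quietly = TRUE)) {
  y <- survival::Surv(time, event)  # outcome object for survival models
  print(class(y))
}
```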

controlsonly

whether to estimate the instrument-exposure (g-x) association in all people, or in controls only. This is set to FALSE by default. It has no effect if family is set to "gaussian".
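The difference between the two settings can be sketched in base R as fitting the instrument-exposure regression on everyone versus on controls only. This uses simulated data and plain lm(); it is an illustration of the idea under those assumptions, not the package's internal code.

```r
set.seed(7)
n <- 500
g <- rbinom(n, 2, 0.3)                     # simulated instrument (0/1/2 allele count)
x <- 0.5 * g + rnorm(n)                    # simulated exposure
y <- rbinom(n, 1, plogis(-1 + 0.3 * x))    # simulated binary outcome (1 = case)

fit_all      <- lm(x ~ g)                   # association estimated in all people
fit_controls <- lm(x ~ g, subset = y == 0)  # association estimated in controls only

coef(fit_all)["g"]
coef(fit_controls)["g"]
```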

q

the number of quantiles the exposure distribution is to be split into. Within each quantile a causal effect will be fitted, known as a localised average causal effect (LACE). The default is deciles (i.e. 10 quantiles).

prestrat

the proportional size of pre-strata in the doubly-ranked method. If prestrat = 1 (default), then pre-strata will contain the number of individuals equal to the number of strata, and 1 individual from each pre-stratum is selected into each stratum. If prestrat = 10, then pre-strata contain 10 times the number of individuals as the number of strata, and 10 individuals from each pre-stratum are selected into each stratum. Larger pre-strata can improve the differentiation between pre-strata, although if pre-strata are too large such that the instrument values vary strongly within pre-strata, then the benefit of the doubly-ranked method is lost.
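The pre-strata description above can be sketched in base R as follows. The helper doubly_ranked_strata is hypothetical and written only from the description here: ties are broken at random (an assumption), and the seed handling mirrors the documented behaviour of the seed argument. It is not the SUMnlmr implementation.

```r
# Hypothetical helper illustrating the doubly-ranked idea.
doubly_ranked_strata <- function(x, g, q, prestrat = 1, seed = 1234) {
  if (!is.na(seed)) set.seed(seed)   # skip seeding when seed is NA
  # Step 1: rank by instrument and cut into pre-strata of q * prestrat people.
  g_rank <- rank(g, ties.method = "random")
  pre <- ceiling(g_rank / (q * prestrat))
  # Step 2: rank by exposure within each pre-stratum.
  x_rank <- ave(x, pre, FUN = function(v) rank(v, ties.method = "random"))
  # Step 3: the lowest `prestrat` people in each pre-stratum go to stratum 1, and so on.
  ceiling(x_rank / prestrat)
}

set.seed(1)
n <- 1000
g <- rbinom(n, 2, 0.3)
x <- 0.5 * g + rnorm(n)

strata_1  <- doubly_ranked_strata(x, g, q = 10, prestrat = 1)
strata_10 <- doubly_ranked_strata(x, g, q = 10, prestrat = 10)
table(strata_1)
```

With n divisible by q * prestrat, every stratum receives the same number of individuals; otherwise the last pre-stratum is smaller and the strata sizes differ slightly.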

strata_method

what method to use for determining strata. By default this is set to "ranked", which uses Haodong Tian's doubly-ranked method to calculate strata. The alternative is "residual", which determines the strata from the residual of the exposure regressed on the instrument (as in the Staley and Burgess paper). The residual method relies on a constant relationship between the instrument and the exposure across the range of the exposure.
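A minimal sketch of the residual approach, assuming a simple linear regression of the exposure on the instrument with no covariates: the residuals are cut into q equal-sized groups. The helper residual_strata is hypothetical and is not taken from the package.

```r
# Hypothetical helper: strata from residuals of exposure regressed on instrument.
residual_strata <- function(x, g, q) {
  res <- residuals(lm(x ~ g))                                   # instrument-free exposure
  breaks <- quantile(res, probs = seq(0, 1, length.out = q + 1))
  cut(res, breaks = breaks, include.lowest = TRUE, labels = FALSE)
}

set.seed(2)
n <- 1000
g <- rbinom(n, 2, 0.3)
x <- 0.5 * g + rnorm(n)

strata_res <- residual_strata(x, g, q = 10)
table(strata_res)
```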

strata_bound

controls the range used for the LACE estimates when they are displayed in graphs. The default is restricted: it takes the 10th and 90th percentiles for internal strata, the 20th percentile for the bottom of the lowest stratum and the 80th percentile for the top of the highest stratum. It is a vector of four percentiles, in order: the lower bound of the bottom stratum, the lower bound of the other strata, the upper bound of the top stratum, and the upper bound of the other strata. This only affects the "max" and "min" values in the summary table. It can be overridden in piecewise_summ_mr by using the xbreaks argument to hard-set different breakpoints, or by replacing the default with c(0,0,1,1) to return to the true maximum and minimum.
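One reading of the four-element vector can be sketched in base R. The helper stratum_range is hypothetical and follows only the ordering described here (lower bound for the bottom stratum, lower bound for other strata, upper bound for the top stratum, upper bound for other strata); the package may compute these values differently.

```r
# Hypothetical helper: reported "min" and "max" per stratum under strata_bound.
stratum_range <- function(x, strata, strata_bound = c(0.2, 0.1, 0.8, 0.9)) {
  ids <- sort(unique(strata))
  out <- t(sapply(ids, function(s) {
    xs <- x[strata == s]
    lower_p <- if (s == min(ids)) strata_bound[1] else strata_bound[2]
    upper_p <- if (s == max(ids)) strata_bound[3] else strata_bound[4]
    c(min = unname(quantile(xs, lower_p)), max = unname(quantile(xs, upper_p)))
  }))
  data.frame(stratum = ids, out)
}

x <- 1:100
strata <- rep(1:4, each = 25)

restricted <- stratum_range(x, strata)                  # default percentiles
full_range <- stratum_range(x, strata, c(0, 0, 1, 1))   # true minimum and maximum
restricted
full_range
```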

extra_statistics

This will add a second output reporting extra statistics for each stratum. These include the true max and min of each stratum (regardless of the strata_bound setting) and the F statistic and p-value for the regressions.

report_GR

This will add the Gelman-Rubin statistics for each stratum to the output. Note this only works if strata_method = "ranked".

report_het

This will add p-values for assessing the heterogeneity of the instrument - exposure relationship. The first column is the p-value of the Cochran Q heterogeneity test (Q); the second column is the p-value from the trend test (trend).
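For orientation, a Cochran Q p-value of the kind described above can be sketched with the standard inverse-variance form. The helper cochran_q_p and its inputs (per-stratum instrument-exposure estimates and standard errors) are assumptions for illustration; this is not confirmed to match the package's calculation, and the trend test is not covered.

```r
# Hypothetical helper: Cochran Q heterogeneity p-value across strata.
cochran_q_p <- function(beta, se) {
  w <- 1 / se^2                              # inverse-variance weights
  beta_bar <- sum(w * beta) / sum(w)         # pooled estimate
  q_stat <- sum(w * (beta - beta_bar)^2)     # Cochran Q statistic
  pchisq(q_stat, df = length(beta) - 1, lower.tail = FALSE)
}

# Identical estimates give no evidence of heterogeneity.
p_same <- cochran_q_p(beta = c(0.5, 0.5, 0.5), se = c(0.1, 0.1, 0.1))
# Clearly different estimates give a small p-value.
p_diff <- cochran_q_p(beta = c(0, 1), se = c(0.1, 0.1))
```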

seed

The random seed to use when generating the quantiles (for reproducibility). If set to NA, the random seed will not be set.

Value

model: the model specifications. The first column is the number of quantiles (q); the second column is the position used to relate x to the LACE in each quantile (xpos); the third column is the type of confidence interval constructed (ci); the fourth column is the number of bootstrap replications performed (nboot).
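A minimal call might look like the sketch below, using simulated data and only the arguments listed under Usage. The data-generating model is invented for illustration, and the call is guarded so that the sketch still runs where SUMnlmr is not installed; the structure of the returned object is inspected with str() rather than assumed.

```r
set.seed(42)
n <- 2000
g <- rbinom(n, 2, 0.3)                     # simulated instrument
u <- rnorm(n)                              # simulated unmeasured confounder
x <- 0.4 * g + u + rnorm(n)                # simulated exposure
y <- 0.2 * x + 0.1 * x^2 + u + rnorm(n)    # simulated continuous outcome

if (requireNamespace("SUMnlmr", quietly = TRUE)) {
  summ <- SUMnlmr::create_nlmr_summary(
    y = y, x = x, g = g,
    covar = NULL,
    family = "gaussian",
    q = 10,
    strata_method = "ranked"
  )
  str(summ)
}
```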

Author(s)

Amy Mason, leaning heavily on work by James Staley and Matt Arnold


amymariemason/SUMnlmr documentation built on July 22, 2024, 10:03 a.m.