polymr | R Documentation |
This function approximates a non-linear causal effect through a polynomial regression of observational data, correcting for confounding using an instrumental variable-based approach.
polymr(
  exposure,
  outcome,
  genotypes,
  return_phenotypes_summary = TRUE,
  return_observational_function = TRUE,
  return_binned_observations = TRUE,
  bins = 100,
  starting_exposure_powers = 1:10,
  max_exposure_power = max(starting_exposure_powers),
  max_control_function_power = NULL,
  power_step = 2,
  reverse_t_thr = NULL,
  p_thr_add = 0,
  p_thr_drop = 1,
  drop_higher_control_function_powers = TRUE
)
exposure |
A vector containing the exposure values for each individual. |
outcome |
A vector containing the outcome values for each individual. |
genotypes |
The NxM genetic matrix, with a column for each variant and a row for each individual. |
return_phenotypes_summary |
Whether to return a data.table containing the median, mean, and standard deviation of both exposure and outcome (default is TRUE). |
return_observational_function |
Whether to return a polynomial approximation of the observed association between exposure and outcome (default is TRUE). |
return_binned_observations |
Whether to return a data.table containing per-bin summary information, including the median exposure and the median, mean, and standard deviation of the outcome, binned on exposure (default is TRUE). |
bins |
Number of bins for which to return mean and median values (default is 100). |
starting_exposure_powers |
A vector containing the exponents for the exposure terms in the initial model. Default is c(1:10), corresponding to a 10th degree polynomial with all lower terms present. |
max_exposure_power |
The maximum exponent to use in modeling the exposure (default is max(starting_exposure_powers)). |
max_control_function_power |
The maximum exponent to use in modeling the control function component. Default is NULL, in which case the control function polynomial will include all terms from 1 to the highest degree of the exposure component. |
power_step |
The number by which to increment the degree of the exposure polynomial each iteration until |
p_thr_add |
The p-value threshold determining if newly added exposure terms should be considered significant enough to further increase the degree of the polynomial (by |
p_thr_drop |
The p-value threshold determining which, if any, exposure terms should be dropped from the final function. This is done iteratively and the significance of each term is assessed in the new context before proceeding again, if necessary, until all remaining terms reach the defined significance threshold. Default is 1, which will retain all terms. A value of NULL will use a Bonferroni-corrected threshold at each step. |
drop_higher_control_function_powers |
Logical indicating whether control function terms with a higher degree than the highest exposure term should be dropped. Default is TRUE. Only relevant if |
reverse_t_thr |
Threshold to use for reverse causality filtering (T statistic), NULL for no filtering (default). A value of 0 represents a simple filtering out of IVs explaining more variance in the outcome than the exposure, whereas a value of 1.645 ( |
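The iterative dropping described for p_thr_drop can be pictured with a small base R sketch. This is only an illustration of backward elimination on polynomial terms under simplifying assumptions: it uses plain lm() on simulated data, a fixed threshold of 0.05, and no control function terms. The variable names (x, y, powers, p_thr) are hypothetical and are not taken from the package.

```r
# Illustrative backward elimination of polynomial terms (not PolyMR's code).
set.seed(1)
n <- 500
x <- rnorm(n)
y <- 0.5 * x + 0.3 * x^2 + rnorm(n)  # true model uses powers 1 and 2 only

powers <- 1:4   # exponents in the starting model (hypothetical choice)
p_thr <- 0.05   # fixed significance threshold (assumption for this sketch)

repeat {
  # Build y ~ I(x^p1) + I(x^p2) + ... from the exponents still in the model
  rhs <- paste0("I(x^", powers, ")")
  fit <- lm(reformulate(rhs, response = "y"))

  # P-values of the exposure terms (first row is the intercept)
  pvals <- summary(fit)$coefficients[-1, "Pr(>|t|)"]
  worst <- which.max(pvals)

  # Stop once every remaining term passes the threshold,
  # or when only a single term is left
  if (pvals[worst] <= p_thr || length(powers) == 1) break

  # Otherwise drop the least significant term and refit
  powers <- powers[-worst]
}

powers  # exponents retained in the final model
```

Each pass refits the model before testing again, so the significance of every term is always assessed in the context of the terms that remain.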
The polymr() function estimates the causal effect of the exposure on the outcome through polynomial regression, correcting for confounding by including a polynomial of the control function. Full details of the method can be found in the article (citation("PolyMR")).
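The general idea of a control function correction can be sketched in base R as follows. This is a generic two-stage illustration on simulated data, assuming a linear first stage and a quadratic causal effect; it is not the package's implementation, and all names and effect sizes are made up for the example.

```r
# Generic control function sketch (illustrative; not PolyMR's implementation).
set.seed(123)
n <- 2000  # individuals
m <- 10    # genetic variants used as instruments (hypothetical)

# Simulated allele counts, an unobserved confounder, and a quadratic causal effect
genotypes <- matrix(rbinom(n * m, size = 2, prob = 0.3), nrow = n)
confounder <- rnorm(n)
exposure <- as.vector(genotypes %*% rep(0.5, m)) + confounder + rnorm(n)
outcome <- 0.3 * exposure + 0.1 * exposure^2 - confounder + rnorm(n)

# Stage 1: regress the exposure on the genotypes; the residuals capture the
# non-genetic part of the exposure, including the confounder
first_stage <- lm(exposure ~ genotypes)
control_function <- residuals(first_stage)

# Naive observational fit, ignoring confounding
naive_fit <- lm(outcome ~ exposure + I(exposure^2))

# Stage 2: add a polynomial of the control function to absorb the confounding
cf_fit <- lm(outcome ~ exposure + I(exposure^2) +
               control_function + I(control_function^2))

# Compare the exposure coefficients (simulated truth: 0.3 and 0.1)
exposure_terms <- c("exposure", "I(exposure^2)")
round(rbind(naive = coef(naive_fit)[exposure_terms],
            control_function = coef(cf_fit)[exposure_terms]), 3)
```

In this simulation the confounder pushes the naive linear coefficient away from the simulated value, while the fit that includes the control function terms should land much closer to it.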
Returns a named list of results for PolyMR itself and the other selected values:
phenotypes_summary is a data.table with the median, mean, and standard deviation of both exposure and outcome.
binned_observations is a data.table with per-bin summary information, including the median exposure and the median, mean, and standard deviation of the outcome (binned on the exposure).
binned_observations_scaled is a data.table with per-bin summary information for the scaled exposure and outcome (which will be used for modeling), including the median exposure and the median, mean, and standard deviation of the outcome (binned on the exposure).
observational is a list-like object of class EOModel containing:
outcome_model, an object of class lm containing the full model. Use summary() for more details.
vcov, the variance-covariance matrix, which can be used to create the 95% confidence interval for plotting.
pval_null_model, the p-value for the full model (F-statistic-based).
pval_linear_model, the LRT p-value comparing the full model to the linear model.
r_squared, the variance explained by the model.
polymr is a list-like object of class PolyMRModel, the contents of which are similar to those of observational:
outcome_model, an object of class lm containing the full model. Use summary() for more details.
vcov, the variance-covariance matrix of the coefficient estimates, which can be used to create the 95% confidence interval for plotting.
pval_null_model, the LRT p-value comparing the full model to the model with (all) the control function terms but no exposure terms.
pval_linear_model, the LRT p-value comparing the full model to the linear model containing all control function terms but only the degree 1 (linear) exposure term.
r_squared, the variance of the outcome attributable to the causal effect of the exposure. This is obtained by comparing the R-squared of the full model to that of the null model (containing only the control function terms).
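As a rough illustration of how a coefficient variance-covariance matrix can be turned into an uncertainty band for plotting, the base R sketch below computes a pointwise 95% interval around a fitted quadratic curve. It assumes a plain lm() fit on simulated data and a normal approximation; the object names are hypothetical, and the structure of the objects returned by polymr() is not reproduced here.

```r
# Pointwise 95% interval for a fitted polynomial curve from coef() and vcov()
# (illustrative sketch on simulated data; not PolyMR's plotting code).
set.seed(42)
n <- 1000
x <- rnorm(n)
y <- 0.4 * x + 0.2 * x^2 + rnorm(n)

fit <- lm(y ~ x + I(x^2))
beta <- coef(fit)  # intercept, linear, and quadratic coefficients
V <- vcov(fit)     # variance-covariance matrix of the coefficient estimates

# Evaluate the curve on a grid of exposure values
grid <- seq(-2, 2, length.out = 50)
X <- cbind(1, grid, grid^2)  # design matrix matching the coefficient order

est <- as.vector(X %*% beta)
# Standard error of each fitted value: sqrt(diag(X V X')), computed row by row
se <- sqrt(rowSums((X %*% V) * X))

z <- qnorm(0.975)  # normal approximation for a two-sided 95% interval
band <- data.frame(x = grid,
                   estimate = est,
                   lower = est - z * se,
                   upper = est + z * se)
head(band)
```

The resulting data frame can be passed to any plotting function to draw the curve together with its lower and upper limits.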
Both the exposure and outcome will be standardized (centered to have mean 0 and scaled to have standard deviation 1) prior to modeling. The returned coefficients correspond to these transformed phenotypes. New data can be transformed to this scale using the values saved in phenotypes_summary.
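The transformation itself is simple arithmetic, sketched below in base R. The values are made up, and the means and standard deviations are held in plain variables because the exact column layout of phenotypes_summary is not shown here; in practice they would be read from that table.

```r
# Standardizing phenotypes and mapping values between scales
# (illustrative sketch with made-up numbers).
exposure <- c(21.5, 24.0, 27.3, 30.1, 33.8)
outcome <- c(110, 118, 125, 131, 140)

# Summary values of the kind stored alongside the model
exposure_mean <- mean(exposure)
exposure_sd <- sd(exposure)
outcome_mean <- mean(outcome)
outcome_sd <- sd(outcome)

# Standardized exposure: mean 0, standard deviation 1
exposure_scaled <- (exposure - exposure_mean) / exposure_sd

# New exposure values are placed on the same scale using the saved summaries
new_exposure <- c(25, 35)
new_exposure_scaled <- (new_exposure - exposure_mean) / exposure_sd

# A prediction on the standardized outcome scale (hypothetical value)
# can be mapped back to the original units
predicted_outcome_scaled <- 0.5
predicted_outcome <- predicted_outcome_scaled * outcome_sd + outcome_mean
```

Note that new data are scaled with the mean and standard deviation of the original sample, not with their own.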
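# Simulate a data set with the package's internal helper
simulated_data <- PolyMR:::new_PolyMRDataSim()

# Fit PolyMR with simple reverse causality filtering (reverse_t_thr = 0)
# and Bonferroni-corrected dropping of exposure terms (p_thr_drop = NULL)
polymr(exposure = simulated_data$exposure,
       outcome = simulated_data$outcome,
       genotypes = simulated_data$genotypes,
       reverse_t_thr = 0,
       p_thr_drop = NULL)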