Fits Weighted Quantile Sum Random Subset (WQSRS) regressions for continuous, binomial, multinomial and count outcomes.
gwqsrs(
formula,
data,
na.action,
weights,
mix_name,
stratified,
valid_var,
rs = 100,
n_vars = NULL,
b1_pos = TRUE,
b1_constr = FALSE,
zero_infl = FALSE,
q = 4,
validation = 0.6,
family = gaussian,
zilink = c("logit", "probit", "cloglog", "cauchit", "log"),
seed = NULL,
pred = 0,
plots = FALSE,
tables = FALSE,
plan_strategy = "sequential",
control = list(rho = 1, outer.iter = 400, inner.iter = 800, delta = 1e-07, tol =
1e-08, trace = 0)
)

formula 
An object of class 
data 
The 
na.action 

weights 
an optional vector of weights to be used in the fitting process.
Should be 
mix_name 
A character vector listing the variables contributing to a mixture effect. 
stratified 
The name (as a character value) of the variable to stratify by.
It has to be a 
valid_var 
A character value containing the name of the variable that identifies the validation and the training datasets. You must first create a variable in the dataset that equals 1 for the observations to include in the validation dataset, 0 for the observations to include in the training dataset (also use 0 for the validation observations if you want to train and validate the model on the same data), and 2 for the observations to set aside for the predictive model. 
rs 
Number of random subset samples used in parameter estimation. 
n_vars 
The number of mixture components to be included at each random subset step. 
b1_pos 
A logical value that determines whether weights are derived from models where the beta values were positive or negative. 
b1_constr 
A logical value that determines whether to apply positive (if 
zero_infl 
A logical value ( 
q 
An 
validation 
Percentage of the dataset to be used to validate the model. If

family 
A character value that determines the family used for the glm: 
zilink 
Character specification of the link function in the binary zero-inflation model (you can choose among "logit", "probit", "cloglog", "cauchit", "log"). 
seed 
An 
pred 
Percentage of the dataset to be used for the predictive model. If 
plots 
A logical value indicating whether plots should be generated with the output
( 
tables 
A logical value indicating whether tables should be generated in the output
( 
plan_strategy 
A character value that selects the evaluation strategy for the

control 
The control list of optimization parameters. See 
gWQS uses the glm function in the stats package to fit the linear, logistic, Poisson and quasi-Poisson regressions, while the glm.nb function from the MASS package is used to fit the negative binomial regression. The nlm function from the stats package is used to optimize the log-likelihood of the multinomial regression.
The solnp optimization function is used to estimate the weights in each random subset sample.
The seed argument specifies a fixed seed through the set.seed function.
The plots argument produces three figures (two if family = binomial or "multinomial") through the ggplot function. One more plot will be printed if pred > 0 and family = binomial.
The tables argument produces two tables in the viewer pane through the functions kable and kable_styling.
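The interplay between the seed and the random subsets can be sketched in base R. This is a minimal illustration, not the package's internal code: the variable names (mix, subsets) are hypothetical, and the default subset size is assumed to be the rounded square root of the number of mixture components, as described in the example comments.

```r
# Hypothetical illustration only; not the internals of gwqsrs.
mix <- paste0("chem_", 1:34)         # 34 placeholder mixture component names
rs <- 10                             # number of random subsets
n_vars <- round(sqrt(length(mix)))   # assumed default subset size

# Draw rs random subsets of n_vars components under a fixed seed.
set.seed(2018)
subsets <- lapply(seq_len(rs), function(i) sample(mix, n_vars))

# Resetting the same seed reproduces exactly the same subsets.
set.seed(2018)
subsets_again <- lapply(seq_len(rs), function(i) sample(mix, n_vars))

identical(subsets, subsets_again)
```

Fixing the seed in this way is what makes repeated runs select the same components at each random subset step.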
gwqsrs
returns the results of the WQSRS regression as well as many other objects and datasets.
fit 
The object that summarizes the output of the WQSRS model, reflecting a
linear, logistic, multinomial, Poisson, quasi-Poisson or negative binomial regression
depending on how the 
conv 
Indicates whether the solver has converged (0) or not (1 or 2). 
bres 
Matrix of estimated weights, mixture effect parameter estimates and the associated standard errors, statistics and p-values estimated for each bootstrap iteration. 
wqs 
Vector containing the wqs index for each subject. 
q_i 
List of the cutoffs used to divide the variables in the mixture into quantiles. 
slctd_vars 
List of vectors containing the names of the mixture components selected at each random subset step. 
tindex 
Vector containing the rows used to estimate the weights in each random subset. 
vindex 
Vector containing the rows used to estimate the parameters of the final model. 
final_weights 

y_wqs_df 

df_pred 

pindex 
Vector containing the subjects used for prediction. It is generated only when 
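As a rough illustration of how the q_i cutoffs and the wqs index relate, the sketch below scores two toy variables into quartiles and combines the scores with fixed weights. It assumes the usual weighted quantile sum definition (a weighted sum of quantile scores); the data, the weights and all object names are hypothetical, and this is not the package's own implementation.

```r
# Hypothetical toy data and weights; not the internals of gwqsrs.
set.seed(1)
x <- data.frame(a = rnorm(50), b = rnorm(50))
q <- 4

# Cutoffs per variable (analogous in spirit to q_i).
cutoffs <- lapply(x, function(v) quantile(v, probs = seq(0, 1, length.out = q + 1)))

# Quantile scores 0, 1, ..., q - 1 for each observation and variable.
scores <- mapply(
  function(v, br) as.integer(cut(v, breaks = br, include.lowest = TRUE)) - 1L,
  x, cutoffs
)

# Weights are non-negative and sum to 1; the index is the weighted sum of scores.
w <- c(a = 0.7, b = 0.3)
wqs <- as.vector(scores %*% w)
head(wqs)
```

With weights that sum to 1, the resulting index lies between 0 and q - 1 for every subject.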
Stefano Renzetti, Paul Curtin, Chris Gennings
Paul Curtin, Joshua Kellogg, Nadja Cech & Chris Gennings (2019): A random subset implementation
of weighted quantile sum (WQSRS) regression for analysis of high-dimensional mixtures,
Communications in Statistics - Simulation and Computation, DOI: 10.1080/03610918.2019.1577971.
https://doi.org/10.1080/03610918.2019.1577971
# we save the names of the mixture variables in the variable "toxic_chems"
toxic_chems = c("log_LBX074LA", "log_LBX099LA", "log_LBX105LA", "log_LBX118LA",
"log_LBX138LA", "log_LBX153LA", "log_LBX156LA", "log_LBX157LA", "log_LBX167LA",
"log_LBX170LA", "log_LBX180LA", "log_LBX187LA", "log_LBX189LA", "log_LBX194LA",
"log_LBX196LA", "log_LBX199LA", "log_LBXD01LA", "log_LBXD02LA", "log_LBXD03LA",
"log_LBXD04LA", "log_LBXD05LA", "log_LBXD07LA", "log_LBXF01LA", "log_LBXF02LA",
"log_LBXF03LA", "log_LBXF04LA", "log_LBXF05LA", "log_LBXF06LA", "log_LBXF07LA",
"log_LBXF08LA", "log_LBXF09LA", "log_LBXPCBLA", "log_LBXTCDLA", "log_LBXHXCLA")
# To run a linear model and save the results in the variable "results". This linear model
# (family = gaussian) will rank/standardize variables in quartiles (q = 4), perform a
# 40/60 split of the data for training/validation (validation = 0.6), and estimate weights
# over 10 random subset samples (rs = 10; in practical applications at least 1000 random
# subsets should be used). The number of chemicals to be included at each random subset
# step is left to the function, which automatically chooses the rounded square root of the
# toxic_chems vector's length (n_vars = NULL). Weights will be derived from mixture effect
# parameters that are positive (b1_pos = TRUE). A seed was specified (seed = 2018)
# so this model will be reproducible, and plots describing the variable weights and linear
# relationship will be generated as output (plots = TRUE). Finally, tables describing the
# weight values and the model parameters with their respective statistics are generated in
# the viewer pane (tables = TRUE)
results = gwqsrs(y ~ wqs, mix_name = toxic_chems, data = wqs_data, q = 4,
validation = 0.6, rs = 10, n_vars = NULL, b1_pos = TRUE, b1_constr = FALSE,
family = gaussian, seed = 2018, plots = TRUE, tables = TRUE)
# to test the significance of the covariates
summary(results$fit)
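
The valid_var argument expects an indicator column prepared beforehand. The following base R sketch shows one way such a column might be built on a toy data frame; the data frame toy, the column name split and the 40/40/20 proportions are all hypothetical choices for illustration.

```r
# Hypothetical toy data; in practice use your own dataset.
set.seed(2018)
n <- 200
toy <- data.frame(y = rnorm(n), x1 = rnorm(n), x2 = rnorm(n))

# 0 = training, 1 = validation, 2 = held out for the predictive model.
toy$split <- sample(c(0, 1, 2), n, replace = TRUE, prob = c(0.4, 0.4, 0.2))

# Check how many observations fall in each group.
table(toy$split)

# The column name would then be passed as a character value, e.g.
# valid_var = "split", in the call to gwqsrs.
```

Building the indicator explicitly keeps the same split across model runs, which is useful when comparing different model specifications on identical training and validation sets.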
