| D_regularized | R Documentation |
Multivariate group difference estimation with regularized binomial regression
D_regularized(
data,
mv.vars,
group.var,
group.values,
alpha = 0.5,
nfolds = 10,
s = "lambda.min",
type.measure = "deviance",
rename.output = TRUE,
out = FALSE,
size = NULL,
fold = FALSE,
fold.var = NULL,
pcc = FALSE,
auc = FALSE,
pred.prob = FALSE,
prob.cutoffs = seq(0, 1, 0.2),
append.data = FALSE
)
data |
A data frame or list containing two data frames (regularization and estimation data, in that order). |
mv.vars |
Character vector. Variable names in the multivariate variable set. |
group.var |
The name of the group variable. |
group.values |
Vector of length 2, group values (e.g. c("male", "female") or c(0,1)). |
alpha |
Alpha value for the penalty function, ranging from 0 to 1: 0 = ridge regression, 1 = lasso, values in between = elastic net (default 0.5). |
nfolds |
Number of folds used for obtaining lambda (range from 3 to n-1, default 10). |
s |
Which lambda value is used for predicted values? Either "lambda.min" (default) or "lambda.1se". |
type.measure |
Which measure is used during cross-validation. Default "deviance". |
rename.output |
Logical. Should the output values be renamed according to the group.values? Default TRUE. |
out |
Logical. Should results and predictions be calculated on out-of-bag data set? (Default FALSE) |
size |
Integer. Number of cases per group in the regularization data. Default 1/4 of cases. |
fold |
Logical. Is regularization applied across sample folds with separate predictions for each fold? (Default FALSE, see details) |
fold.var |
Character string. Name of the fold variable. (default NULL) |
pcc |
Logical. Include probabilities of correct classification? Default FALSE. |
auc |
Logical. Include area under the receiver operating characteristic curve (AUC)? Default FALSE. |
pred.prob |
Logical. Include table of predicted probabilities? Default FALSE. |
prob.cutoffs |
Vector. Cutoffs for table of predicted probabilities. Default seq(0,1,0.20). |
append.data |
Logical. If TRUE, the data is appended to the predicted variables. |
fold = TRUE applies manually defined data folds (supplied with fold.var) for regularization
and obtains estimates for each fold separately. This can be a good solution, for example, when the data are clustered
within countries. In such a case, the cross-validation procedure is applied across countries.
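As a rough illustration of cross-validating across predefined folds, the sketch below maps a fold variable to the integer fold ids that cv.glmnet accepts through its foldid argument. Passing the folds this way is an assumption made for illustration, not a description of the multid internals, and the data and object names are hypothetical.

```r
# Hypothetical data: three predictors, a binary group, and a country-like fold label
set.seed(1)
n <- 600
d <- data.frame(
  sex = sample(c("male", "female"), n, replace = TRUE),
  fold = sample(LETTERS[1:6], n, replace = TRUE),
  x1 = rnorm(n),
  x2 = rnorm(n),
  x3 = rnorm(n)
)

# cv.glmnet expects one integer fold id (1..K) per row
foldid <- as.integer(factor(d$fold))
n_folds <- length(unique(foldid))

# Check that every fold contains cases from both groups
fold_by_group <- table(d$fold, d$sex)
print(fold_by_group)

# Assumption: the folds are handed to cv.glmnet via `foldid` (needs glmnet installed)
if (requireNamespace("glmnet", quietly = TRUE)) {
  x <- as.matrix(d[, c("x1", "x2", "x3")])
  y <- factor(d$sex, levels = c("female", "male"))
  cv <- glmnet::cv.glmnet(
    x, y,
    family = "binomial", alpha = 0.5,
    foldid = foldid, type.measure = "deviance"
  )
  # Predicted log-odds at the lambda that minimizes cross-validated deviance
  pred <- predict(cv, newx = x, s = "lambda.min")
}
```

With folds defined this way, each held-out set in the cross-validation is one whole cluster rather than a random subset of rows.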
out = TRUE uses separate data partitions for regularization and estimation. That is, the
cross-validation procedure is first applied within the regularization set, and the weights obtained are
then used in the estimation data partition. The size of the regularization set is defined with size.
When used with fold = TRUE, size means size within a fold.
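The partitioning idea can be sketched in base R as follows. This is a minimal illustration that assumes simple random sampling of size cases within each group; the data, object names, and sampling scheme are hypothetical and need not match what the function does internally.

```r
# Hypothetical two-group data
set.seed(2)
d <- data.frame(
  group = rep(c("a", "b"), each = 50),
  x1 = rnorm(100),
  x2 = rnorm(100)
)
size <- 15  # cases per group in the regularization set

# Draw `size` row indices within each group
reg_rows <- unlist(lapply(
  split(seq_len(nrow(d)), d$group),
  function(idx) sample(idx, size)
))

reg_data <- d[reg_rows, ]   # used to tune and fit the penalized model
est_data <- d[-reg_rows, ]  # used only for prediction and estimation

print(table(reg_data$group))
print(table(est_data$group))
```

According to the description of the data argument, two such data frames can also be supplied directly as a list, regularization data first and estimation data second.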
For more details on these options, please refer to the vignette and README of the multid package.
D |
Multivariate descriptive statistics and differences. |
pred.dat |
A data.frame with predicted values. |
cv.mod |
Regularized regression model from cv.glmnet. |
P.table |
Table of predicted probabilities by cutoffs. |
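To give a sense of what a table of predicted probabilities by cutoffs can look like, the sketch below bins hypothetical predicted probabilities with the default prob.cutoffs and tabulates them by group. The simulated probabilities and the row-proportion layout are assumptions for illustration; the exact layout of P.table may differ.

```r
# Hypothetical predicted probabilities for two groups
set.seed(3)
pred <- data.frame(
  group = rep(c("female", "male"), each = 100),
  p = c(rbeta(100, 2, 3), rbeta(100, 3, 2))
)
prob.cutoffs <- seq(0, 1, 0.2)

# Bin the probabilities into the intervals defined by the cutoffs
pred$bin <- cut(pred$p, breaks = prob.cutoffs, include.lowest = TRUE)

# Row-wise proportions: share of each group falling in each interval
p_table <- prop.table(table(pred$group, pred$bin), margin = 1)
print(round(p_table, 2))
```

Well-separated groups concentrate in opposite ends of the table, while overlapping groups pile up in the middle intervals.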
Lönnqvist, J. E., & Ilmarinen, V. J. (2021). Using a continuous measure of genderedness to assess sex differences in the attitudes of the political elite. Political Behavior, 43, 1779–1800. https://doi.org/10.1007/s11109-021-09681-2
Ilmarinen, V. J., Vainikainen, M. P., & Lönnqvist, J. E. (2023). Is there a g-factor of genderedness? Using a continuous measure of genderedness to assess sex differences in personality, values, cognitive ability, school grades, and educational track. European Journal of Personality, 37, 313–337. https://doi.org/10.1177/08902070221088155
cv.glmnet
D_regularized(
data = iris[iris$Species == "setosa" | iris$Species == "versicolor", ],
mv.vars = c("Sepal.Length", "Sepal.Width", "Petal.Length", "Petal.Width"),
group.var = "Species", group.values = c("setosa", "versicolor")
)$D
# out-of-bag predictions
D_regularized(
data = iris[iris$Species == "setosa" | iris$Species == "versicolor", ],
mv.vars = c("Sepal.Length", "Sepal.Width", "Petal.Length", "Petal.Width"),
group.var = "Species", group.values = c("setosa", "versicolor"),
out = TRUE, size = 15, pcc = TRUE, auc = TRUE
)$D
# separate sample folds
# generate data for 10 groups
set.seed(34246)
n1 <- 100
n2 <- 10
d <-
data.frame(
sex = sample(c("male", "female"), n1 * n2, replace = TRUE),
fold = sample(x = LETTERS[1:n2], size = n1 * n2, replace = TRUE),
x1 = rnorm(n1 * n2),
x2 = rnorm(n1 * n2),
x3 = rnorm(n1 * n2)
)
# Fit and predict with same data
D_regularized(
data = d,
mv.vars = c("x1", "x2", "x3"),
group.var = "sex",
group.values = c("female", "male"),
fold.var = "fold",
fold = TRUE,
rename.output = TRUE
)$D
# Out-of-bag data for each fold
D_regularized(
data = d,
mv.vars = c("x1", "x2", "x3"),
group.var = "sex",
group.values = c("female", "male"),
fold.var = "fold",
size = 17,
out = TRUE,
fold = TRUE,
rename.output = TRUE
)$D
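The examples above extract only the D element. As a hedged sketch, the other documented return elements (pred.dat, cv.mod, P.table) can be inspected in the same way; the object name res and the choice of cutoffs are illustrative, and the call is skipped if the multid package is not installed.

```r
# Inspect all documented return elements (runs only if multid is available)
if (requireNamespace("multid", quietly = TRUE)) {
  res <- multid::D_regularized(
    data = iris[iris$Species == "setosa" | iris$Species == "versicolor", ],
    mv.vars = c("Sepal.Length", "Sepal.Width", "Petal.Length", "Petal.Width"),
    group.var = "Species",
    group.values = c("setosa", "versicolor"),
    pred.prob = TRUE,               # request the table of predicted probabilities
    prob.cutoffs = seq(0, 1, 0.25)  # illustrative cutoffs
  )
  print(res$D)               # multivariate descriptives and differences
  print(head(res$pred.dat))  # predicted values per case
  print(res$P.table)         # predicted probabilities by cutoffs
  print(class(res$cv.mod))   # fitted model object from cv.glmnet
}
```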