| makeScalesRegression | R Documentation |
Generates synthetic rating-scale data that replicates reported regression results. This function is useful for reproducing analyses from published research where only summary statistics (standardised regression coefficients and R-squared) are reported.
makeScalesRegression(
n,
beta_std,
r_squared,
iv_cormatrix = NULL,
iv_cor_mean = 0.3,
iv_cor_variance = 0.01,
iv_cor_range = c(-0.7, 0.7),
iv_means,
iv_sds,
dv_mean,
dv_sd,
lowerbound_iv,
upperbound_iv,
lowerbound_dv,
upperbound_dv,
items_iv = 1,
items_dv = 1,
var_names = NULL,
tolerance = 0.005
)
n |
Integer. Sample size |
beta_std |
Numeric vector of standardised regression coefficients (length k) |
r_squared |
Numeric. R-squared from regression (0 to 1) |
iv_cormatrix |
k x k correlation matrix of independent variables. If NULL (the default), a plausible matrix is found by optimisation. |
iv_cor_mean |
Numeric. Mean correlation among IVs when optimising (ignored if iv_cormatrix provided). Default = 0.3 |
iv_cor_variance |
Numeric. Variance of correlations when optimising (ignored if iv_cormatrix provided). Default = 0.01 |
iv_cor_range |
Numeric vector of length 2. Min and max constraints on correlations when optimising. Default = c(-0.7, 0.7) |
iv_means |
Numeric vector of means for IVs (length k) |
iv_sds |
Numeric vector of standard deviations for IVs (length k) |
dv_mean |
Numeric. Mean of dependent variable |
dv_sd |
Numeric. Standard deviation of dependent variable |
lowerbound_iv |
Numeric vector of lower bounds for each IV scale (or single value for all) |
upperbound_iv |
Numeric vector of upper bounds for each IV scale (or single value for all) |
lowerbound_dv |
Numeric. Lower bound for DV scale |
upperbound_dv |
Numeric. Upper bound for DV scale |
items_iv |
Integer vector of number of items per IV scale (or single value for all). Default = 1 |
items_dv |
Integer. Number of items in DV scale. Default = 1 |
var_names |
Character vector of variable names (length k+1: IVs then DV) |
tolerance |
Numeric. Acceptable deviation from target R-squared (default 0.005) |
Generate regression data from summary statistics
The function can operate in two modes:
Mode 1: With IV correlation matrix provided
When iv_cormatrix is provided, the function uses the given
correlation structure among independent variables and calculates the
implied IV-DV correlations from the regression coefficients.
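The implied IV-DV correlations follow from a standard identity for standardised regression: if Rxx is the IV correlation matrix and beta the vector of standardised coefficients, then r_xy = Rxx %*% beta and R-squared = sum(beta * r_xy). The sketch below applies that identity in base R to the inputs of Example 1. The object names (Rxx, r_xy, full_cor) are illustrative, and it is an assumption that the function computes these quantities in exactly this way.

```r
# Illustrative sketch only: standard identity, not the package's internal code
Rxx  <- matrix(c(1.0, 0.3,
                 0.3, 1.0), nrow = 2)   # IV correlation matrix
beta <- c(0.4, 0.3)                     # standardised coefficients

# Implied correlations between each IV and the DV
r_xy <- as.numeric(Rxx %*% beta)        # 0.49, 0.42

# R-squared implied by the coefficients and the IV correlations
r2_implied <- sum(beta * r_xy)          # 0.322

# Assemble the full (k+1) x (k+1) matrix: IVs first, DV last
full_cor <- rbind(cbind(Rxx, r_xy), c(r_xy, 1))
dimnames(full_cor) <- NULL
```

With these inputs the implied R-squared is about 0.322, which is not identical to the r_squared = 0.35 supplied in Example 1. How the function handles such a gap relative to tolerance is not described in this section.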
Mode 2: With optimisation (IV correlation matrix not provided)
When iv_cormatrix = NULL, the function optimises to find a plausible
correlation structure among independent variables that matches the reported
regression statistics.
Initial correlations are sampled using Fisher's z-transformation to ensure
proper distribution, then iteratively adjusted to match the target
R-squared.
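A minimal base R sketch of how starting correlations could be drawn on the Fisher z scale is shown below. It uses the default values of iv_cor_mean, iv_cor_variance and iv_cor_range from the usage above. The conversion of the variance to the z scale (a delta-method approximation), the clipping step, and all object names are assumptions made for illustration; the function's actual sampling and adjustment steps may differ.

```r
# Illustrative sketch only: not the package's internal code
set.seed(1)
k               <- 3              # number of IVs
iv_cor_mean     <- 0.3
iv_cor_variance <- 0.01
iv_cor_range    <- c(-0.7, 0.7)

n_pairs <- k * (k - 1) / 2        # number of unique IV pairs

# Move the target mean to the z scale; approximate the z-scale SD (assumption)
z_mean <- atanh(iv_cor_mean)
z_sd   <- sqrt(iv_cor_variance) / (1 - iv_cor_mean^2)

# Sample on the z scale, back-transform, then clip to the allowed range
r <- tanh(rnorm(n_pairs, mean = z_mean, sd = z_sd))
r <- pmin(pmax(r, iv_cor_range[1]), iv_cor_range[2])

# Fill a symmetric matrix with unit diagonal
iv_cor <- diag(k)
iv_cor[upper.tri(iv_cor)] <- r
iv_cor[lower.tri(iv_cor)] <- t(iv_cor)[lower.tri(iv_cor)]

# A usable correlation matrix must be positive definite
is_pd <- all(eigen(iv_cor, symmetric = TRUE)$values > 0)
```

Sampling on the z scale keeps every back-transformed value strictly inside (-1, 1), which is the usual reason for preferring it over sampling correlations directly.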
The function generates Likert-scale data (not individual items)
using lfast() for each variable with specified moments, then
correlates them using lcor().
Generated data are verified by running a regression and comparing achieved
statistics with targets.
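The kind of check described here can be reproduced by hand. The sketch below uses the built-in mtcars data as a stand-in for generated data (an assumption for illustration; it is not output of this function). It fits a regression on standardised variables, then confirms that the achieved coefficients and R-squared agree with the values implied by the correlation matrix. All object names are hypothetical.

```r
# Illustrative sketch only: stand-in data, not output of makeScalesRegression()
dat <- mtcars[, c("wt", "hp", "mpg")]   # two IVs, then the DV

# Standardise all variables so the slopes are standardised coefficients
z   <- as.data.frame(scale(dat))
fit <- lm(mpg ~ wt + hp, data = z)

beta_achieved <- unname(coef(fit)[-1])  # drop the intercept
r2_achieved   <- summary(fit)$r.squared

# The same quantities derived from the correlation matrix alone
R    <- cor(dat)
Rxx  <- R[1:2, 1:2]                     # IV-IV correlations
r_xy <- R[1:2, 3]                       # IV-DV correlations
beta_from_cor <- as.numeric(solve(Rxx, r_xy))
r2_from_cor   <- sum(beta_from_cor * r_xy)

# Compare achieved and implied values side by side
comparison <- data.frame(
  statistic = c("beta_wt", "beta_hp", "r_squared"),
  achieved  = c(beta_achieved, r2_achieved),
  implied   = c(beta_from_cor, r2_from_cor)
)
```

In practice the achieved column would be compared with the reported targets, with differences judged against tolerance.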
A list containing:
data |
Generated dataframe with k IVs and 1 DV |
target_stats |
List of target statistics provided |
achieved_stats |
List of achieved statistics from generated data |
diagnostics |
Comparison of target vs achieved |
iv_dv_cors |
Calculated correlations between IVs and DV |
full_cormatrix |
The complete (k+1) x (k+1) correlation matrix used |
optimisation_info |
If IV correlations were optimised, details about the optimisation |
lfast for generating individual rating-scale vectors with exact moments.
lcor for rearranging values to achieve target correlations.
makeCorrAlpha for generating correlation matrices from Cronbach's Alpha.
# Example 1: With provided IV correlation matrix
set.seed(123)
iv_corr <- matrix(c(1.0, 0.3, 0.3, 1.0), nrow = 2)
result1 <- makeScalesRegression(
n = 64,
beta_std = c(0.4, 0.3),
r_squared = 0.35,
iv_cormatrix = iv_corr,
iv_means = c(3.0, 3.5),
iv_sds = c(1.0, 0.9),
dv_mean = 3.8,
dv_sd = 1.1,
lowerbound_iv = 1,
upperbound_iv = 5,
lowerbound_dv = 1,
upperbound_dv = 5,
items_iv = 4,
items_dv = 4,
var_names = c("Attitude", "Intention", "Behaviour")
)
print(result1)
head(result1$data)
# Example 2: With optimisation (no IV correlation matrix)
set.seed(456)
result2 <- makeScalesRegression(
n = 128,
beta_std = c(0.3, 0.25, 0.2),
r_squared = 0.40,
iv_cormatrix = NULL, # Will be optimised
iv_cor_mean = 0.3,
iv_cor_variance = 0.02,
iv_means = c(3.0, 3.2, 2.8),
iv_sds = c(1.0, 0.9, 1.1),
dv_mean = 3.5,
dv_sd = 1.0,
lowerbound_iv = 1,
upperbound_iv = 5,
lowerbound_dv = 1,
upperbound_dv = 5,
items_iv = 4,
items_dv = 5
)
# View optimised correlation matrix
print(result2$target_stats$iv_cormatrix)
print(result2$optimisation_info)