R/makeCorrAlpha.R
In LikertMakeR: Synthesise and Correlate Likert Scale and Rating-Scale Data Based on Summary Statistics

Documented in makeCorrAlpha

#' Correlation matrix from Cronbach's Alpha
#'
#' @name makeCorrAlpha
#'
#' @description \code{makeCorrAlpha()} generates a random correlation
#'  matrix of given dimensions and predefined _Cronbach's Alpha_.
#'
#' Such a correlation matrix can be applied to the \code{makeItems()}
#' function to generate synthetic data with the predefined alpha.
#'
#' @param items (positive, int) matrix dimensions:
#'  number of rows & columns to generate
#' @param alpha (real) target Cronbach's Alpha
#'  (usually positive, must be between about -0.3 and +1)
#' @param variance (positive, real) Default = 0.5.
#'  User-provided standard deviation of values sampled from a
#'  normally-distributed log transformation.
#' @param precision (positive, real) Default = 0.
#'  User-defined value ranging from '0' to '3' to add some random variation
#'  around the target _Cronbach's Alpha_.
#'  '0' gives an exact alpha (to two decimal places)
#'
#'
#' @importFrom stats rnorm
#'
#' @return a correlation matrix
#'
#' @export makeCorrAlpha
#'
#' @note
#' Random values generated by \code{makeCorrAlpha()} are highly volatile.
#'  \code{makeCorrAlpha()} may not generate a feasible (positive-definite)
#'  correlation matrix, especially when
#'
#'  * variance is high relative to
#'      * desired Alpha, and
#'      * desired correlation dimensions
#'
#'  \code{makeCorrAlpha()} will inform the user if the resulting correlation
#'  matrix is positive definite, or not.
#'
#'  If the returned correlation matrix is not positive-definite,
#'  a feasible solution may still be possible. The user is encouraged to
#'  try again, possibly several times, to find one.
#'
#' @examples
#'
#' # define parameters
#' items <- 4
#' alpha <- 0.85
#' variance <- 0.5
#'
#' # apply function
#' set.seed(42)
#' cor_matrix <- makeCorrAlpha(items = items, alpha = alpha, variance = variance)
#'
#' # test function output
#' print(cor_matrix)
#' alpha(cor_matrix)
#' eigenvalues(cor_matrix, 1)
#'
#' # higher alpha, more items
#' cor_matrix2 <- makeCorrAlpha(items = 8, alpha = 0.95)
#'
#' # test output
#' cor_matrix2 |> round(2)
#' alpha(cor_matrix2) |> round(3)
#' eigenvalues(cor_matrix2, 1) |> round(3)
#'
#'
#' # large random variation around alpha
#' set.seed(42)
#' cor_matrix3 <- makeCorrAlpha(items = 6, alpha = 0.85, precision = 2)
#'
#' # test output
#' cor_matrix3 |> round(2)
#' alpha(cor_matrix3) |> round(3)
#' eigenvalues(cor_matrix3, 1) |> round(3)
#'
makeCorrAlpha <- function(items, alpha, variance = 0.5, precision = 0) {
  ####
  ###  Helper functions

  ###
  ### log_transform function
  ###
  ## logarithmic transformation of 'x' maps the desired correlation coefficient
  ## (ranging from -1 to +1) to a real number potentially
  ## ranging from -infinity to +infinity
  ## In practice, 'y' typically will range from -3 to +3.
  ## formula: y = log((1 + x) / (1 - x))
  log_transform <- function(x) {
    result <- log((1 + x) / (1 - x))
    # return(result)
  } ## end log_transform function

  ###
  ### exp_transform function
  ###
  ## The inverse of the logarithmic transformation maps the real number to a
  ## value ranging from -1 to +1.
  ## formula: x = (e^y - 1) / (e^y + 1)
  exp_transform <- function(y) {
    x <- (exp(y) - 1) / (exp(y) + 1)
    # return(x)
  } ## END exp_transform function


  ###
  ### make_corMatrix function
  ## create a target correlation matrix from a vector of correlation values
  make_corMatrix <- function(random_cors) {
    # Create an empty correlation matrix
    lower_matrix <- matrix(0, nrow = k, ncol = k)
    # Fill the lower and upper triangles with the random values
    lower_matrix[lower.tri(lower_matrix)] <- random_cors
    upper_matrix <- t(lower_matrix)
    # transform linear values into correlation coefficients
    cor_matrix <- lower_matrix + upper_matrix
    # Set diagonal elements to 1
    diag(cor_matrix) <- 1

    return(cor_matrix)
  } ## END make_corMatrix function

  ###
  ### improve_cor_matrix Function
  ## rearrange correlation values to find a possible positive definite matrix
  improve_cor_matrix <- function() {
    n_swap <- length(random_cors)
    swaps <- c(1:n_swap)
    swaps <- sample(swaps, n_swap, replace = FALSE)

    swap_candidates <- expand.grid(r1 = swaps, r2 = swaps)
    swap_candidates <- swap_candidates[swap_candidates[, 1] != swap_candidates[, 2], ]

    ## delete this line!
    # swap_candidates <- swap_candidates[order(nrow(swap_candidates):1),]

    current_vector <- random_cors
    n_swap_candidates <- nrow(swap_candidates)

    ## define values
    best_eigen_values <- eigen(cor_matrix)$values
    best_matrix <- cor_matrix

    ###
    ## swap values in the correlation matrix loop
    for (r in 1:n_swap_candidates) {
      ## locate data points to switch
      i <- swap_candidates[r, 1]
      j <- swap_candidates[r, 2]

      ## skip if values in two locations are same
      if (current_vector[i] == current_vector[j]) {
        next
      }

      ## record values in case they need to be put back
      ii <- current_vector[i]
      jj <- current_vector[j]

      ## swap the values in selected locations
      current_vector[i] <- jj
      current_vector[j] <- ii

      ## generate a square matrix from the correlation values vector
      temp_matrix <- make_corMatrix(current_vector)

      ## test for positive-definite correlation matrix
      eigen_values <- eigen(temp_matrix)$values

      ## keep only rearranged values that improve fit
      if (min(eigen_values) > min(best_eigen_values)) {
        best_eigen_values <- eigen_values
        best_matrix <- temp_matrix
        cat(paste0("improved at swap - ", r, "\n"))
      } else {
        ## return the values in selected locations
        current_vector[i] <- ii
        current_vector[j] <- jj
      }

      # is_positive_definite <- min(best_eigen_values) >= 0

      if (min(best_eigen_values) >= 0) {
        cat(paste0("stopped at swap - ", r, "\n"))
        break
      }
    } ## end swap values in the correlation matrix loop

    return(best_matrix)
  } ### end improve_cor_matrix Function
  ### end helper functions

  k <- items

  if (precision < 0) {
    precision <- 0
  }
  if (precision > 3) {
    precision <- 3
  }


  ## Calculate the mean correlation coefficient from alpha

  target_mean_r <- alpha / (k - alpha * (k - 1))

  ## add some random variation to the target mean correlation
  logr_sd_coefficient <- 2^(2^(4 / (precision + 1)))
  logr_sd <- 1 / logr_sd_coefficient

  log_transformed_r <- log_transform((target_mean_r) + rnorm(1, 0, logr_sd))
  mean_r <- exp_transform(log_transformed_r)

  ###  set up for correlation values search
  ## translate correlation (-1 to +1) into continuous variable (-inf to +inf)
  # log_transformed_r <- log_transform(mean_r)
  tolerance <- 1e-5

  current_mean_cors <- 0

  ## Adjust the correlation matrix
  max_iterations <- k^4 * 1000

  ## find a matrix with means close to desired values
  for (iteration in 1:max_iterations) {
    ## Generate random values with a mean of log_transformed_r
    ## and specified variance
    random_values <- rnorm(k * (k - 1) / 2, mean = log_transformed_r, sd = variance)
    ## translate back to correlation coefficients
    random_cors <- exp_transform(random_values)
    ## check if mean is consistent with desired alpha
    new_mean_cors <- mean(random_cors)
    ## update means of the log-transformed correlation values
    if ((abs(new_mean_cors - mean_r)) < (abs(current_mean_cors - mean_r))) {
      current_mean_cors <- new_mean_cors
    }
    # Check if the current mean is close to the target mean
    if (abs(current_mean_cors - mean_r) < tolerance) {
      cat(paste0(
        "correlation values consistent with desired alpha in ",
        iteration, " iterations\n"
      ))
      break
    }
  } ## END find means loop


  ##
  ## help make matrix positive-definite?
  random_cors <- sort(random_cors, decreasing = FALSE)

  ## correlaton values into correlation matrix
  cor_matrix <- make_corMatrix(random_cors)

  ## test for positive-definite correlation matrix
  eigen_values <- eigen(cor_matrix)$values
  is_positive_definite <- min(eigen_values) >= 0
  ##
  ## If not positive definite then reorder correlations
  if (is_positive_definite == FALSE) {
    cat("Correlation matrix is not yet positive definite
        \nWorking on it\n\n")
    cor_matrix <- improve_cor_matrix()
  }

  ####
  ## report positive definite status
  if (min(eigen(cor_matrix)$values) >= 0) {
    cat("The correlation matrix is positive definite\n\n")
  } else {
    cat("Correlation matrix is NOT positive definite
        \nTry running again (or reduce Variance parameter)\n")
  }

  return(cor_matrix)
} ## end matrixFromAlpha function

Any scripts or data that you put into this service are public.

LikertMakeR documentation built on June 8, 2025, 9:39 p.m.

rdrr.io home R language documentation Run R code online

CRAN packages Bioconductor packages R-Forge packages GitHub packages

Note that we can't provide technical support on individual packages. You should contact the package authors for that.

LikertMakeR
Synthesise and Correlate Likert Scale and Rating-Scale Data Based on Summary Statistics

R/makeCorrAlpha.R
In LikertMakeR: Synthesise and Correlate Likert Scale and Rating-Scale Data Based on Summary Statistics

Defines functions makeCorrAlpha

Documented in makeCorrAlpha

Try the LikertMakeR package in your browser

R Package Documentation

Browse R Packages

We want your feedback!

LikertMakeR Synthesise and Correlate Likert Scale and Rating-Scale Data Based on Summary Statistics

R/makeCorrAlpha.R In LikertMakeR: Synthesise and Correlate Likert Scale and Rating-Scale Data Based on Summary Statistics

Defines functions makeCorrAlpha

Documented in makeCorrAlpha

Try the LikertMakeR package in your browser

R Package Documentation

Browse R Packages

We want your feedback!

LikertMakeR
Synthesise and Correlate Likert Scale and Rating-Scale Data Based on Summary Statistics

R/makeCorrAlpha.R
In LikertMakeR: Synthesise and Correlate Likert Scale and Rating-Scale Data Based on Summary Statistics