cor_data: Correlation and Data Generation

View source: R/cor_data.R

cor_data R Documentation

Correlation and Data Generation

Description

Generates a data set based on x and y for a given target correlation r, computed according to stats::cor(). The algorithm only modifies the order of the y values, so the marginal distributions of x and y are guaranteed to remain unchanged. Note that the final correlation is not guaranteed to equal the desired correlation; the algorithm modifies the order iteratively. If you are unsatisfied with the result, increasing maxit might help.
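The reordering idea can be illustrated with a minimal base R sketch. This is not the implementation used by cor_data(); the helper swap_towards_r() is hypothetical and uses a simple greedy strategy (swap two randomly chosen y values and keep the swap only if the correlation moves closer to r). It shows why the marginal distributions stay intact and why the target correlation may only be approximated.

```r
# Hypothetical helper, NOT part of exams.forge: greedy random pair swaps in y.
swap_towards_r <- function(x, y, r, maxit = 1000) {
  best <- abs(cor(x, y) - r)           # current distance to the target
  for (it in seq_len(maxit)) {
    idx <- sample(length(y), 2)        # pick two distinct positions
    cand <- y
    cand[idx] <- cand[rev(idx)]        # swap the two y values
    d <- abs(cor(x, cand) - r)
    if (d < best) {                    # keep the swap only if it gets closer to r
      y <- cand
      best <- d
    }
  }
  cbind(x = x, y = y)
}

set.seed(1)
x <- runif(6)
y <- runif(6)
xy <- swap_towards_r(x, y, r = 0.6)
cor(xy[, "x"], xy[, "y"])            # near 0.6, but not necessarily equal
all(sort(xy[, "y"]) == sort(y))      # the y values are only reordered
```

With only six observations there are just 720 possible orderings of y, so the set of reachable correlations is finite and the target can usually only be approximated.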

Usage

cor_data(
  x,
  y,
  r,
  method = c("pearson", "kendall", "spearman"),
  ...,
  maxit = 1000
)

dcorr(x, y, r, method = c("pearson", "kendall", "spearman"), ..., maxit = 1000)

Arguments

x

numeric: given x values

y

numeric: given y values

r

numeric: desired correlation

method

character: indicates which correlation coefficient is to be computed (default: "pearson")

...

further parameters given to stats::cor()

maxit

numeric: maximal number of iterations (default: 1000)

Value

A matrix with two columns and an attribute interim that holds the intermediate values as a matrix. The rows of the interim matrix contain:

  • if method=="pearson": x_i, y_i, x_i-\bar{x}, y_i-\bar{y}, (x_i-\bar{x})^2, (y_i-\bar{y})^2, and (x_i-\bar{x})(y_i-\bar{y}).

  • if method=="kendall":

    • x_i: The original x values.

    • y_i: The original y values.

    • p_i: The number of concordant pairs.

    • q_i: The number of discordant pairs.

  • if method=="spearman": x_i, y_i, p_i (concordant pairs), and q_i (discordant pairs). In a final step, a vector with the row sums is appended as a further column.
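For the Pearson case, the listed intermediate quantities can be reproduced by hand in base R. The following sketch uses made-up data and illustrative names (dx, dy, tab, sums); it stores the quantities in columns, so the layout may differ from the actual interim attribute. It shows how the sums of these quantities yield the Pearson correlation.

```r
# Made-up example data
x <- c(1, 2, 4, 7)
y <- c(2, 1, 5, 6)

dx <- x - mean(x)                      # x_i - \bar{x}
dy <- y - mean(y)                      # y_i - \bar{y}

# One column per intermediate quantity
tab <- cbind(x, y, dx, dy, dx2 = dx^2, dy2 = dy^2, dxdy = dx * dy)
sums <- colSums(tab)                   # sums over all observations

# Pearson correlation from the summed squares and cross products
r <- sums["dxdy"] / sqrt(sums["dx2"] * sums["dy2"])
all.equal(unname(r), cor(x, y))
```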

Examples

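library("exams.forge")
x <- runif(6)
y <- runif(6)
# reorder y so that the correlation with x approaches 0.6
xy <- cor_data(x, y, r=0.6)
# original x and y next to the returned two-column matrix
cbind(x, y, xy)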

exams.forge documentation built on Sept. 11, 2024, 5:32 p.m.