# lmExact: Create random values that deliver linear regressions with... In anspiess/reverseR: Linear Regression Stability to Significance Reversal

## Description

Takes user-supplied or generated random values and transforms them so as to deliver linear regressions y = β_0 + β_1 x + ε (with potential replicates) with either

1) exact slope β_1 and intercept β_0,
2) exact p-value and intercept β_0, or
3) exact R^2 and intercept β_0.

Intended for testing and education, not for cheating! ;-)

## Usage

    lmExact(x = 1:20, ny = 1, intercept = 0, slope = 0.1, error = 0.1,
            seed = 123, pval = NULL, rsq = NULL, plot = TRUE, verbose = FALSE, ...)

## Arguments

| Argument | Description |
|---|---|
| x | the predictor values. |
| ny | the number of replicate response values per predictor value. |
| intercept | the desired intercept β_0. |
| slope | the desired slope β_1. |
| error | if a single value, the standard deviation σ for sampling from a normal distribution, or a user-supplied vector of random deviates with the length of x. |
| seed | the random generator seed for reproducibility. |
| pval | the desired p-value of the slope. |
| rsq | the desired R^2. |
| plot | logical. If TRUE, the linear regression is plotted. |
| verbose | logical. If TRUE, a summary is printed to the console. |
| ... | other arguments to lm or plot. |

## Details

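For case 1), the error values are added to the exact points (x_i, β_0 + β_1 x_i), the linear model y_i = β_0 + β_1 x_i + ε_i is fitted, and the residuals y_i - ŷ_i are added back to (x_i, β_0 + β_1 x_i). Because least-squares residuals are orthogonal to both the intercept and x, refitting the resulting values returns exactly β_0 and β_1.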
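
The residual re-adding step of case 1) can be illustrated in base R. This is a minimal sketch under the assumption that plain `lm` and `rnorm` are used as described; it is not the package source, and the variable names (`y_exact`, `y_noisy`, `fit0`) are chosen here for illustration only.

```r
# Sketch of case 1: exact intercept and slope via re-added residuals.
set.seed(123)
x     <- 1:20
b0    <- 3      # desired intercept
b1    <- 2      # desired slope
sigma <- 2      # standard deviation of the sampled noise

y_exact <- b0 + b1 * x                                  # points on the exact line
y_noisy <- y_exact + rnorm(length(x), mean = 0, sd = sigma)

fit0 <- lm(y_noisy ~ x)                                 # first fit: coefficients are only close
y    <- y_exact + residuals(fit0)                       # re-add residuals to the exact line

fit <- lm(y ~ x)                                        # refit: coefficients equal b0 and b1
coef(fit)
```

The refit recovers `b0` and `b1` up to floating-point error, while the scatter around the line is that of the first fit's residuals.
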
For case 2), the same procedure as in 1) is applied, but the slope that delivers the desired p-value is found by an optimization algorithm.
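
The slope search of case 2) might look roughly as follows. This page does not say which optimizer the package uses, so the sketch substitutes base R's `uniroot` as a stand-in and searches positive slopes only; the helper `slope_pvalue` and the search interval are assumptions made for illustration.

```r
# Sketch of case 2: find the slope that yields a target p-value.
# uniroot() is a stand-in here; the package's own optimizer may differ.
set.seed(123)
x        <- 1:50
b0       <- 2       # desired intercept
target_p <- 0.025   # desired p-value of the slope
sigma    <- 3

e <- rnorm(length(x), mean = 0, sd = sigma)
r <- residuals(lm(e ~ x))        # noise made orthogonal to intercept and x

# p-value of the slope when the exact line with slope b1 carries residuals r
slope_pvalue <- function(b1) {
  y <- b0 + b1 * x + r
  summary(lm(y ~ x))$coefficients["x", "Pr(>|t|)"]
}

# The p-value falls monotonically as the (positive) slope grows,
# so a one-dimensional root search is sufficient.
opt <- uniroot(function(b1) slope_pvalue(b1) - target_p,
               interval = c(1e-6, 10), tol = 1e-12)

b1  <- opt$root
y   <- b0 + b1 * x + r
fit <- lm(y ~ x)
summary(fit)$coefficients
```

Since the residuals stay fixed during the search, the intercept of the final fit remains exactly `b0` while the slope is whatever value the search returns.
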
Finally, for case 3), a QR reconstruction, rescaling and refitting are carried out, using the code found under 'References'.
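
For a single predictor, R^2 equals the squared correlation between x and y, so case 3) reduces to constructing a response with correlation sqrt(R^2) to x. The sketch below follows the general QR-based idea from the linked discussion, without replicates; how the package rescales the result is not described on this page, so the scale of `y` here is arbitrary, and the final intercept shift is an assumption about what "refitting" achieves.

```r
# Sketch of case 3: response with an exact R^2 and intercept (no replicates).
set.seed(123)
x     <- 1:10
rsq   <- 0.85    # desired R^2
b0    <- 1       # desired intercept
sigma <- 2       # used only to draw the starting noise

theta <- acos(sqrt(rsq))                       # angle matching the target correlation
e     <- rnorm(length(x), mean = 0, sd = sigma)

X <- scale(cbind(x, e), center = TRUE, scale = FALSE)   # centre both columns
Q <- qr.Q(qr(X[, 1, drop = FALSE]))                     # orthonormal basis for centred x
e_orth <- X[, 2] - Q %*% crossprod(Q, X[, 2])           # part of e orthogonal to x

u1 <- X[, 1] / sqrt(sum(X[, 1]^2))             # unit vector along centred x
u2 <- e_orth / sqrt(sum(e_orth^2))             # unit vector orthogonal to it
z  <- as.numeric(u2 + u1 / tan(theta))         # cor(x, z) equals sqrt(rsq)

# Shifting the response changes only the intercept, not R^2.
fit0 <- lm(z ~ x)
y    <- z - coef(fit0)[["(Intercept)"]] + b0

fit <- lm(y ~ x)
c(r.squared = summary(fit)$r.squared, intercept = coef(fit)[["(Intercept)"]])
```

Multiplying `z` by any positive constant before the shift would change the slope but leave R^2 untouched, which is presumably where a rescaling step would enter.
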

## Value

A list with the following items:

| Item | Description |
|---|---|
| lm | the linear model of class lm. |
| x | the predictor values. |
| y | the (random) response values. |
| summary | the model summary for quick checking of obtained parameters. |

Using both x and y will give a linear regression with the desired parameter values when refitted.

## Author(s)

Andrej-Nikolai Spiess

## References

For method 3):
http://stats.stackexchange.com/questions/15011/generate-a-random-variable-with-a-defined-correlation-to-an-existing-variable.

## Examples

    library(reverseR)

    ## No replicates, intercept = 3, slope = 2, sigma = 2, n = 20.
    res1 <- lmExact(x = 1:20, ny = 1, intercept = 3, slope = 2, error = 2)

    ## Same as above, but with 3 replicates, sigma = 1, n = 20.
    res2 <- lmExact(x = 1:20, ny = 3, intercept = 3, slope = 2, error = 1)

    ## No replicates, intercept = 2 and p-value = 0.025, sigma = 3, n = 50.
    ## => slope = 0.063
    res3 <- lmExact(x = 1:50, ny = 1, intercept = 2, pval = 0.025, error = 3)

    ## 5 replicates, intercept = 1, R-square = 0.85, sigma = 2, n = 10.
    ## => slope = 0.117
    res4 <- lmExact(x = 1:10, ny = 5, intercept = 1, rsq = 0.85, error = 2)

    ## Heteroscedastic (magnitude-dependent) noise:
    ## 3 deviates per predictor value, with standard deviation x/10.
    error <- sapply(1:20, function(x) rnorm(3, 0, x/10))
    res5 <- lmExact(x = 1:20, ny = 3, intercept = 1, slope = 0.2, error = error)

anspiess/reverseR documentation built on June 23, 2018, 2:22 a.m.