In montilab/BS831: Boston University - Genomics Data Mining

In this simple example, we generate data from a linear model with known intercept and slope, and then apply linear regression to estimate the parameters of that data-generating model.

knitr::opts_chunk$set(message=FALSE, warning=FALSE)

set.seed(159) # for reproducible results 
nobs <- 1000   # sample size
beta0 <- 1     # true intercept
beta1 <- 0.15  # true slope
## simulate an imaginary independent variable (e.g., age between 15 and 75)
X <- sample(15:75,nobs,replace=TRUE)
Y <- rnorm(nobs,mean=beta0 + beta1 * X,sd=1)

## or, equivalently
## Y <- beta0 + beta1 * X + rnorm(nobs,mean=0,sd=1)
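The equivalence of the two formulations can be checked numerically. The following is a minimal sketch, not part of the original example: it re-seeds the generator before each draw so both versions consume the same random stream, and it uses hypothetical names (`x_chk`, `mu_chk`, `y1`, `y2`) and an arbitrary seed and sample size so that `X` and `Y` above are left untouched.

```r
## illustrative check only; all names, seeds, and sizes below are assumptions
n_chk  <- 100
set.seed(1)
x_chk  <- sample(15:75, n_chk, replace=TRUE)
mu_chk <- 1 + 0.15 * x_chk            # same intercept and slope as above

## version 1: the noise is centered on the regression line
set.seed(2)
y1 <- rnorm(n_chk, mean=mu_chk, sd=1)

## version 2: zero-mean noise added to the regression line
set.seed(2)
y2 <- mu_chk + rnorm(n_chk, mean=0, sd=1)

## compare the two vectors up to numerical tolerance
same <- isTRUE(all.equal(y1, y2))
print(same)
```

Both versions describe the same model: the response is the linear predictor plus standard normal noise.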

## png(file.path(OMPATH,"Rmodules/figures/diffanalLM.png"))
par(mar=c(5, 4, 4, 5) + 0.1)
plot(X,Y,pch=20,xlab="age",ylab="expression")
abline(beta0,beta1,col="red",lty=3,lwd=2)

## notice the use of 'expression' to display mathematical symbols
## ('==' is the plotmath equality sign; a single '=' would be parsed as a named argument)
## text(50,2,labels=expression(Y==beta[0]+beta[1]*X))
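The estimation step described in the introduction can be sketched with `lm()`. This is a minimal illustration under stated assumptions rather than the original analysis: it re-simulates the data with the same seed and parameters so that the block runs on its own, and the object names `fit` and `est` and the blue line color are choices made here.

```r
## re-simulate the data so this block is self-contained
set.seed(159)
nobs  <- 1000
beta0 <- 1
beta1 <- 0.15
X <- sample(15:75, nobs, replace=TRUE)
Y <- rnorm(nobs, mean=beta0 + beta1 * X, sd=1)

## fit Y ~ X by ordinary least squares
fit <- lm(Y ~ X)
est <- coef(fit)                      # named vector: "(Intercept)" and "X"

## compare the estimates with the true parameters
print(rbind(true=c(beta0, beta1), estimated=est))
print(confint(fit))                   # 95% confidence intervals by default

## overlay the fitted line (solid blue) on the true line (dotted red)
plot(X, Y, pch=20, xlab="age", ylab="expression")
abline(beta0, beta1, col="red", lty=3, lwd=2)
abline(fit, col="blue", lwd=2)
```

With a sample of this size, the estimated intercept and slope should land close to the true values, and the two lines should be hard to tell apart on the plot.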