Description Usage Arguments Details Value Author(s) Examples
Generate a data frame with random normal values sampled according to group-specific parameters. Columns correspond to individuals belonging to different groups characterized by specific means and/or standard deviations. Rows correspond to features.
rnormPerGroup(n, mean, sd, nrow)
n: A vector indicating the number of columns per group.
mean: A vector indicating the mean per group. Must have the same length as n.
sd: A vector indicating the standard deviation per group. Must have the same length as n.
nrow: Number of rows (features) of the result data frame.
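As an illustration of how the four arguments combine, here is a minimal base-R sketch. It is not the package's implementation: the function name rnorm_per_group_sketch and its reduced return value (only the data frame and the class labels) are assumptions made for this example.

```r
## Hypothetical re-implementation for illustration only (not the package code).
rnorm_per_group_sketch <- function(n, mean, sd, nrow) {
  ## The per-group vectors must have the same length as n
  stopifnot(length(mean) == length(n), length(sd) == length(n))
  ## Expand the per-group parameters into per-column parameters
  cl <- rep(seq_along(n), times = n)
  col.mean <- rep(mean, times = n)
  col.sd <- rep(sd, times = n)
  ## matrix() fills column by column, so each parameter is repeated nrow times
  x <- matrix(rnorm(nrow * sum(n),
                    mean = rep(col.mean, each = nrow),
                    sd = rep(col.sd, each = nrow)),
              nrow = nrow)
  list(x = as.data.frame(x), cl = cl)
}

set.seed(1)
sketch <- rnorm_per_group_sketch(n = c(6, 4, 5), mean = c(-3, 0, 5),
                                 sd = c(2, 1, 4), nrow = 10)
dim(sketch$x)    ## one row per feature, one column per individual
table(sketch$cl) ## number of columns in each group
```

Drawing all values in a single rnorm() call relies on R recycling the mean and sd vectors element by element, which avoids an explicit loop over columns.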
First version: 2015-04 Last modification: 2015-04
A list with the following objects (element names are given where the examples confirm them):
x: Data frame with the random numbers.
cl: Vector with the class label of each column.
mean.per.group: Data frame with feature-wise means (rows) for each group (column).
Data frame with feature-wise standard deviations (rows) for each group (column).
exp.mean.per.col: Vector with the expected mean of each column.
mean.per.col: Vector with the observed mean of each column in the generated data.
exp.sd.per.col: Vector with the expected standard deviation of each column.
sd.per.col: Vector with the observed standard deviation of each column in the generated data.
The main element of this list is the data frame of random normal values sampled with group-specific parameters.
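The summary elements described above can be reproduced from the data and the class labels with base R. The sketch below builds its own small stand-in matrix rather than calling rnormPerGroup; all object names are chosen for this example, and the package may compute these values differently.

```r
set.seed(42)
n <- c(6, 4, 5); grp.mean <- c(-3, 0, 5); grp.sd <- c(2, 1, 4); n.row <- 1000

## Class label and expected parameters of each column
cl <- rep(seq_along(n), times = n)
exp.mean.per.col <- rep(grp.mean, times = n)
exp.sd.per.col <- rep(grp.sd, times = n)

## Stand-in data: one column per individual, one row per feature
x <- matrix(rnorm(n.row * sum(n),
                  mean = rep(exp.mean.per.col, each = n.row),
                  sd = rep(exp.sd.per.col, each = n.row)),
            nrow = n.row)

## Observed statistics per column
mean.per.col <- colMeans(x)
sd.per.col <- apply(x, 2, sd)

## Feature-wise means (rows) for each group (columns)
mean.per.group <- sapply(split(seq_along(cl), cl),
                         function(j) rowMeans(x[, j, drop = FALSE]))

## Largest gap between observed and expected column means
max(abs(mean.per.col - exp.mean.per.col))
```

With 1000 rows the observed column statistics should lie close to the expected ones, which is what the scatter plots in the examples check visually.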
Jacques van Helden (Jacques.van-Helden@univ-amu.fr)
################################################################
## Small test: generate a matrix, composed of three groups of different sizes, means and sds.
small.rnorm <- rnormPerGroup(n=c(6,4,5), mean=c(-3,0,5), sd=c(2,1,4), nrow=10)
## Check column means
plot(small.rnorm$exp.mean.per.col, small.rnorm$mean.per.col)
abline(a=0,b=1)
## Check column sd
plot(small.rnorm$exp.sd.per.col, small.rnorm$sd.per.col)
abline(a=0,b=1)
## Generate a wider matrix with two groups of 100 columns each
rnorm.result <- rnormPerGroup(n=c(100,100), mean=c(0, 0.5), sd=c(1,2), nrow=1000)
## Check the means per column
boxplot(rnorm.result$mean.per.col ~ rnorm.result$cl)
## Check the sd per column
boxplot(rnorm.result$sd.per.col ~ rnorm.result$cl, main="SD per group")
## Check means per group
boxplot(rnorm.result$mean.per.group, main="Feature-wise mean per group")
################################################################
## Run Student test on each feature to check the power.
## We chose equal sd to comply with the homoscedasticity assumption.
rnorm.result <- rnormPerGroup(n=c(50,50), mean=c(0, 0.5), sd=c(1,1), nrow=10000)
x.student <- tTestPerRow(x = rnorm.result$x, cl = rnorm.result$cl, var.equal=TRUE)
## Plot histogram of the observed differences between groups
hist(x.student$table$means.diff, breaks=100, main="Effect size distribution", xlab="Effect size")
grid(lty="solid",col="#BBBBBB")
abline(v=0.5, col="blue", lwd=2)
## Plot the histogram of p-values.
## Note: since all the data was generated under H1,
## the distribution should be mostly composed of low p-values.
hist(x.student$table$p.value, breaks=20, main="P-value distribution", xlab="p-value")
## Plot the empirical power curve 1 - beta = f(alpha).
## In this configuration, where all features are under the alternative hypothesis,
## this corresponds to a Receiver Operating Characteristic (ROC) curve
## with empirical TPR versus theoretical FPR.
plot(ecdf(x.student$table$p.value),
     xlab=expression(FPR == alpha), ylab=expression(TPR == 1-beta),
     main="Student ROC curve", col="blue")
grid()
abline(v=c(0,1))
abline(h=c(0,1))
abline(a=0,b=1, lty="dashed")
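To put the empirical curve in perspective, the fraction of features rejected at a given alpha can be compared with the theoretical power returned by power.t.test() from the stats package. The sketch below uses base R only and does not call rnormPerGroup or tTestPerRow: it simulates its own data with the same settings as the Student test example (50 columns per group, means 0 and 0.5, sd 1), with fewer rows to keep it fast. Variable names are assumptions made for this example.

```r
set.seed(123)
n.per.group <- 50; delta <- 0.5; n.features <- 2000

## One row per feature; all features are generated under H1
g1 <- matrix(rnorm(n.features * n.per.group, mean = 0, sd = 1), nrow = n.features)
g2 <- matrix(rnorm(n.features * n.per.group, mean = delta, sd = 1), nrow = n.features)

## Equal-variance Student test on each row
p.value <- sapply(seq_len(n.features), function(i) {
  t.test(g1[i, ], g2[i, ], var.equal = TRUE)$p.value
})

## Empirical power: proportion of features declared significant at alpha
alpha <- 0.05
empirical.power <- mean(p.value < alpha)

## Theoretical power of the two-sample, two-sided test with the same settings
theoretical.power <- power.t.test(n = n.per.group, delta = delta, sd = 1,
                                  sig.level = alpha)$power

c(empirical = empirical.power, theoretical = theoretical.power)
```

The two values should agree up to sampling noise; repeating the comparison over a range of alpha values traces the same curve as the ecdf plot.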