This function calculates appropriate sample sizes for two-sample RNA-seq experiments for a desired power in which mean and dispersion vary among genes. Sample size calculations are performed at controlled false discovery rates, user-specified proportions of non-differentially expressed genes, mean counts in control group, dispersion, and fold change. A plot of power versus sample size is generated.

1 2 3 | ```
ssizeRNA_vary(nGenes = 10000, pi0 = 0.8, m = 200, mu, disp, fc,
up = 0.5, replace = TRUE, fdr = 0.05, power = 0.8, maxN = 35,
side = "two-sided", cex.title = 1.15, cex.legend = 1)
``` |

`nGenes` |
total number of genes, the default value is |

`pi0` |
proportion of non-differentially expressed genes,
the default value is |

`m` |
pseudo sample size for generated data. |

`mu` |
a vector (or scalar) of mean counts in control group from which to simulate. |

`disp` |
a vector (or scalar) of dispersion parameter from which to simulate. |

`fc` |
a vector (or scalar, or a function that takes an integer n and generates a vector of length n) of fold change for differentially expressed (DE) genes. |

`up` |
proportion of up-regulated genes among all DE genes,
the default value is |

`replace` |
sample with or without replacement from given parameters. See Details for more information. |

`fdr` |
the false discovery rate to be controlled. |

`power` |
the desired power to be achieved. |

`maxN` |
the maximum sample size used for power calculations. |

`side` |
options are "two-sided", "upper", or "lower". |

`cex.title` |
controls size of chart titles. |

`cex.legend` |
controls size of chart legend. |

If a vector is input for `pi0`

, sample size calculations
are performed for each proportion.

If the total number of genes is larger than length of `mu`

or
`disp`

, `replace`

always equals `TRUE`

.

`ssize` |
sample sizes (for each treatment) at which desired power is first reached. |

`power` |
power calculations with corresponding sample sizes. |

`crit.vals` |
critical value calculations with corresponding sample sizes. |

Ran Bi biran@iastate.edu, Peng Liu pliu@iastate.edu

Liu, P. and Hwang, J. T. G. (2007) Quick calculation for
sample size while controlling false discovery rate with application
to microarray analysis. *Bioinformatics* 23(6): 739-746.

Orr, M. and Liu, P. (2009) Sample size estimation while controlling
false discovery rate for microarray experiments using ssize.fdr package.
*The R Journal*, 1, 1, May 2009, 47-53.

Law, C. W., Chen, Y., Shi, W., Smyth, G. K. (2014). Voom: precision weights
unlock linear model analysis tools for RNA-seq read counts.
*Genome Biology* 15, R29.

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 | ```
library(edgeR)
library(Biobase)
data(hammer.eset)
## load hammer dataset (Hammer, P. et al., 2010)
counts <- exprs(hammer.eset)[, phenoData(hammer.eset)$Time == "2 weeks"]
counts <- counts[rowSums(counts) > 0,]
trt <- hammer.eset$protocol[which(hammer.eset$Time == "2 weeks")]
mu <- apply(counts[, trt == "control"], 1, mean)
## average read count in control group for each gene
d <- DGEList(counts)
d <- calcNormFactors(d)
d <- estimateCommonDisp(d)
d <- estimateTagwiseDisp(d)
disp <- d$tagwise.dispersion
## dispersion for each gene
## fixed fold change
fc <- 2
size <- ssizeRNA_vary(mu = mu, disp = disp, fc = fc,
m = 30, maxN = 15, replace = FALSE)
size$ssize ## first sample size to reach desired power
size$power ## calculated power for each sample size
size$crit.vals ## calculated critical value for each sample size
## varying fold change
## fc1 <- function(x){exp(rnorm(x, log(2), 0.5*log(2)))}
## size1 <- ssizeRNA_vary(pi0 = 0.8, mu = mu, disp = disp, fc = fc1,
## m = 30, maxN = 20, replace = FALSE)
``` |

Questions? Problems? Suggestions? Tweet to @rdrrHQ or email at ian@mutexlabs.com.

Please suggest features or report bugs with the GitHub issue tracker.

All documentation is copyright its authors; we didn't write any of that.