
A function to estimate the sample size based on the read count and dispersion distributions in real data.

```
sample_size_distribution(
  power = 0.8,
  m = 10000,
  m1 = 100,
  f = 0.1,
  k = 1,
  w = 1,
  rho = 2,
  showMessage = FALSE,
  storeProcess = FALSE,
  distributionObject,
  libSize,
  minAveCount = 5,
  maxAveCount = 2000,
  repNumber = 100,
  dispersionDigits = 1,
  selectedGenes,
  pathway,
  species = "hsa",
  countFilterInRawDistribution = TRUE,
  selectedGeneFilterByCount = FALSE
)
```

| Argument | Description |
|---|---|
| `power` | Power to detect prognostic genes. |
| `m` | Total number of genes for testing. |
| `m1` | Expected number of prognostic genes. |
| `f` | FDR level. |
| `k` | Ratio of sample sizes between the two groups. |
| `w` | Ratio of normalization factors between the two groups. |
| `rho` | Minimum fold change for prognostic genes between the two groups. |
| `showMessage` | Logical. Whether to display messages during the estimation process. |
| `storeProcess` | Logical. Whether to store the power and n values from the sample size or power estimation process. |
| `distributionObject` | A DGEList object generated by the `est_count_dispersion` function. The RnaSeqSampleSizeData package contains 13 datasets from TCGA; to use one of them, set `distributionObject` to any of "TCGA_BLCA", "TCGA_BRCA", "TCGA_CESC", "TCGA_COAD", "TCGA_HNSC", "TCGA_KIRC", "TCGA_LGG", "TCGA_LUAD", "TCGA_LUSC", "TCGA_PRAD", "TCGA_READ", "TCGA_THCA", "TCGA_UCEC". |
| `libSize` | Numeric vector giving the total count for each sample. If not specified, the library sizes in `distributionObject` will be used. |
| `minAveCount` | Minimum average read count for each gene. Genes with smaller read counts will not be used. |
| `maxAveCount` | Maximum average read count for each gene. Genes with larger read counts will be treated as having `maxAveCount`. |
| `repNumber` | Number of genes used in estimating the read count and dispersion distributions. |
| `dispersionDigits` | Number of digits used for dispersion values. |
| `selectedGenes` | Optional. Names of the genes of interest. Only the read count and dispersion distributions for these genes will be used in power estimation. |
| `pathway` | Optional. ID of the KEGG pathway of interest. Only the read count and dispersion distributions for genes in this pathway will be used in power estimation. |
| `species` | Optional. Species of the KEGG pathway of interest. |
| `countFilterInRawDistribution` | Logical. Whether the count filter is applied to the raw count distribution. If not, the count filter is applied to the libSize-scaled count distribution. |
| `selectedGeneFilterByCount` | Logical. Whether the count filter is applied to the selected genes when the `selectedGenes` parameter is used. |


Returns the estimated sample size, or a list including the parameters and the sample sizes evaluated during the estimation process.

```
library(RnaSeqSampleSize)
# Note that repNumber is set very low (5) here to make the example code run faster.
# We suggest setting repNumber to at least 100 in a real analysis.
sample_size_distribution(power = 0.8, f = 0.01, distributionObject = "TCGA_READ",
                         repNumber = 5, showMessage = TRUE)
```
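The `selectedGenes` and `storeProcess` arguments described above can be combined with the same TCGA dataset. The following is a minimal sketch, not taken from the package's own examples: the gene symbols are illustrative placeholders and are assumed to match row names in the "TCGA_READ" distribution object. The call is guarded so the snippet still runs when the package is not installed.

```r
# Illustrative gene symbols (an assumption): they must match gene names
# present in the chosen distribution object.
genes_of_interest <- c("A1BG", "AAAS", "AACS")

# Kept small only so the example runs quickly; use at least 100 in real analyses.
rep_number <- 5

if (requireNamespace("RnaSeqSampleSize", quietly = TRUE)) {
  library(RnaSeqSampleSize)

  # Restrict the read count and dispersion distributions to the selected genes,
  # and keep the intermediate values from the estimation process.
  result <- sample_size_distribution(
    power = 0.8,
    f = 0.01,
    distributionObject = "TCGA_READ",
    repNumber = rep_number,
    selectedGenes = genes_of_interest,
    storeProcess = TRUE
  )

  # Inspect the structure of the returned object rather than assuming its fields.
  str(result)
}
```

Because `storeProcess = TRUE` changes what is returned, `str()` is used here to inspect the result instead of relying on particular element names.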
