This function identifies the core fitness genes from a given quantitative knockout screen dependency matrix, where each row is a gene and each column is a cell line. The function uses all the cell lines and identifies the genes that are essential in the majority of them.
ADAM2.PercentileAverageCF(depMat,
                          display=TRUE,
                          percentile=0.9,
                          prefix='PercentileAverageMethod')
depMat |
Quantitative knockout screen dependency matrix where rows are genes and columns are samples. A real number in position [i,j] represents the strength of dependency, which indicates the amount of fitness loss in the j-th sample when the i-th gene is inactivated. A higher strength of dependency indicates a higher probability of being a core fitness gene. These values are used to rank the genes by their dependency strength. |
display |
Boolean, default is TRUE. Whether bar plots of the dependency profiles should be plotted. |
percentile |
Fraction of the cell lines in which a given gene should show depletion. The default value is 0.9, which selects the 90th percentile least dependent cell line. |
prefix |
If display is FALSE, the plots are generated in the working directory using the prefix. |
This function implements the idea that if a gene is essential, it should fall in the top Z most depleted genes in at least 90% of the cell lines. For a given gene, we rank its gene effect score in each cell line, then arrange the cell lines in order of increasing gene effect score for that gene. The average rank of each gene in its 90th percentile least depleted cell lines is then calculated. Z is chosen as the point of minimum density between the two Gaussian distributions estimated from these average ranks. All genes whose average rank in their 90th percentile least depleted cell lines is below this threshold are reported.
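The steps above can be sketched in base R as follows. This is a minimal illustration under stated assumptions, not the package's implementation: the helper name `percentile_average_sketch` is hypothetical, the average is assumed to be taken over the cell lines at or beyond the percentile position, higher values in `depMat` are assumed to mean stronger dependency (as described in the Arguments), and a two-centre k-means plus a kernel density estimate is swapped in for the fit of two Gaussian distributions. Row names are assumed to hold the gene names.

```r
# Hypothetical helper for illustration; not part of the ADAM2 package.
percentile_average_sketch <- function(depMat, percentile = 0.9) {
  depMat <- as.matrix(depMat)
  nGenes <- nrow(depMat)
  nLines <- ncol(depMat)

  # Rank genes within each cell line; rank 1 = strongest dependency.
  # Dividing by the number of genes gives normalised ranks in (0, 1].
  rankMat <- apply(depMat, 2, function(x) rank(-x, ties.method = "average")) / nGenes
  rownames(rankMat) <- rownames(depMat)

  # For each gene, order its ranks from most to least dependent cell line.
  sortedRanks <- t(apply(rankMat, 1, sort))

  # Position of the percentile-th least dependent cell line.
  pos <- max(1, round(nLines * percentile))

  # Assumption: average the ranks over cell lines at or beyond that position.
  avgRank <- rowMeans(sortedRanks[, pos:nLines, drop = FALSE])

  # Stand-in for the two-Gaussian fit: split the average ranks into two
  # groups with k-means, started from the smallest and largest value.
  km <- kmeans(avgRank, centers = range(avgRank))
  ctr <- sort(km$centers[, 1])

  # Threshold = point of lowest kernel density between the two group centres.
  dens <- density(avgRank)
  between <- which(dens$x > ctr[1] & dens$x < ctr[2])
  threshold <- dens$x[between[which.min(dens$y[between])]]

  list(cfgenes = names(avgRank)[avgRank < threshold],
       LeastDependent = data.frame(Value = avgRank, Gene = names(avgRank)),
       threshold = threshold)
}

# Simulated data (invented for this sketch): 200 genes, 20 cell lines,
# with the first 40 genes strongly dependent in every cell line.
set.seed(1)
simMat <- matrix(rnorm(200 * 20, sd = 0.3), nrow = 200,
                 dimnames = list(paste0("G", 1:200), paste0("CL", 1:20)))
simMat[1:40, ] <- simMat[1:40, ] + 3
sketch <- percentile_average_sketch(simMat)
head(sketch$cfgenes)
```

The returned list mirrors the element names documented under Value so that the sketch can be compared side by side with the real function's output; the numerical results need not match.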
A list with the following elements:
cfgenes |
Vector of the genes identified as core fitness genes |
LeastDependent |
A data frame where each row corresponds to a gene. There are two columns: Value stores the average rank of the gene in its N-th percentile least dependent cell lines, and Gene stores the gene name. |
threshold |
The rank threshold for core fitness genes |
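How the three elements relate can be illustrated on a hand-made list of the same shape. The object `mockResult` and all of its values are invented for illustration; they are not output of `ADAM2.PercentileAverageCF`, and the scale of `Value` shown here is an assumption.

```r
# Hand-made stand-in for the returned list (all values are invented).
mockResult <- list(
  LeastDependent = data.frame(Value = c(0.05, 0.12, 0.80, 0.95),
                              Gene = c("geneA", "geneB", "geneC", "geneD"),
                              stringsAsFactors = FALSE),
  threshold = 0.4
)

# Per the Details, genes whose average rank lies below the threshold
# are the ones reported as core fitness genes.
ld <- mockResult$LeastDependent
mockResult$cfgenes <- ld$Gene[ld$Value < mockResult$threshold]
mockResult$cfgenes
```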
C. Pacini, E. Karakoc & F. Iorio
# Load the example dependency matrix
data(exampleSBFData)
# Identify core fitness genes with the default settings
results <- ADAM2.PercentileAverageCF(depMat=exampleSBFData)
cfgenes <- results$cfgenes