automatedStatistics: Perform the requested statistics for various studies /...


View source: R/cbaf-automatedStatistics.R


This function calculates the frequency percentage, frequency ratio, mean value and median value of samples greater than a specific cutoff in the selected study / subgroups of the study. Furthermore, it can look for the five genes that contain the highest values in each study / study subgroup. It uses the data generated by the obtainOneStudy() / obtainMultipleStudies() functions.


automatedStatistics(submissionName, obtainedDataType =
  "multiple studies", calculate = c("frequencyPercentage", "frequencyRatio",
  "meanValue"), topGenes = TRUE, cutoff=NULL, round=TRUE)



submissionName: a character string containing the name of interest. It is used for naming the process.


obtainedDataType: a character string that specifies the type of input data produced by the previous function. Two options are available: "single study" for obtainOneStudy() and "multiple studies" for obtainMultipleStudies(). The function uses obtainedDataType and submissionName to construct the name of the BiocFileCache object and then finds the appropriate data inside it. The default value is "multiple studies".


calculate: a character vector that contains the statistical procedures the function should compute. The complete results can be obtained with c("frequencyPercentage", "frequencyRatio", "meanValue", "medianValue"). The options are: "frequencyPercentage", the percentage of samples with a value greater than the cutoff, relative to the total sample size, for every study / study subgroup; "frequencyRatio", the number of selected samples and the total number of samples behind the frequency percentage for every study / study subgroup, so that both the selected and total sample sizes are visible; "meanValue", the mean value of the selected samples for each study; "medianValue", the median value of the selected samples for each study. The default input is calculate = c("frequencyPercentage", "frequencyRatio", "meanValue").
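The four measurements described above can be illustrated on a toy vector in base R. This is a minimal sketch, not the package's internal code: the numbers are made up, the variable names are hypothetical, and both the exclusion of missing values from the total and the "x out of y" ratio format are assumptions.

```r
# Toy values for one gene in one study (made-up numbers, not cBioPortal data)
values <- c(2.5, 0.3, 3.1, -1.2, 2.2, NA, 1.9, 4.0)
cutoff <- 2  # the default documented for gene expression data

# Assumption: samples without a measurement are excluded from the total
measured <- values[!is.na(values)]

# Selected samples are those strictly greater than the cutoff
selected <- measured[measured > cutoff]

# Share of selected samples, as a percentage of all measured samples
frequency_percentage <- 100 * length(selected) / length(measured)

# Selected and total sample sizes (output format is an assumption)
frequency_ratio <- paste(length(selected), "out of", length(measured))

# Mean and median are taken over the selected samples only
mean_value <- mean(selected)
median_value <- median(selected)

# Round to two decimal places, mirroring round = TRUE
round(c(frequencyPercentage = frequency_percentage,
        meanValue = mean_value,
        medianValue = median_value), 2)
```

Note that the mean and median describe only the samples above the cutoff, so they are always greater than the cutoff itself.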


topGenes: a logical value that, if set to TRUE, causes the function to create three data.frames that contain the five top genes for each cancer. To get all three data.frames, "frequencyPercentage", "meanValue" and "medianValue" must have been included in calculate.
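The idea of picking the five top genes per study can be sketched in base R as follows. This is only an illustration under stated assumptions: the matrix values, the study names, and the helper top_five() are hypothetical, and the layout of the data.frames that cbaf actually stores may differ.

```r
# Hypothetical frequency percentages: rows are genes, columns are studies
freq <- matrix(
  c(10, 55, 30, 5, 80, 42, 61,
    70, 12, 33, 90, 8, 15, 40),
  nrow = 7,
  dimnames = list(
    c("KDM1A", "KDM1B", "KDM2A", "KDM2B", "KDM3A", "KDM3B", "JMJD1C"),
    c("studyA", "studyB")
  )
)

# Hypothetical helper: names of the n highest-scoring genes, best first
top_five <- function(scores, n = 5) {
  ranked <- sort(scores, decreasing = TRUE)
  names(ranked)[seq_len(min(n, length(ranked)))]
}

# One row per study, one column per rank
top_genes <- as.data.frame(t(apply(freq, 2, top_five)),
                           stringsAsFactors = FALSE)
colnames(top_genes) <- paste0("rank", 1:5)

top_genes
```

The same ranking step could be applied to a matrix of mean or median values to obtain the other two tables.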


cutoff: a number used to limit samples to those that are greater than this number (cutoff). The default value for methylation data is 0.6, while gene expression studies use a default value of 2. For methylation studies, the value is the average of relevant locations; for the rest, it is "log z-score". To change the cutoff, set cutoff = desiredNumber, where desiredNumber is the number of interest.


round: a logical value that, if set to TRUE, forces the function to round all the calculated values to two decimal places. The default value is TRUE.


Package: cbaf
Type: Package
Version: 1.12.1
Date: 2020-12-07
License: Artistic-2.0


A new section in the BiocFileCache object that was created by either the obtainOneStudy() or the obtainMultipleStudies() function. It contains a list with some or all of the following statistical measurements for every gene group, depending on what the user has chosen: Frequency.Percentage, Top.Genes.of.Frequency.Percentage, Frequency.Ratio, Mean.Value, Top.Genes.of.Mean.Value, Median, Top.Genes.of.Median.


Arman Shahrisa, [maintainer, copyright holder]

Maryam Tahmasebi Birgani


library(cbaf)

# Two gene groups to analyse
genes <- list(
  K.demethylases = c("KDM1A", "KDM1B", "KDM2A", "KDM2B", "KDM3A",
    "KDM3B", "JMJD1C", "KDM4A"),
  K.methyltransferases = c("SUV39H1", "SUV39H2", "EHMT1", "EHMT2",
    "SETDB1", "SETDB2", "KMT2A", "KMT2A")
)

# Obtain and cache RNA-Seq data for a single study under the name "test"
obtainOneStudy(genes, "test", "Breast Invasive Carcinoma (TCGA, Cell 2015)",
  "RNA-Seq", desiredCaseList = c(3, 4))

# Compute the requested statistics from the cached data
automatedStatistics("test", obtainedDataType = "single study",
  calculate = c("frequencyPercentage", "frequencyRatio"))

cbaf documentation built on Dec. 9, 2020, 2:02 a.m.