#' @title Check Expression/methylation Profile for various subgroups of a cancer
#' study.
#'
#' @description This function obtains the requested data for the given genes
#' across multiple subgroups of a cancer. It can check whether or not all genes
#' are included in subgroups of a cancer study and, if not, looks for the
#' alternative gene names. Then it calculates frequency percentage, frequency
#' ratio, mean value and median value of samples greater than a specific value
#' in the selected subgroups of the cancer. Furthermore, it looks for the five
#' genes that comprise the highest values in each cancer study subgroup.
#'
#' @details
#' \tabular{lllll}{
#' Package: \tab cbaf \cr
#' Type: \tab Package \cr
#' Version: \tab 1.12.1 \cr
#' Date: \tab 2020-12-07 \cr
#' License: \tab Artistic-2.0 \cr
#' }
#'
#'
#' @include cbaf-obtainOneStudy.R cbaf-automatedStatistics.R
#' cbaf-heatmapOutput.R cbaf-xlsxOutput.R
#'
#' @usage processOneStudy(genesList, submissionName, studyName,
#'   desiredTechnique, desiredCaseList = FALSE, validateGenes = TRUE,
#'   calculate = c("frequencyPercentage", "frequencyRatio", "meanValue"),
#'   cutoff = NULL, round = TRUE, topGenes = TRUE, shortenStudyNames = TRUE,
#'   geneLimit = FALSE, rankingMethod = "variation", heatmapFileFormat = "TIFF",
#'   resolution = 600, RowCex = "auto", ColCex = "auto",
#'   heatmapMargines = "auto", rowLabelsAngle = 0, columnLabelsAngle = 45,
#'   heatmapColor = "RdBu", reverseColor = TRUE, transposedHeatmap = FALSE,
#'   simplifyBy = FALSE, genesToDrop = FALSE, transposeResults = FALSE)
#'
#'
#'
#' @param genesList a list that contains at least one gene group
#'
#' @param submissionName a character string containing name of interest. It is
#' used for naming the process.
#'
#' @param studyName a character string showing the desired cancer name. It is a
#' standard cancer study name that can be found on cbioportal.org, such as
#' \code{"Acute Myeloid Leukemia (TCGA, NEJM 2013)"}.
#'
#' @param desiredTechnique a character string that is one of the following
#' techniques: \code{"RNA-Seq"}, \code{"microRNA-Seq"},
#' \code{"microarray.mRNA"}, \code{"microarray.microRNA"} or
#' \code{"methylation"}.
#'
#' @param desiredCaseList a numeric vector that contains the index of desired
#' cancer subgroups, assuming the user knows the index of the desired subgroups.
#' If not, leave \code{desiredCaseList} at its default value and the function
#' will show the available subgroups and ask the user to enter the desired ones
#' during the process. The default value is \code{FALSE}.
#'
#' @param validateGenes a logical value that, if set to be \code{TRUE}, causes
#' the function to check each cancer study to find whether or not each gene has
#' a record. If a cancer doesn't have a record for a specific gene, the function
#' looks for alternative gene names that cbioportal might use instead of the
#' given gene name.
#'
#' @param calculate a character vector that contains the statistical procedures
#' users prefer the function to compute. The complete results can be obtained
#' by \code{c("frequencyPercentage", "frequencyRatio", "meanValue",
#' "medianValue")}. This will tell the function to compute the following:
#' \code{"frequencyPercentage"}, which is the percentage of samples having a
#' value greater than a specific cutoff, relative to the total sample size, for
#' every study / study subgroup;
#' \code{"frequencyRatio"}, which shows the number of selected samples and the
#' total number of samples that give the frequency percentage for every
#' study / study subgroup;
#' \code{"meanValue"}, which contains the mean value of selected samples for
#' each study;
#' \code{"medianValue"}, which shows the median value of selected samples for
#' each study.
#' The default input is \code{calculate = c("frequencyPercentage",
#' "frequencyRatio", "meanValue")}.
#'
#' @param cutoff a number used to limit samples to those that are greater than
#' a specific number (cutoff). The default value for methylation data is 0.6
#' while gene expression studies use a default value of 2. For methylation
#' studies, the unit is \code{average of relevant locations}; for the rest, it
#' is \code{"log z-score"}. To change the cutoff to any desired number, change
#' the option to \code{cutoff = desiredNumber}, in which desiredNumber is the
#' number of interest.
#'
#' @param round a logical value that, if set to be \code{TRUE}, will force the
#' function to round all the calculated values to two decimal places. The
#' default value is \code{TRUE}.
#'
#' @param topGenes a logical value that, if set as \code{TRUE}, causes the
#' function to create three data frames that contain the five top genes for
#' each cancer. To get all three data frames, \code{"frequencyPercentage"},
#' \code{"meanValue"} and \code{"medianValue"} must be included in
#' \code{calculate}.
#'
#' @param shortenStudyNames a logical value. If the value is set as
#' \code{TRUE}, the function will try to remove the last part of the cancer
#' names, aiming to shorten them. The removed segment usually contains the name
#' of the scientific group that conducted the experiment.
#'
#' @param geneLimit if a large number of genes exists in at least one gene
#' group, this option can be used to limit the number of genes that are shown on
#' the heatmap. For instance, \code{geneLimit=50} will limit the heatmap to 50
#' genes showing the most variation across multiple study / study subgroups. The
#' default value is \code{FALSE}.
#'
#' @param rankingMethod a character value that determines how genes will be
#' ranked prior to drawing the heatmap. \code{"variation"} orders the genes
#' based on unique values in one or a few cancer studies, while
#' \code{"highValue"} ranks the genes by whether they contain high values in
#' multiple / many cancer studies. This option is useful when there are so many
#' genes that the user has to limit the number of genes on the heatmap with
#' \code{geneLimit}.
#'
#' @param heatmapFileFormat This option enables the user to select the desired
#' image file format of the heatmaps. The default value is \code{"TIFF"}. Other
#' supported formats include \code{"PNG"}, \code{"BMP"}, and \code{"JPG"}.
#'
#' @param resolution a number. This option can be used to adjust the resolution
#' of the output heatmaps in dots per inch. The default value is 600.
#'
#' @param RowCex a number that specifies letter size in heatmap row names,
#' which ranges from 0 to 2. If \code{RowCex = "auto"}, the function will
#' automatically determine the best RowCex.
#'
#' @param ColCex a number that specifies letter size in heatmap column names,
#' which ranges from 0 to 2. If \code{ColCex = "auto"}, the function will
#' automatically determine the best ColCex.
#'
#' @param heatmapMargines a numeric vector that is used to set heatmap margins.
#' If \code{heatmapMargines = "auto"}, the function will automatically
#' determine the best possible margins. Otherwise, enter the desired margins
#' as, e.g., \code{c(10, 10)}.
#'
#' @param rowLabelsAngle a number that determines the angle at which the
#' gene names are shown in heatmaps. The default value is 0 degrees.
#'
#' @param columnLabelsAngle a number that determines the angle at which the
#' studies/study subgroups names are shown in heatmaps. The default value is 45
#' degrees.
#'
#' @param heatmapColor a character string that defines heatmap color. The
#' default value is \code{'RdBu'}. \code{'RdGr'} is also a popular color in
#' genomic studies. To see the rest of the colors, please type
#' \code{library(RColorBrewer)} and then \code{display.brewer.all()}.
#'
#' @param reverseColor a logical value that reverses the color gradient for
#' heatmap(s).
#'
#' @param transposedHeatmap a logical value that transposes heatmap rows to
#' columns and vice versa.
#'
#' @param simplifyBy a number that tells the function to change the values
#' smaller than that to zero. The purpose behind this option is to facilitate
#' recognizing candidate genes. Therefore, it is not suited for publications. It
#' has the same unit as \code{cutoff}.
#'
#' @param genesToDrop a character vector. Gene names within this vector will be
#' omitted from the heatmap. The default value is \code{FALSE}.
#'
#' @param transposeResults a logical value that enables the function to swap
#' the columns and rows of the data.
#'
#'
#'
#' @return a BiocFileCache object that contains some or all of the following
#' groups, based on what the user has chosen: \code{ObtainedData},
#' \code{validationResults}, \code{frequencyPercentage},
#' \code{Top.Genes.of.Frequency.Percentage}, \code{frequencyRatio},
#' \code{meanValue}, \code{Top.Genes.of.Mean.Value}, \code{medianValue},
#' \code{Top.Genes.of.Median.Value}. It also saves these results in one Excel
#' file for convenience. Based on preference, three heatmaps for frequency
#' percentage, mean value and median value can be generated. If more than one
#' group of genes is entered, output for each group will be stored in a
#' separate sub-directory.
#'
#' @examples
#' genes <- list(K.demethylases = c("KDM1A", "KDM1B", "KDM2A", "KDM2B", "KDM3A",
#' "KDM3B", "JMJD1C", "KDM4A"), K.methyltransferases = c("SUV39H1", "SUV39H2",
#' "EHMT1", "EHMT2", "SETDB1", "SETDB2", "KMT2A", "KMT2A"))
#'
#' processOneStudy(genes, "test", "Breast Invasive Carcinoma (TCGA, Cell 2015)",
#' "RNA-Seq", desiredCaseList = c(2,3,4,5), calculate = c("frequencyPercentage",
#' "frequencyRatio"), heatmapMargines = c(16, 10), RowCex = 1, ColCex = 1)
#'
#' @author Arman Shahrisa, \email{shahrisa.arman@hotmail.com} [maintainer,
#' copyright holder]
#' @author Maryam Tahmasebi Birgani, \email{tahmasebi-ma@ajums.ac.ir}
#'
#' @export
################################################################################
################################################################################
###### Evaluation of Frequency, Mean and Median for Subgroups of a Cancer ######
################################################################################
################################################################################
processOneStudy <- function(
  genesList,
  submissionName,
  studyName,
  desiredTechnique,
  desiredCaseList = FALSE,
  validateGenes = TRUE,
  calculate = c("frequencyPercentage", "frequencyRatio", "meanValue"),
  cutoff = NULL,
  round = TRUE,
  topGenes = TRUE,
  shortenStudyNames = TRUE,
  geneLimit = FALSE,
  rankingMethod = "variation",
  heatmapFileFormat = "TIFF",
  resolution = 600,
  RowCex = "auto",
  ColCex = "auto",
  heatmapMargines = "auto",
  rowLabelsAngle = 0,
  columnLabelsAngle = 45,
  heatmapColor = "RdBu",
  reverseColor = TRUE,
  transposedHeatmap = FALSE,
  simplifyBy = FALSE,
  genesToDrop = FALSE,
  transposeResults = FALSE
){
  ##############################################################################
  ### Obtaining data
  obtainOneStudy(
    genesList = genesList,
    submissionName = submissionName,
    studyName = studyName,
    desiredTechnique = desiredTechnique,
    desiredCaseList = desiredCaseList,
    validateGenes = validateGenes
  )
  message("")
  ##############################################################################
  ### Calculating statistics
  automatedStatistics(
    submissionName = submissionName,
    obtainedDataType = "single study",
    calculate = calculate,
    cutoff = cutoff,
    round = round,
    topGenes = topGenes
  )
  message("")
  ##############################################################################
  ##############################################################################
  ### Create new directory for submission
  present.directory <- getwd()
  # Restore the caller's working directory even if a later step fails
  on.exit(setwd(present.directory), add = TRUE)
  new.directory <- paste0(
    present.directory, "/", submissionName, " output for a single study"
  )
  dir.create(new.directory, showWarnings = FALSE)
  setwd(new.directory)
  ##############################################################################
  ### Preparing for heatmap output
  heatmapOutput(
    submissionName = submissionName,
    shortenStudyNames = shortenStudyNames,
    geneLimit = geneLimit,
    rankingMethod = rankingMethod,
    heatmapFileFormat = heatmapFileFormat,
    resolution = resolution,
    RowCex = RowCex,
    ColCex = ColCex,
    heatmapMargines = heatmapMargines,
    rowLabelsAngle = rowLabelsAngle,
    columnLabelsAngle = columnLabelsAngle,
    heatmapColor = heatmapColor,
    reverseColor = reverseColor,
    transposedHeatmap = transposedHeatmap,
    simplifyBy = simplifyBy,
    genesToDrop = genesToDrop
  )
  message("")
  ##############################################################################
  ### Preparing for excel output
  xlsxOutput(
    submissionName = submissionName,
    transposeResults = transposeResults
  )
  ##############################################################################
  ##############################################################################
  ### Change the directory to the first directory
  setwd(present.directory)
}