generalPatternClustering: Clustering dose-response curves based on their pattern
In clustDRM: Clustering Dose-Response Curves and Fitting Appropriate Models to Them



Clusters dose-response curves based on their pattern.

generalPatternClustering(inputData, colsData, colID, doseLevels,
  numReplications, na.rm = FALSE, imputationMethod = c("mean",
  "median"), ORICC = c("two", "one", "both"), transform = c("none",
  "log", "sRoot", "qRoot", "boxcox"), plotFormat = c("eps", "jpg"),
  LRT = TRUE, MCT = FALSE, adjustMethod = c("BH", "holm", "hochberg",
  "hommel", "bonferroni", "BY", "fdr", "none"), nPermute = 1000,
  useSeed = NULL, theLeastNumberOfMethods = c(1, 2, 3, 4),
  alpha = 0.05, nCores = 1)

`inputData`	data matrix which should include the IDs of the subjects, as well as the measurements (gene expressions, etc.) for all replications of the different doses as columns.
`colsData`	vector indicating the indices of the columns in inputData which correspond to the measurements for the different replications of the different doses.
`colID`	scalar indicating the index of the column corresponding to the data ID.
`doseLevels`	vector with dose levels.
`numReplications`	vector with the same length as doseLevels, giving the number of replications for each dose.
`na.rm`	logical indicating whether missing values should be removed (TRUE) or not (FALSE, default).
`imputationMethod`	single string taking the value "mean" (default) or "median", which indicates how missing values should be treated. "mean" replaces them with the mean of the observed values, and "median" uses the median of the observed values for imputation.
`ORICC`	single string taking the value "two", "one", or "both", indicating which ORICC procedure should be used. "one" refers to one-stage ORICC only, "two" (default) refers to two-stage ORICC only, and "both" performs both of them.
`transform`	single string indicating which transformation should be applied to the response data. It takes "none" (no transformation, default), "log" (natural log), "sRoot" (square root), "qRoot" (cubic root), or "boxcox" (Box-Cox transformation).
`plotFormat`	single string taking the value "eps" (default) or "jpg", indicating the format of the output plot.
`LRT`	logical indicating whether a permutation-based likelihood ratio test should be applied (TRUE) or not (FALSE) to the subjects whose trend is identified as non-flat by ORICC1.
`MCT`	logical indicating whether a multiple comparison test (with the "UmbrellaWilliams" contrast matrix) should be applied (TRUE) or not (FALSE) to the subjects whose trend is identified as non-flat by ORICC1.
`adjustMethod`	the method for multiplicity adjustment of the p-values. The possible values for this argument are "BH", "holm", "hochberg", "hommel", "bonferroni", "BY", "fdr", and "none", with "BH" (Benjamini-Hochberg) as the default.
`nPermute`	scalar indicating the number of permutations used in the LRT.
`useSeed`	scalar indicating the seed that should be used to generate the LRT permutations. The default is NULL.
`theLeastNumberOfMethods`	scalar taking a value from 1, 2, 3, or 4, indicating how many methods must approve a non-flat trend before a subject is selected. Its admissible value depends on how many tests are requested; the maximum applies when ORICC = "both" and both LRT and MCT are TRUE. For example, when this argument is set to 2 with ORICC = "two", LRT = TRUE, and MCT = TRUE, a compound is selected as one with a non-flat pattern if two-stage ORICC identifies a non-flat pattern and at least one of LRT and MCT also supports it (at level alpha). Note that the comparison with alpha is done for the adjusted p-values.
`alpha`	the significance level against which the adjusted p-values are compared.
`nCores`	scalar indicating the number of cores that should be used to perform the LRT and MCT tests. The default is 1, which means sequential computation (no parallel computation).
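
The interplay of adjustMethod, alpha, and theLeastNumberOfMethods can be illustrated with a small base R sketch. It is not the package's internal code: the p-values are made up, all five subjects are assumed to have been flagged as non-flat by two-stage ORICC already, and the use of `<=` for the comparison with alpha is an assumption.

```r
# Hypothetical unadjusted p-values for five subjects that two-stage ORICC
# has already flagged as non-flat (made-up numbers, not package output).
pLRT <- c(0.001, 0.020, 0.300, 0.004, 0.049)
pMCT <- c(0.002, 0.400, 0.010, 0.020, 0.200)

alpha <- 0.05
theLeastNumberOfMethods <- 2

# Adjust each set of p-values for multiplicity (Benjamini-Hochberg).
adjLRT <- p.adjust(pLRT, method = "BH")
adjMCT <- p.adjust(pMCT, method = "BH")

# Each subject gets one vote from ORICC, plus one vote for each test
# whose adjusted p-value does not exceed alpha (assumed comparison).
votes <- 1 + (adjLRT <= alpha) + (adjMCT <= alpha)

# A subject is kept when enough methods agree on a non-flat trend.
selected <- votes >= theLeastNumberOfMethods
print(data.frame(adjLRT, adjMCT, votes, selected))
```

In this sketch, raising theLeastNumberOfMethods to 3 would keep only the subjects for which ORICC, LRT, and MCT all agree.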

This function first uses ORICC1 or ORICC2 (or both) to identify the pattern of the dose-response curve for each subject. Once the pattern is identified, for the non-flat ones, a permutation-based likelihood ratio test (for exactly the identified pattern, if LRT = TRUE) and a multiple comparisons test (to test H0: flat vs. H1: non-flat, if MCT = TRUE) are performed to further filter out flat patterns.
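
The general idea of a permutation-based test for a single subject can be sketched in base R as follows. This is only an illustration of the permutation principle: the test statistic is a plain one-way ANOVA F statistic used as a stand-in, not the order-restricted likelihood ratio statistic of the package, and the data and the helper `fStat` are hypothetical.

```r
set.seed(1)

# Hypothetical measurements for one subject: three doses, three replicates each.
dose <- rep(c(0, 5, 20), each = 3)
response <- c(5.1, 4.8, 5.3, 6.0, 6.4, 5.9, 8.2, 7.9, 8.5)

# Stand-in statistic (assumption): the one-way ANOVA F statistic across doses.
fStat <- function(y, g) {
  anova(lm(y ~ factor(g)))[["F value"]][1]
}

observed <- fStat(response, dose)

# Under H0 (flat curve) the dose labels are exchangeable, so shuffling the
# responses across doses gives draws from the null distribution.
nPermute <- 1000
permuted <- replicate(nPermute, fStat(sample(response), dose))

# Permutation p-value, counting the observed statistic as one of the draws.
pValue <- (1 + sum(permuted >= observed)) / (1 + nPermute)
print(pValue)
```

Setting a seed before the permutations, as useSeed allows in the function, makes the resulting p-value reproducible.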

A list with the following objects:

selectedSubjects: a data frame with the IDs of the selected subjects in the first column and the identified trend in the second column.

clusteringORICC1Results and/or clusteringORICC2Results: a list with four elements providing the raw data as the outcome of the ORICC procedure (rawDataORICC1 and/or rawDataORICC2), the pattern identified by the ORICC procedure (clusteringResultsORICC1 and/or clusteringResultsORICC2), the results of LRT (resultsLRT), and the results of MCT (resultsMCT). Both resultsLRT and resultsMCT provide the adjusted and unadjusted p-values; for MCT, the selected contrast is provided as well.
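
As a rough sketch of how selectedSubjects might be summarised, the following base R snippet works on a mock object. The IDs, trend labels, and column names are invented for illustration; only the layout (IDs in the first column, trend in the second) follows the description above.

```r
# Mock of the documented `selectedSubjects` element (all values are made up).
mockResult <- list(
  selectedSubjects = data.frame(
    ID = c(3, 7, 12),
    trend = c("increasing", "decreasing", "increasing")
  )
)

selected <- mockResult$selectedSubjects

# Refer to the columns by position, as documented: IDs first, trend second.
trendCounts <- table(selected[[2]])
idsByTrend <- split(selected[[1]], selected[[2]])

print(trendCounts)
print(idsByTrend)
```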

Vahid Nassiri and Yimer Wasihun.

ORIClust

## generating data
library(clustDRM)
set.seed(11)
doses2Use <- c(0, 5, 20)
numRep2Use <- c(3, 3, 3)
## first subject: ID, simulated "logistic" dose-response data, and a noise covariate
generatedData <- cbind(rep(1, sum(numRep2Use)),
  MCPMod::genDFdata("logistic", c(5, 3, 10, 0.05), doses2Use,
    numRep2Use, 1),
  matrix(rnorm(1 * sum(numRep2Use)), sum(numRep2Use), 1))
colnames(generatedData) <- c("ID", "dose", "response", "x1")
## subjects 2 to 15 are generated the same way and appended row-wise
for (iGen in 2:15) {
  genData0 <- cbind(rep(iGen, sum(numRep2Use)),
    MCPMod::genDFdata("logistic", c(5, 3, 10, 0.05), doses2Use,
      numRep2Use, 1),
    matrix(rnorm(1 * sum(numRep2Use)), sum(numRep2Use), 1))
  colnames(genData0) <- c("ID", "dose", "response", "x1")
  generatedData <- rbind(generatedData, genData0)
}
## transforming it for clustering
toInput <- inputDataMaker(2, 3, 1, generatedData)
## general pattern clustering
generalPatternClust <- generalPatternClustering(
  inputData = toInput$inputData,
  colsData = toInput$colsData, colID = toInput$colID,
  doseLevels = toInput$doseLevels,
  numReplications = toInput$numReplicates,
  na.rm = FALSE, imputationMethod = "mean",
  ORICC = "two", transform = "none", plotFormat = "eps",
  LRT = TRUE, MCT = TRUE,
  adjustMethod = "BH",
  nPermute = 100, useSeed = NULL,
  theLeastNumberOfMethods = 2, alpha = 0.05, nCores = 1)