networkCorrelationsSelection: Selection of Differentially Correlated Hub Sub-networks
In ClassifyR: A framework for cross-validated classification problems, with applications to differential variability and differential distribution testing

Ranks sub-networks by the ratio of between-class to within-class correlation variability, largest first, and chooses the set of top-ranked sub-networks which has the best resubstitution performance.

  ## S4 method for signature 'matrix'
  networkCorrelationsSelection(measurements, classes, metaFeatures = NULL, ...)
  ## S4 method for signature 'DataFrame'
  networkCorrelationsSelection(measurements, classes, metaFeatures = NULL,
                    featureSets, datasetName, trainParams, predictParams, resubstituteParams,
                    selectionName = "Differential Correlation of Sub-networks", verbose = 3)
  ## S4 method for signature 'MultiAssayExperiment'
  networkCorrelationsSelection(measurements, target = NULL, metaFeatures = NULL, ...)

`measurements`	Either a `matrix`, `DataFrame` or `MultiAssayExperiment` containing the training data. For a `matrix`, the rows are features, and the columns are samples.
`classes`	Either a `factor` of class labels with the same length as the number of samples in `measurements` or, if `measurements` is of class `DataFrame`, a character vector of length 1 containing the name of the column in `measurements` which stores the classes. Not used if `measurements` is a `MultiAssayExperiment` object.
`metaFeatures`	A `DataFrame` with the same number of samples as the numeric table of interest. The number of derived features in this table will be different to the original input data table. The command `mcols(metaFeatures)` must return a `DataFrame` which has an "original" column with as many rows as there are meta-features and which specifies the feature that each meta-feature is originally derived from (e.g. network name).
`featureSets`	An object of type `FeatureSetCollection`. The `sets` slot must contain a list of two-column matrices, one matrix per sub-network, with each row corresponding to a binary interaction. Such sub-networks may be determined by a community detection algorithm. This is used to determine which features belong to which sub-network before calculating a statistic for each sub-network.
`target`	If `measurements` is a `MultiAssayExperiment`, the name of the data table to be used.
`...`	Variables used by neither the `matrix` nor the `MultiAssayExperiment` method which are passed into and used by the `DataFrame` method.
`datasetName`	A name for the data set used. Stored in the result.
`trainParams`	A container of class `TrainParams` describing the classifier to use for training.
`predictParams`	A container of class `PredictParams` describing how prediction is to be done.
`resubstituteParams`	An object of class `ResubstituteParams` describing the performance measure to consider and the numbers of top features to try for resubstitution classification.
`selectionName`	A name to identify this selection method by. Stored in the result.
`verbose`	Default: 3. A number between 0 and 3 for the amount of progress messages to give. This function only prints progress messages if the value is 3.

The selection of sub-networks is based on the correlation between each pair of interactors, calculated separately using the samples within each class. The difference between the two classes' average correlations is scaled by the variability of the correlations within each class.

More formally, let C_{i,j} be the correlation of the j-th edge using all samples belonging to class i. Then, let mean(C_{i,*}) be defined as (C_{i,1} + C_{i,2} + ... + C_{i,e}) / e, where e is the number of edges in the sub-network being considered. Also, let mean(C_{*,*}), the overall average correlation, be (mean(C_{1,*}) + mean(C_{2,*})) / 2. Then, the between-class sum-of-squares (BSS) is (mean(C_{1,*}) - mean(C_{*,*}))^2 + (mean(C_{2,*}) - mean(C_{*,*}))^2. The within-class sum-of-squares (WSS) is sum(sum((C_{i,j} - mean(C_{i,*}))^2, j is 1 to e), i is 1 to 2). The sub-networks are ranked in decreasing order of BSS/WSS.
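
The ranking statistic can be illustrated for a single sub-network with a short base R sketch. The function name `subnetworkScore` is hypothetical and is not part of ClassifyR; the sketch follows the BSS/WSS definitions in the paragraph above and assumes Pearson correlation, exactly two classes, and features stored in rows. It is not the package's internal code.

```r
# Hypothetical helper (not part of ClassifyR): BSS/WSS for one sub-network.
# measurements: numeric matrix, rows are features, columns are samples.
# edges: two-column character matrix, one binary interaction per row.
# classes: factor with exactly two levels, one element per sample.
subnetworkScore <- function(measurements, edges, classes)
{
  # C[i, j] is the correlation of edge j using only the samples of class i.
  C <- do.call(rbind, lapply(levels(classes), function(classLevel)
  {
    inClass <- classes == classLevel
    apply(edges, 1, function(edge)
      cor(measurements[edge[1], inClass], measurements[edge[2], inClass]))
  }))
  classMeans <- rowMeans(C)        # mean(C_{i,*}) for i = 1, 2
  overallMean <- mean(classMeans)  # mean(C_{*,*})
  BSS <- sum((classMeans - overallMean)^2)
  WSS <- sum((C - classMeans)^2)   # classMeans is recycled down the two rows
  BSS / WSS
}

# Hub A and its three interactors, using the values from the Examples section.
measurements <- matrix(c(5.7, 10.1, 6.9, 7.7, 8.8, 9.1, 11.2, 6.4, 7.0, 5.5,
                         5.6, 9.6, 7.0, 8.4, 10.8, 12.2, 8.1, 5.7, 5.4, 12.1,
                         4.5, 9.0, 6.9, 7.0, 7.3, 6.9, 7.8, 7.9, 5.7, 8.7,
                         8.1, 10.6, 7.4, 7.1, 10.4, 6.1, 7.3, 2.7, 11.0, 9.1),
                       ncol = 10, byrow = TRUE,
                       dimnames = list(LETTERS[1:4], paste("Patient", 1:10)))
classes <- factor(rep(c("Good", "Poor"), each = 5))
hubA <- matrix(c('A', 'A', 'A', 'B', 'C', 'D'), ncol = 2)

score <- subnetworkScore(measurements, hubA, classes)
```

A larger score means the edge correlations differ more between the classes relative to how much they vary within each class. A sub-network with a single edge has a WSS of zero, so the ratio is not finite in that case.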

The classifier specified by trainParams and predictParams is used to calculate resubstitution error rates using the transformation of the data set provided by metaFeatures. The set of top-ranked sub-networks which gives the lowest resubstitution error rate is finally selected.
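
The resubstitution step can be sketched in base R as follows. Everything here is an assumption made for illustration: the ranking, the sub-network names, the meta-feature values (simplified to one meta-feature per sub-network), and the nearest-centroid stand-in classifier `balancedResubError`, which is not a ClassifyR function. The sketch only shows the idea of trying the top 1, 2, ... ranked sub-networks and keeping the set with the lowest error when training and testing on the same samples.

```r
# Hypothetical ranking (best first) and one meta-feature per sub-network.
ranked <- c("A Hub", "G Hub", "K Hub")
classes <- factor(rep(c("Good", "Poor"), each = 4))
metaFeatures <- rbind(`A Hub` = c(1, 1, 1, 3, 3, 3, 3, 1),
                      `G Hub` = c(0, 0, 0, -10, 0, 0, 0, 10),
                      `K Hub` = c(0.1, -0.1, 0.1, -0.1, -0.1, 0.1, -0.1, 0.1))

# Stand-in classifier (an assumption): nearest class centroid, trained and
# evaluated on the same samples, scored by balanced error.
balancedResubError <- function(features, classes)
{
  centroids <- sapply(levels(classes), function(classLevel)
    rowMeans(features[, classes == classLevel, drop = FALSE]))
  centroids <- matrix(centroids, nrow = nrow(features))  # features x classes
  predicted <- apply(features, 2, function(sample)
    levels(classes)[which.min(colSums((centroids - sample)^2))])
  # Balanced error: average of the per-class error rates.
  mean(sapply(levels(classes), function(classLevel)
    mean(predicted[classes == classLevel] != classLevel)))
}

# Try the top 1, 2 and 3 sub-networks; keep the set with the lowest error.
sizes <- 1:3
errors <- sapply(sizes, function(k)
  balancedResubError(metaFeatures[ranked[1:k], , drop = FALSE], classes))
selected <- ranked[seq_len(sizes[which.min(errors)])]
```

In this toy data the top sub-network alone misclassifies one sample per class, while the top two classify every sample correctly, so the top two are kept. `which.min` returns the first minimum, so the sketch prefers the smallest set on ties; that tie-breaking rule is a choice made here, not a documented behaviour of the package.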

Data tables which consist entirely of non-numeric data cannot be analysed. If measurements is an object of class MultiAssayExperiment, the factor of sample classes must be stored in the DataFrame accessible by the colData function with column name "class".

An object of class SelectResult, or a list of such objects if the classifier used for determining the specified performance metric made several varieties of prediction.

Dario Strbenac

Network-based biomarkers enhance classical approaches to prognostic gene expression signatures, Rebecca L Barter, Sarah-Jane Schramm, Graham J Mann and Yee Hwa Yang, 2014, BMC Systems Biology, Volume 8 Supplement 4 Article S5, https://bmcsystbiol.biomedcentral.com/articles/10.1186/1752-0509-8-S4-S5.

interactorDifferences for an example of a function which can turn the measurements into meta-features for classification.

  library(ClassifyR)

  # Two hub sub-networks, each a two-column matrix with one interaction per row.
  networksList <- list(`A Hub` = matrix(c('A', 'A', 'A', 'B', 'C', 'D'), ncol = 2),
                       `G Hub` = matrix(c('G', 'G', 'G', 'H', 'I', 'J'), ncol = 2))
  netSets <- FeatureSetCollection(networksList)

  # Differential correlation for sub-network with hub A.
  # Rows A to D are fixed values; rows E to J are random noise.
  set.seed(1) # Makes the random rows reproducible.
  measurements <- matrix(c(5.7, 10.1, 6.9, 7.7, 8.8, 9.1, 11.2, 6.4, 7.0, 5.5,
                           5.6, 9.6, 7.0, 8.4, 10.8, 12.2, 8.1, 5.7, 5.4, 12.1,
                           4.5, 9.0, 6.9, 7.0, 7.3, 6.9, 7.8, 7.9, 5.7, 8.7,
                           8.1, 10.6, 7.4, 7.1, 10.4, 6.1, 7.3, 2.7, 11.0, 9.1,
                           round(rnorm(60, 8, 1), 1)), ncol = 10, byrow = TRUE)
  classes <- factor(rep(c("Good", "Poor"), each = 5))

  rownames(measurements) <- LETTERS[1:10]
  colnames(measurements) <- names(classes) <- paste("Patient", 1:10)

  # Turn the measurements into meta-features based on the interactions in netSets.
  Idifferences <- interactorDifferences(measurements, netSets)

  # The features are sub-networks and there are only two in this example.
  resubstituteParams <- ResubstituteParams(nFeatures = 1:2,
                                performanceType = "balanced error", better = "lower")
  
  predictParams <- PredictParams(NULL)
  networkCorrelationsSelection(measurements, classes, metaFeatures = Idifferences,
                               featureSets = netSets, datasetName = "Example",
                               trainParams = TrainParams(naiveBayesKernel),
                               predictParams = predictParams,
                               resubstituteParams = resubstituteParams)