View source: R/sampledModules.R
This function repeatedly resamples the samples (rows) in supplied data and identifies modules on the resampled data.
sampledBlockwiseModules(
  datExpr,
  nRuns,
  startRunIndex = 1,
  endRunIndex = startRunIndex + nRuns - 1,
  replace = FALSE,
  fraction = if (replace) 1.0 else 0.63,
  randomSeed = 12345,
  checkSoftPower = TRUE,
  nPowerCheckSamples = 2000,
  skipUnsampledCalculation = FALSE,
  corType = "pearson",
  power = 6,
  networkType = "unsigned",
  saveTOMs = FALSE,
  saveTOMFileBase = "TOM",
  ...,
  verbose = 2, indent = 0)
datExpr |
Expression data. A matrix (preferred) or data frame in which columns are genes and rows are samples. |
nRuns |
Number of network construction and module identification runs. |
startRunIndex |
Number to be assigned to the first run. The run number (index) is used only to make saved file names unique; it has no effect on the results of the run. |
endRunIndex |
Number (index) of the last run. If given, |
replace |
Logical: should samples (observations or rows in entries in |
fraction |
Fraction of samples to sample for each run. |
randomSeed |
Integer specifying the random seed. If non-NULL, the random number generator state is saved before the seed is set
and restored at the end of the function. If |
checkSoftPower |
Logical: should the soft-thresholding power be adjusted so that the connectivity distribution of the sampled data set approximately matches that of the full data set? |
nPowerCheckSamples |
Number of genes to be sampled from the full data set to calculate connectivity and match soft-thresholding powers. |
skipUnsampledCalculation |
Logical: should the calculation on the original (not resampled) data be skipped? |
corType |
Character string specifying the correlation to be used. Allowed values are (unique
abbreviations of) |
power |
Soft-thresholding power for network construction. |
networkType |
Network type. Allowed values are (unique abbreviations of) |
saveTOMs |
Logical: should the networks (topological overlaps) be saved for each run? Note that for large data sets (tens of thousands of nodes) the TOM files are rather large. |
saveTOMFileBase |
Character string giving the base of the file names for TOMs. The actual file names will consist of a
concatenation of |
... |
Other arguments to |
verbose |
Integer level of verbosity. Zero means silent; higher values make the output progressively more verbose. |
indent |
Indentation for diagnostic messages. Zero means no indentation; each unit adds two spaces. |
For each run, samples (but not genes) are randomly sampled to obtain a perturbed data set. A full network analysis and module identification are then carried out on the perturbed data, and the results are returned in a list with one component per run.
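The resampling step can be illustrated with a minimal base R sketch. It is not the package's implementation: the helper name `resampleRows` is hypothetical, and rounding `fraction * nrow(datExpr)` to obtain the number of drawn samples is an assumption. It only shows rows being drawn while all columns (genes) are kept, with the random number generator state restored afterwards as described for `randomSeed`.

```r
# Hypothetical helper: draw one perturbed data set by sampling rows only.
resampleRows <- function(datExpr, replace = FALSE,
                         fraction = if (replace) 1.0 else 0.63,
                         randomSeed = 12345) {
  if (!is.null(randomSeed)) {
    # Save the current RNG state (if any) and restore it on exit.
    if (exists(".Random.seed", envir = .GlobalEnv)) {
      savedSeed <- get(".Random.seed", envir = .GlobalEnv)
      on.exit(assign(".Random.seed", savedSeed, envir = .GlobalEnv),
              add = TRUE)
    }
    set.seed(randomSeed)
  }
  nSamples <- nrow(datExpr)
  # Assumption: the number of drawn samples is the rounded fraction.
  nDraw <- round(fraction * nSamples)
  idx <- sample(nSamples, size = nDraw, replace = replace)
  # Keep every gene (column); only the samples (rows) change.
  list(data = datExpr[idx, , drop = FALSE], samples = idx)
}

# Toy data: 100 samples (rows) by 20 genes (columns).
set.seed(1)
datExpr <- matrix(rnorm(100 * 20), nrow = 100, ncol = 20)
run <- resampleRows(datExpr)
dim(run$data)
```

With the default `replace = FALSE`, each run sees a subset of distinct samples; with `replace = TRUE`, the default `fraction` of 1 gives a bootstrap-style sample of the same size as the input.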
For each run, the soft-thresholding power can optionally be adjusted such that the mean adjacency in the re-sampled data set equals the mean adjacency in the original data.
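The idea of matching mean adjacency can be sketched in base R as follows. This is an illustration under stated assumptions, not the package's code: it assumes an unsigned network with Pearson correlation, so that the adjacency of two genes is `|cor|^power`, it uses all genes rather than a subset of `nPowerCheckSamples` genes, and the helper names `meanAdjacency` and `matchPower` are hypothetical.

```r
# Mean unsigned adjacency |cor|^power over all distinct gene pairs.
meanAdjacency <- function(datExpr, power) {
  corMat <- cor(datExpr)
  mean(abs(corMat[upper.tri(corMat)])^power)
}

# Find the power at which the resampled data have the same mean adjacency
# as the full data at the nominal power.
matchPower <- function(fullData, sampledData, power = 6) {
  target <- meanAdjacency(fullData, power)
  # The search interval is an assumption chosen for this toy example.
  uniroot(function(p) meanAdjacency(sampledData, p) - target,
          interval = c(0.1, 100), tol = 1e-8)$root
}

# Toy data with a shared factor so that genes are correlated.
set.seed(42)
nSamples <- 60
nGenes <- 30
shared <- rnorm(nSamples)
fullData <- outer(shared, runif(nGenes, 0.5, 1)) +
  matrix(rnorm(nSamples * nGenes), nSamples, nGenes)
sampledData <- fullData[sample(nSamples, round(0.63 * nSamples)), ]

adjustedPower <- matchPower(fullData, sampledData, power = 6)
adjustedPower
```

Because correlations estimated from fewer samples tend to be noisier, the matched power generally differs somewhat from the nominal one; the `powers` component of the return value records what was actually used in each run.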
A list with one component per run. Each component is a list with the following components:
mods |
The output of the function |
samples |
Indices of the samples selected for the resampled data set in this run. |
powers |
Actual soft-thresholding powers used in this run. |
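As a rough illustration of working with this return structure, the sketch below builds a small mock list with the three documented components and collects information across runs. The mock values are invented for illustration only, the helper `mockRun` is hypothetical, and the use of a `colors` element inside `mods` to hold module labels is an assumption about the output of the underlying module identification function.

```r
# Hypothetical constructor for one run with the documented components.
mockRun <- function(labels, samples, power) {
  # Assumption: module labels are stored in mods$colors.
  list(mods = list(colors = labels), samples = samples, powers = power)
}

# Mock result for 3 runs on 5 genes; real output would come from
# sampledBlockwiseModules().
result <- list(
  mockRun(c(1, 1, 2, 2, 0), c(2, 5, 7), 6.0),
  mockRun(c(1, 1, 1, 2, 0), c(1, 3, 7), 5.8),
  mockRun(c(2, 2, 1, 1, 0), c(4, 5, 6), 6.3)
)

# Module labels with genes in rows and runs in columns.
labelMatrix <- sapply(result, function(run) run$mods$colors)

# Soft-thresholding power actually used in each run.
powersUsed <- sapply(result, function(run) run$powers)

# How often each sample index was drawn across runs.
sampleCounts <- table(unlist(lapply(result, function(run) run$samples)))
```

A label matrix of this shape is a convenient starting point for judging how stable each gene's module assignment is across resampling runs. Note that labels from different runs are not guaranteed to correspond to one another without an additional matching step.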
Peter Langfelder
An application of this function is described in the motivational example section of
Langfelder P, Horvath S (2012) Fast R Functions for Robust Correlations and Hierarchical Clustering. Journal of Statistical Software 46(11) 1-17; PMID: 23050260 PMCID: PMC3465711
blockwiseModules
for the underlying network analysis and module identification;
sampledHierarchicalConsensusModules
for a similar resampling analysis of consensus networks.