# pickSoftThreshold: Analysis of scale free topology for soft-thresholding in WGCNA: Weighted Correlation Network Analysis

## Analysis of scale free topology for soft-thresholding

### Description

Analysis of scale free topology for multiple soft thresholding powers. The aim is to help the user pick an appropriate soft-thresholding power for network construction.

### Usage

```r
pickSoftThreshold(
  data,
  dataIsExpr = TRUE,
  weights = NULL,
  RsquaredCut = 0.85,
  powerVector = c(seq(1, 10, by = 1), seq(12, 20, by = 2)),
  removeFirst = FALSE, nBreaks = 10, blockSize = NULL,
  corFnc = cor, corOptions = list(use = 'p'),
  networkType = "unsigned",
  moreNetworkConcepts = FALSE,
  gcInterval = NULL,
  verbose = 0, indent = 0)

pickSoftThreshold.fromSimilarity(
  similarity,
  RsquaredCut = 0.85,
  powerVector = c(seq(1, 10, by = 1), seq(12, 20, by = 2)),
  removeFirst = FALSE, nBreaks = 10, blockSize = 1000,
  moreNetworkConcepts = FALSE,
  verbose = 0, indent = 0)
```

### Arguments

| Argument | Description |
| --- | --- |
| `data` | Expression data in a matrix or data frame. Rows correspond to samples and columns to genes. |
| `dataIsExpr` | Logical: should the data be interpreted as expression (or other numeric) data, or as a similarity matrix of network nodes? |
| `weights` | Optional observation weights for `data` to be used in correlation calculation. A matrix of the same dimensions as `data`, containing non-negative weights. Only used with Pearson correlation. |
| `similarity` | Similarity matrix: a symmetric matrix with entries between 0 and 1 and unit diagonal. The only transformation applied to `similarity` is raising it to a power. |
| `RsquaredCut` | Desired minimum scale free topology fitting index R^2. |
| `powerVector` | A vector of soft-thresholding powers for which the scale free topology fit indices are to be calculated. |
| `removeFirst` | Logical: should the first bin be removed from the connectivity histogram? |
| `nBreaks` | Number of bins in connectivity histograms. |
| `blockSize` | Block size into which the calculation of connectivity should be broken up. If not given, a suitable value will be calculated using function `blockSize` and printed if `verbose>0`. If R runs into memory problems, decrease this value. |
| `corFnc` | The correlation function to be used in adjacency calculation. |
| `corOptions` | A list giving further options to the correlation function specified in `corFnc`. |
| `networkType` | Network type. Allowed values are (unique abbreviations of) `"unsigned"`, `"signed"`, `"signed hybrid"`. See `adjacency`. |
| `moreNetworkConcepts` | Logical: should additional network concepts be calculated? If `TRUE`, the function will calculate how the network density, the network heterogeneity, and the network centralization depend on the power. For the definition of these additional network concepts, see Horvath and Dong (2008), PLoS Comput Biol. |
| `gcInterval` | A number specifying the interval (in terms of individual genes) at which garbage collection will be performed. The actual interval will never be less than `blockSize`. |
| `verbose` | Integer level of verbosity. Zero means silent; higher values make the output progressively more verbose. |
| `indent` | Indentation for diagnostic messages. Zero means no indentation; each unit adds two spaces. |
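A typical call might look like the following minimal sketch. The simulated matrix `expr`, its dimensions, and the chosen argument values are illustrative assumptions, not recommendations. The call is skipped when WGCNA is not installed.

```r
# Simulated expression data (an assumption for illustration):
# 30 samples in rows, 100 genes in columns, as `data` expects.
set.seed(42)
expr <- matrix(rnorm(30 * 100), nrow = 30,
               dimnames = list(NULL, paste0("gene", 1:100)))

sft <- NULL
if (requireNamespace("WGCNA", quietly = TRUE)) {
  sft <- WGCNA::pickSoftThreshold(
    expr,
    dataIsExpr = TRUE,
    RsquaredCut = 0.85,
    powerVector = c(1:10, seq(12, 20, by = 2)),
    networkType = "signed",
    verbose = 0
  )
  print(sft$powerEstimate)  # may be NA for random data such as this
  print(sft$fitIndices)
}
```

Random data has no real co-expression structure, so the fit may never reach `RsquaredCut` here; real expression data would normally be used in place of `expr`.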

### Details

The function calculates weighted networks either by interpreting `data` directly as similarity, or first transforming it to similarity of the type specified by `networkType`. The weighted networks are obtained by raising the similarity to the powers given in `powerVector`. For each power the scale free topology fit index is calculated and returned along with other information on connectivity.
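The steps described above can be sketched in base R as follows. This is a simplified illustration, not the package's implementation: the helper `scale_free_fit`, the simulated data, the equal-width binning via `cut`, and the use of plain (unadjusted) R^2 are all assumptions made for the example.

```r
# Simplified illustration of a scale-free topology fit for several powers.
# NOT the WGCNA implementation; all helper and column names are hypothetical.
set.seed(1)
n_samples <- 40
n_genes <- 150
expr <- matrix(rnorm(n_samples * n_genes), nrow = n_samples)  # samples x genes

# Unsigned similarity: absolute Pearson correlation between genes (columns)
similarity <- abs(cor(expr, use = "p"))

scale_free_fit <- function(similarity, power, n_breaks = 10) {
  adjacency <- similarity^power        # soft thresholding
  diag(adjacency) <- 0                 # ignore self-connections
  k <- rowSums(adjacency)              # connectivity of each gene
  bins <- cut(k, breaks = n_breaks)    # equal-width connectivity bins
  mean_k <- tapply(k, bins, mean)      # mean connectivity per bin (NA if empty)
  p_k <- as.vector(table(bins)) / length(k)  # fraction of genes per bin
  keep <- !is.na(mean_k) & p_k > 0
  # Scale-free topology implies log10(p(k)) is roughly linear in log10(k)
  fit <- lm(log10(p_k[keep]) ~ log10(mean_k[keep]))
  c(power = power,
    r_squared = summary(fit)$r.squared,  # plain R^2, for simplicity
    slope = unname(coef(fit)[2]),
    mean_k = mean(k))
}

powers <- c(1:10, seq(12, 20, by = 2))
fit_table <- as.data.frame(
  t(sapply(powers, function(p) scale_free_fit(similarity, p)))
)
print(fit_table)
```

Higher powers shrink all off-diagonal adjacencies, so mean connectivity falls as the power grows; the fit index shows how closely the resulting connectivity distribution follows a power law.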

On systems with multiple cores or processors, `pickSoftThreshold` uses parallel processing, provided that `enableWGCNAThreads` has been called beforehand to allow parallel processing and set up the parallel calculation back-end.
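A hedged sketch of that order of calls is shown here. The thread count, the simulated data, and the short power vector are arbitrary choices for illustration, and the block does nothing when WGCNA is not installed.

```r
sft_parallel <- NULL
if (requireNamespace("WGCNA", quietly = TRUE)) {
  # Enable threads BEFORE calling pickSoftThreshold; 2 is an arbitrary choice
  WGCNA::enableWGCNAThreads(nThreads = 2)

  set.seed(7)
  expr <- matrix(rnorm(30 * 100), nrow = 30)  # simulated: samples x genes
  sft_parallel <- WGCNA::pickSoftThreshold(expr, powerVector = 1:6, verbose = 0)

  # Return to single-threaded operation afterwards
  WGCNA::disableWGCNAThreads()
}
```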

### Value

A list with the following components:

| Component | Description |
| --- | --- |
| `powerEstimate` | Estimate of an appropriate soft-thresholding power: the lowest power for which the scale free topology fit R^2 exceeds `RsquaredCut`. If R^2 is below `RsquaredCut` for all powers, `NA` is returned. |
| `fitIndices` | A data frame containing the fit indices for scale free topology. The columns contain the soft-thresholding power, adjusted R^2 for the linear fit, the linear coefficient, adjusted R^2 for a more complicated fit model, mean connectivity, median connectivity and maximum connectivity. If input `moreNetworkConcepts` is `TRUE`, three additional columns contain network density, centralization, and heterogeneity. |
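The selection rule for `powerEstimate` can be illustrated with a small base R sketch. The table `fit_indices`, its column names, and the helper `pick_power` are hypothetical stand-ins that follow the description above; they are not the package's code or its actual column names.

```r
# Hypothetical fit table standing in for the `fitIndices` component
fit_indices <- data.frame(
  power     = c(1, 2, 3, 4, 5, 6),
  r_squared = c(0.10, 0.35, 0.62, 0.81, 0.88, 0.91)
)

# Lowest power whose fit R^2 exceeds the cut-off, or NA if none does
pick_power <- function(fit_indices, r_squared_cut = 0.85) {
  ok <- fit_indices$r_squared > r_squared_cut
  if (any(ok)) min(fit_indices$power[ok]) else NA
}

pick_power(fit_indices)                        # 5
pick_power(fit_indices, r_squared_cut = 0.95)  # NA
```

When `NA` comes back, one option is to inspect the fit table directly and choose a power by judgment, for example where the fit index levels off.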

### Author(s)

Steve Horvath and Peter Langfelder

### References

Zhang B, Horvath S (2005) A General Framework for Weighted Gene Co-Expression Network Analysis. Statistical Applications in Genetics and Molecular Biology 4(1): Article 17

Horvath S, Dong J (2008) Geometric Interpretation of Gene Coexpression Network Analysis. PLoS Comput Biol 4(8): e1000117

### See Also

`adjacency`, `softConnectivity`