PCA: Automatic module detection with PCA


Description

Perform a principal component analysis to recover the module structure of a cloned network.

Usage

PCA(data, N = NULL, assignments, around = 5, rep = 1000, quantile = .95, rotate = "oblimin", stoppingrule = c("easystop", "parallel", "optimal"))

Arguments

data

Numeric matrix. The data, with one column per cloned node.

N

Integer. The dimension of the original network. Only relevant if "easystop" or "optimal" is selected as a stopping rule.

assignments

Vector of integers. The correct clustering to recover, as returned by the function Assign; it is used to compute the adjusted Rand index. Only relevant if "optimal" is selected as a stopping rule. See Details.

around

Integer. If "optimal" is selected as a stopping rule, several numbers of components are considered, from N-around to N+around. See Details.

rep

Integer. Number of iterations for the parallel analysis.

quantile

Numeric in [0,1]. Quantile for the parallel analysis. Conventional values are .95 or .99.

rotate

Character string. The kind of rotation. An "oblimin" rotation is suggested.

stoppingrule

Vector of strings. Specify the stopping rule to determine the number of components. Can include one or more methods among "easystop", "parallel", and "optimal".

Details

The function performs a principal component analysis and then assigns each node to the module corresponding to its highest component loading.
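The assignment step can be illustrated with a minimal base-R sketch. This is not the package's code: the helper name assign_by_loading is hypothetical, loadings are compared in absolute value (an assumption), and varimax() from the stats package stands in for the suggested "oblimin" rotation so that the example needs no extra packages.

```r
# Hypothetical helper, not part of the package: assign each column of `data`
# to the component on which it has the largest absolute loading.
assign_by_loading <- function(data, n_components) {
  eig <- eigen(cor(data), symmetric = TRUE)
  keep <- seq_len(n_components)
  # Unrotated loadings: eigenvectors scaled by the square roots of the eigenvalues.
  loadings <- eig$vectors[, keep, drop = FALSE] %*%
    diag(sqrt(eig$values[keep]), nrow = n_components)
  if (n_components > 1) {
    # Assumption: varimax (base R) replaces the oblimin rotation here.
    loadings <- unclass(varimax(loadings, normalize = FALSE)$loadings)
  }
  apply(abs(loadings), 1, which.max)
}

# Toy data: two independent latent variables, three noisy indicators each.
set.seed(1)
n <- 200
f1 <- rnorm(n)
f2 <- rnorm(n)
toy <- cbind(replicate(3, f1 + rnorm(n, sd = 0.3)),
             replicate(3, f2 + rnorm(n, sd = 0.3)))

modules <- assign_by_loading(toy, n_components = 2)
print(modules)
```

In this toy example the first three columns should share one label and the last three the other; which label goes with which group is arbitrary.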

The number of components can be determined in three ways. If "easystop" is included in stoppingrule, the correct number of components (as specified by N) is kept. If "parallel" is included in stoppingrule, the number of components is determined using parallel analysis. If "optimal" is included in stoppingrule, several numbers of components are considered in the neighborhood of N, from N-around to N+around, and the number that yields the best adjusted Rand index is retained (see adjustedRandIndex).
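As a rough illustration of the "parallel" rule, the sketch below implements one common form of parallel analysis in base R: observed eigenvalues of the correlation matrix are compared with a chosen quantile of the eigenvalues obtained from random normal data of the same dimensions. The helper name parallel_n_components is hypothetical, and the package may rely on a different implementation; only the roles of rep and quantile are taken from the arguments described above.

```r
# Hypothetical helper, not part of the package: count the leading components
# whose observed eigenvalue exceeds the chosen quantile of the eigenvalues
# obtained from random normal data with the same number of rows and columns.
parallel_n_components <- function(data, rep = 1000, quantile = 0.95) {
  n <- nrow(data)
  p <- ncol(data)
  observed <- eigen(cor(data), symmetric = TRUE, only.values = TRUE)$values
  # One column of simulated eigenvalues per iteration (a p x rep matrix).
  simulated <- replicate(rep, {
    random <- matrix(rnorm(n * p), nrow = n, ncol = p)
    eigen(cor(random), symmetric = TRUE, only.values = TRUE)$values
  })
  # Threshold for each eigenvalue position, taken across the iterations.
  threshold <- apply(simulated, 1, stats::quantile, probs = quantile)
  exceeds <- observed > threshold
  # Keep components up to (but not including) the first one that fails.
  if (all(exceeds)) p else which(!exceeds)[1] - 1
}

# Toy data: two independent latent variables, three noisy indicators each.
set.seed(1)
n <- 200
f1 <- rnorm(n)
f2 <- rnorm(n)
toy <- cbind(replicate(3, f1 + rnorm(n, sd = 0.3)),
             replicate(3, f2 + rnorm(n, sd = 0.3)))

n_components <- parallel_n_components(toy, rep = 100, quantile = 0.95)
print(n_components)
```

A smaller rep is used here only to keep the example fast; more iterations give a more stable threshold.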

Value

Matrix of integers, with one row per cloned node (that is, one row per column of the input matrix data) and one column per stopping rule. Each integer identifies the module to which the node is assigned. The numbers themselves are arbitrary labels on a nominal scale: what matters is only whether two nodes are in the same module.
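Because the labels are nominal, two assignment vectors should be compared with a measure that ignores the label values, such as the adjusted Rand index used by the "optimal" rule. The sketch below is a base-R version of the standard Hubert and Arabie formula; the helper name adjusted_rand is hypothetical and is offered only as an illustration, not as the function the package calls.

```r
# Hypothetical base-R version of the adjusted Rand index (Hubert and Arabie).
# It depends only on which pairs of nodes share a module, not on label values.
adjusted_rand <- function(a, b) {
  tab <- table(a, b)
  sum_cells <- sum(choose(tab, 2))          # pairs together in both clusterings
  sum_rows <- sum(choose(rowSums(tab), 2))  # pairs together in `a`
  sum_cols <- sum(choose(colSums(tab), 2))  # pairs together in `b`
  expected <- sum_rows * sum_cols / choose(sum(tab), 2)
  maximum <- (sum_rows + sum_cols) / 2
  # Note: undefined (division by zero) when both clusterings are trivial.
  (sum_cells - expected) / (maximum - expected)
}

truth <- c(1, 1, 1, 2, 2, 2)
relabeled <- c(2, 2, 2, 1, 1, 1)   # same partition, labels swapped
different <- c(1, 1, 2, 2, 3, 3)   # a different partition

ari_same <- adjusted_rand(truth, relabeled)
ari_diff <- adjusted_rand(truth, different)
print(c(ari_same, ari_diff))
```

Swapping the labels leaves the index at 1, while a genuinely different partition scores lower.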

Author(s)

Giulio Costantini


GiulioCostantini/TOMproject documentation built on May 6, 2019, 6:29 p.m.