corfs.network: Network construction using the partial correlation based...

View source: R/corfs.network.R

Network construction using the partial correlation based forward regression or FBEDR Documentation

Network construction using the partial correlation based forward regression of FBED

Description

Network construction using the partial correlation based forward regression or FBED.

Usage

corfs.network(x, threshold = 0.05, tolb = 2, tolr = 0.02, stopping = "BIC", 
symmetry = TRUE, nc = 1) 

corfbed.network(x, threshold = 0.05, symmetry = TRUE, nc = 1)

Arguments

x

A matrix with continuous data.

threshold

Threshold ( suitable values in (0, 1) ) for assessing p-values significance. Default value is 0.05.

tolb

The difference in the BIC bewtween two successive values. By default this is is set to 2. If for example, the BIC difference between two succesive models is less than 2, the process stops and the last variable, even though significant does not enter the model.

tolr

The difference in the adjusted R^2 bewtween two successive values. By default this is is set to 0.02. If for example, the difference between the adjusted R^2 of two succesive models is less than 0.02, the process stops and the last variable, even though significant does not enter the model.

stopping

The stopping rule. The "BIC" is the default value, but can change to "ar2" and in this case the adjusted R^2 is used. If you want both of these criteria to be satisified, type "BICR2".

symmetry

In order for an edge to be added, a statistical relationship must have been found from both directions. If you want this symmetry correction to take place, leave this boolean variable to TRUE. If you set it to FALSE, then if a relationship between Y and X is detected but not between X and Y, the edge is still added.

nc

How many cores to use. This plays an important role if you have many variables, say thousands or so. You can try with nc = 1 and with nc = 4 for example to see the differences. If you have a multicore machine, this is a must option. There was an extra argument for plotting the skeleton but it does not work with the current visualisation packages, hence we removed the argument. Use plotnetwork to plot the skeleton.

Details

In the MMHC algorithm (see mmhc.skel), the MMPC or SES algorithms are run for every variable. Hence, one can use forward regression for each variable and this is what we are doing here. Partial correlation forward regression is very efficient, since only correlations are being calculated.

Value

A list including:

runtime

The run time of the algorithm. A numeric vector. The first element is the user time, the second element is the system time and the third element is the elapsed time.

density

The number of edges divided by the total possible number of edges, that is #edges / n(n-1)/2, where n is the number of variables.

info

Some summary statistics about the edges, minimum, maximum, mean, median number of edges.

ntests

The number of tests MMPC (or SES) performed at each variable.

G

The adjancency matrix. A value of 1 in G[i, j] appears in G[j, i] also, indicating that i and j have an edge between them.

Bear in mind that the values can be extracted with the $ symbol, i.e. this is an S3 class output.

Author(s)

Michail Tsagris

R implementation and documentation: Michail Tsagris mtsagris@uoc.gr

References

Sanford Weisberg (2014). Applied Linear Regression. Hoboken NJ: John Wiley, 4th edition.

Draper N.R. and Smith H. (1988). Applied regression analysis. New York, Wiley, 3rd edition.

Tsamardinos, Brown and Aliferis (2006). The max-min hill-climbing Bayesian network structure learning algorithm. Machine learning, 65(1), 31-78.

Borboudakis G. and Tsamardinos I. (2019). Forward-backward selection with early dropping. Journal of Machine Learning Research, 20(8): 1-39.

See Also

mmhc.skel, pc.skel

Examples

# simulate a dataset with continuous data
dataset <- matrix(runif(400 * 20, 1, 100), ncol = 20 ) 
a1 <- mmhc.skel(dataset, max_k = 3, threshold = 0.05, test = "testIndFisher", 
nc = 1) 
a2 <- corfs.network(dataset, threshold = 0.05, tolb = 2, tolr = 0.02, stopping = "BIC", 
symmetry = TRUE, nc = 1) 
a1$runtime  
a2$runtime 

MXM documentation built on Aug. 25, 2022, 9:05 a.m.