cca: Main function for CCA.

ccaR Documentation

Main function for CCA.

Description

Perform a correlational class analysis of the data, resulting in a partition of the data into separate modules. This consists of four steps:

  1. Create a matrix G of absolute row correlations. This is the network adjacency matrix.

  2. Set statistically insignificant correlations to 0 to reduce noise.

  3. Use igraph's leading.eigenvector.community to partition this network into modules.

  4. Return an object describing the resulting class assignments (as well as the separate data frames describing the individual modules.)

Usage

cca(dtf, filter.significance = TRUE, filter.value = 0.01, 
    zero.action = c("drop", "ownclass"), verbose = TRUE)

Arguments

dtf

The data frame containing the variables of interest.

filter.significance

Significance filtering sets "insignificant" ties to 0 to decrease noise and increase stability. Simulation results show that this greatly increases accuracy in many settings. Set filter.significance = FALSE to disable this.

filter.value

Minimum significance cutoff. Absolute row correlations below this value will be set to 0.

zero.action

What to do with 0-variance rows before partitioning the graph. If zero.action is "drop", CCA drop rows with 0 variance from the analyses (default). If zero.action is "ownclass", the correlations between 0-variance rows and all other rows is set to 0, and the correlations between all pairs of 0-var rows are set to 1. This effectively creates a "zero class".

verbose

Whether to print details of what CCA is doing to the screen.

Value

membership

The class assignments produced by CCA.

cormat

The row correlation matrix that was partitioned by CCA. It has a "dtf" attribute which holds the dataframe. Note that, if 0-variance were dropped, they will be excluded from the dataframe as well as the correlation matrix. The "zeros" attribute which holds the indexes of the dropped rows.

modules

For convenience, the dataframe is separated into the modules found by the algorithm. A separate dataframe for each module i can be found in modules[[i]]$dtf. The matrix of column correlations are in modules[[i]]$cormat. modules[[i]]$degenerate indicates whether this module contains undefined. Note that these modules can be plotted via the S3 plot method. See example below.

Author(s)

Andrei Boutyline, aboutyl@umich.edu.

References

Boutyline, Andrei. 2017. "Improving the Measurement of Shared Cultural Schemas with Correlational Class Analysis: Theory and Method." Sociological Science 4:353-93. https://www.sociologicalscience.com/articles-v4-15-353/

See Also

plot.cca

Examples

    data(cca.example)
    res1 <- cca(cca.example) # with igraph 0.7, this should find 3 classes of sizes 218 391 144.  
    plot(res1, 1) # plot them 
    plot(res1, 2)
    plot(res1, 3)

    print (round(res1$modules[[1]]$cormat, 2)) # examine the correlation matrix for the 1st module
    print (summary(res1$modules[[1]]$dtf)) # look at its variable ranges
    plot(res1, 1, bw = TRUE) # Plot it again in a more journal-friendly format.
    
    # now let's try setting the filter value too high
    res2 <- cca(cca.example, filter.value = 0.001) 
    # With igraph 0.7, the above now finds 17 classes 
    #    of sizes 216 1 1 371 1 1 1 1 1 1 1 1 11 141 1 1 2 
    # The small isolate classes can either be dropped manually, or by increasing filter.value 


corclass documentation built on Sept. 8, 2023, 5:50 p.m.