# UVA: Unique Variable Analysis

In EGAnet: Exploratory Graph Analysis - A Framework for Estimating the Number of Dimensions in Multivariate Data Using Network Psychometrics

## Description

Identifies redundant variables in a multivariate dataset using several association methods and types of significance values (see Christensen, Garrido, & Golino, 2020 for more details).

## Usage

```r
UVA(
  data,
  n = NULL,
  model = c("glasso", "TMFG"),
  corr = c("cor_auto", "pearson", "spearman"),
  method = c("cor", "pcor", "wTO"),
  type = c("adapt", "alpha", "threshold"),
  sig,
  key = NULL,
  reduce = TRUE,
  reduce.method = c("latent", "remove", "sum"),
  lavaan.args = list(),
  adhoc = TRUE,
  plot.redundancy = FALSE,
  plot.args = list()
)
```

## Arguments

- `data`: Matrix or data frame. Input can be either raw data or a correlation matrix.
- `n`: Numeric. If the input in `data` is a correlation matrix, then the sample size is required. Defaults to `NULL`.
- `model`: Character. A string indicating the network estimation method to use. Current options are:
  - `glasso`: Estimates the Gaussian graphical model using the graphical LASSO, with the extended Bayesian information criterion used to select the optimal regularization parameter. This is the default method.
  - `TMFG`: Estimates a Triangulated Maximally Filtered Graph.
- `corr`: Type of correlation matrix to compute. The default uses `cor_auto`. Current options are:
  - `cor_auto`: Computes the correlation matrix using the `cor_auto` function from `qgraph`.
  - `pearson`: Computes Pearson's correlation coefficient using pairwise complete observations via the `cor` function.
  - `spearman`: Computes Spearman's correlation coefficient using pairwise complete observations via the `cor` function.
- `method`: Character. The association measure to compute: weighted topological overlap (`"wTO"`, using `EBICglasso`), partial correlations (`"pcor"`), or correlations (`"cor"`). Defaults to `"wTO"`.
- `type`: Character. Type of significance. Computes significance using the standard p-value (`"alpha"`), the adaptive alpha p-value (`"adapt"`; see `adapt.a`), or a fixed threshold (`"threshold"`). Defaults to `"adapt"`.
- `sig`: Numeric. p-value for significance of overlap (defaults to `.05`). When `type = "threshold"`, the default depends on `method`:
  - `"wTO"`: .20
  - `"pcor"`: .20
  - `"cor"`: .70
- `key`: Character vector. Variable descriptions in the same order as the variables in `data`. Defaults to `NULL`, in which case the column names of `data` are used.
- `reduce`: Boolean. Should redundancy reduction be performed? Defaults to `TRUE`. Set to `FALSE` for redundancy analysis only.
- `reduce.method`: Character. How should the data be reduced? Defaults to `"latent"`.
  - `"latent"`: Redundant variables are combined into a latent variable.
  - `"remove"`: All but one redundant variable are removed.
  - `"sum"`: Redundant variables are combined by summing across cases (rows).
- `lavaan.args`: List. If `reduce.method = "latent"`, then `lavaan`'s `cfa` function is used to create the latent variables that replace the redundant variables. Arguments should be input as a list. Some example arguments (see `lavOptions` for full details):
  - `estimator`: Estimator to use for latent variables (see Estimators for more details). Defaults to `"MLR"` for continuous data and `"WLSMV"` for mixed and categorical data. Data are considered continuous if they have 6 or more categories (see Rhemtulla, Brosseau-Liard, & Savalei, 2012).
  - `missing`: How missing data should be handled. Defaults to `"fiml"`.
  - `std.lv`: If `TRUE`, the metric of each latent variable is determined by fixing its (residual) variance to 1.0. If `FALSE`, the metric of each latent variable is determined by fixing the factor loading of the first indicator to 1.0. If there are multiple groups, `std.lv = TRUE`, and `"loadings"` is included in the `group.equal` argument, then only the latent variances of the first group are fixed to 1.0, while the latent variances of the other groups are set free. Defaults to `TRUE`.
- `adhoc`: Boolean. Should an adhoc check of redundancies be performed? Defaults to `TRUE`. If `TRUE`, the adhoc check runs the redundancy analysis on the reduced variable set to determine whether any redundancies remain. This check is performed with the arguments `method = "wTO"`, `type = "threshold"`, and `sig = .20`. It is based on Christensen, Garrido, and Golino's (2020) simulation, where these parameters were found to be the most conservative, demonstrating few false positives and false negatives.
- `plot.redundancy`: Boolean. Should redundancies be plotted in a network plot? Defaults to `FALSE`.
- `plot.args`: List. Arguments to be passed on to `ggnet2`. Defaults:
  - `vsize = 6`
  - `alpha = 0.4`
  - `label.size = 5`
  - `edge.alpha = 0.7`
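To make the `"wTO"` option and the `"threshold"` type more concrete, the following is a minimal base R sketch of the weighted topological overlap described by Nowick et al. (2009), applied to a small hand-made weight matrix. The helper name `wto_sketch` and the toy matrix are assumptions made for illustration. This is not EGAnet's internal code, which computes the overlap on the estimated `EBICglasso` network and may differ in detail.

```r
# Hypothetical helper (not part of EGAnet): weighted topological overlap
# for a symmetric, possibly signed, weight matrix A.
wto_sketch <- function(A) {
  A <- as.matrix(A)
  diag(A) <- 0                              # ignore self-connections
  shared <- A %*% A                         # sum over u of a_iu * a_uj
  k <- rowSums(abs(A))                      # node strength (absolute weights)
  denom <- outer(k, k, pmin) + 1 - abs(A)   # min(k_i, k_j) + 1 - |a_ij|
  W <- (shared + A) / denom
  diag(W) <- 0
  W
}

# Toy network: three variables, all pairwise weights equal to 0.5
A <- matrix(0.5, nrow = 3, ncol = 3,
            dimnames = list(c("V1", "V2", "V3"), c("V1", "V2", "V3")))
W <- wto_sketch(A)

# Pairs at or above the documented default threshold for "wTO" (.20)
flagged <- which(abs(W) >= 0.20 & upper.tri(W), arr.ind = TRUE)
```

In this toy case every pair shares one neighbor, so each overlap is (0.25 + 0.5) / (1 + 1 - 0.5) = 0.5 and all three pairs are flagged at the .20 threshold.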

## Value

Returns a list:

- `redundancy`: A list containing several objects:
  - `redundant`: Vectors nested within the list. Each vector holds the nodes that are redundant with the node named by that list element.
  - `data`: Original data.
  - `correlation`: Correlation matrix of the original data.
  - `weights`: Weights determined by weighted topological overlap, partial correlation, or zero-order correlation.
  - `network`: If `method = "wTO"`, then the network computed following `EGA` with `EBICglasso` network estimation.
  - `plot`: If `plot.redundancy = TRUE`, then a plot of all redundancies found.
  - `descriptives`:
    - `basic`: A vector containing the mean, standard deviation, median, median absolute deviation (MAD), 3 times the MAD, 6 times the MAD, minimum, maximum, and critical value for the overlap measure (i.e., weighted topological overlap, partial correlation, or threshold).
    - `centralTendency`: A matrix of all (absolute) non-zero values with their respective standard deviation from the mean and median absolute deviation from the median.
  - `method`: Returns the `method` argument.
  - `type`: Returns the `type` argument.
  - `distribution`: If `type != "threshold"`, then the distribution that was used to determine significance.
- `reduced`: If `reduce = TRUE`, then a list containing:
  - `data`: New data with redundant variables merged or removed.
  - `merged`: A matrix containing the variables that were decided to be redundant with one another.
  - `method`: Method used to perform redundancy reduction.
- `adhoc`: If `adhoc = TRUE`, then the adhoc check, containing the same objects as the `redundancy` list in the output.
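As a rough illustration of how a `redundant` list could map onto reduced data for the `"remove"` and `"sum"` options, here is a base R sketch. The helper `reduce_sketch`, the toy data, and the choice to keep the named node as the surviving column are all assumptions for illustration. EGAnet's own reduction step is interactive and may name or select columns differently, and the `"latent"` option (which relies on `lavaan`) is not covered here.

```r
# Hypothetical helper (not part of EGAnet): collapse redundant variables.
# `redundant` is a named list; each name is a target variable and each
# element is a character vector of variables redundant with that target.
reduce_sketch <- function(data, redundant, how = c("remove", "sum")) {
  how <- match.arg(how)
  data <- as.data.frame(data)
  for (target in names(redundant)) {
    group <- c(target, redundant[[target]])
    if (how == "sum") {
      # Combine the redundant set by summing across cases (rows)
      data[[target]] <- rowSums(data[, group, drop = FALSE])
    }
    # In both cases, only the target column is kept
    keep <- setdiff(names(data), redundant[[target]])
    data <- data[, keep, drop = FALSE]
  }
  data
}

toy <- data.frame(a = 1:3, b = 4:6, c = 7:9)
red <- list(a = "b")   # assume "b" was flagged as redundant with "a"

summed  <- reduce_sketch(toy, red, how = "sum")
removed <- reduce_sketch(toy, red, how = "remove")
```

With the toy data, `summed` keeps columns `a` and `c` with `a` equal to the row-wise sum of the original `a` and `b`, while `removed` simply drops `b`.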

## Author(s)

Alexander Christensen <alexpaulchristensen@gmail.com>

## References

Simulation using `UVA`:
Christensen, A. P., Garrido, L. E., & Golino, H. (2020). Unique Variable Analysis: A novel approach for detecting redundant variables in multivariate data. PsyArXiv. doi: 10.31234/osf.io/4kra2

Implementation of `UVA` (formerly `node.redundant`):
Christensen, A. P., Golino, H., & Silvia, P. J. (2020). A psychometric network perspective on the validity and validation of personality trait questionnaires. European Journal of Personality, 34, 1095-1108. doi: 10.1002/per.2265

wTO measure:
Nowick, K., Gernat, T., Almaas, E., & Stubbs, L. (2009). Differences in human and chimpanzee gene expression patterns define an evolving network of transcription factors in brain. Proceedings of the National Academy of Sciences, 106, 22358-22363. doi: 10.1073/pnas.0911376106

Selection of CFA estimator:
Rhemtulla, M., Brosseau-Liard, P. E., & Savalei, V. (2012). When can categorical variables be treated as continuous? A comparison of robust continuous and categorical SEM estimation methods under suboptimal conditions. Psychological Methods, 17, 354-373. doi: 10.1037/a0029315

## Examples

```r
# Select Five Factor Model personality items only
idx <- na.omit(match(gsub("-", "", unlist(psychTools::spi.keys[1:5])), colnames(psychTools::spi)))
items <- psychTools::spi[, idx]

# Change names in redundancy output to each item's description
key.ind <- match(colnames(items), as.character(psychTools::spi.dictionary$item_id))
key <- as.character(psychTools::spi.dictionary$item[key.ind])

if (interactive()) {
  UVA(
    data = items, method = "wTO", type = "adapt",
    key = key, reduce.method = "latent"
  )
}
```

EGAnet documentation built on Feb. 17, 2021, 1:06 a.m.