
Application of a PCovR analysis consists of the following steps: preprocessing the data, running PCovR analyses with different numbers of components and/or weighting parameter values, performing model selection, and rotating the retained solution for easier interpretation.


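| Argument | Description |
| --- | --- |
| `X` | Dataframe containing predictor scores |
| `Y` | Dataframe containing criterion scores |
| `modsel` | Model selection procedure (`"seq"`, `"seqRcv"`, `"seqAcv"`, or `"sim"`) |
| `Rmin` | Lowest number of components considered |
| `Rmax` | Highest number of components considered |
| `R` | Number of components (overrules `Rmin` and `Rmax`) |
| `weight` | Weighting values considered |
| `rot` | Rotation criterion (`"varimax"`, `"quartimin"`, `"targetT"`, `"targetQ"`, `"wvarim"`, `"promin"`, or `"none"`) |
| `target` | Target matrix for target rotation (components x predictor variables) |
| `prepX` | Preprocessing of predictor scores: standardizing (`"stand"`) or centering only (`"cent"`) |
| `prepY` | Preprocessing of criterion scores: standardizing (`"stand"`) or centering only (`"cent"`) |
| `ratio` | Ratio of the estimated error variances of the predictor block and the criterion block |
| `fold` | Number of roughly equal-sized parts into which the data are split for cross-validation (default: `"LeaveOneOut"`) |
| `zeroloads` | Number of near-zero loadings of the target for Simplimax rotation |
| `x` | An object of the type produced by `pcovr` |
| `cpal` | Vector of colors used for model selection plots |
| `lpal` | Vector of line types used for model selection plots |
| `...` | Further graphical arguments |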

The PCovR package includes two preprocessing options, which can be applied to `X` and/or `Y`. Specifically, it is possible to only center the data (`prepX="cent"`, `prepY="cent"`). However, the default option is to standardize the data (`prepX="stand"`, `prepY="stand"`), which implies that `X` and/or `Y` are centered and normalized (i.e., each variable has a mean of zero and a standard deviation of one).
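The difference between the two preprocessing options can be illustrated with base R. This is a minimal sketch on made-up data, not the package's internal code; in particular, `scale()` divides by the sample standard deviation (denominator *N* - 1), and the package's own normalization may use a different denominator.

```r
# Toy predictor block (illustrative values only, not package data).
X <- data.frame(a = c(1, 2, 3, 4), b = c(10, 20, 40, 50))

# Centering only: each column gets mean zero, the original spread is kept.
X_cent <- scale(X, center = TRUE, scale = FALSE)

# Standardizing: each column gets mean zero and unit standard deviation.
# Assumption: scale() uses the sample SD (denominator N - 1).
X_stand <- scale(X, center = TRUE, scale = TRUE)

colMeans(X_cent)        # column means after centering
apply(X_cent, 2, sd)    # spread is unchanged by centering
apply(X_stand, 2, sd)   # spread after standardizing
```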

The fastest and therefore default model selection setting (`modsel="seq"`) implies a sequential procedure in which the weighting value is determined on the basis of maximum likelihood principles (Vervloet, Van Deun, Van den Noortgate, & Ceulemans, 2013), while taking the weighting values entered by the user (i.e., specified with the parameter `weight`) into account. Specifically, if the maximum likelihood weighting value does not equal one of the entered values, the entered weighting value that is closest to it (in absolute sense) is used. Note that the error variance ratio is estimated by default with the function `ErrorRatio`, but can be specified otherwise with the parameter `ratio`. However, this is only possible for datasets with more observations than predictor variables. Among all models with the selected weighting value and a number of components between `Rmin` and `Rmax`, the solution with the highest scree test (*st*) value is picked (Cattell, 1966; Wilderjans, Ceulemans, & Meers, 2012). However, models for which the fit is less than 1% better than the fit of a less complex model are excluded. Note that the assessment of the optimal number of components can be overruled if one is only interested in solutions with a particular number of components. In particular, when the input parameter `R` is specified, `Rmin` and `Rmax` are ignored, and the specified number of components is used when running the analysis and determining the weighting value.
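The two selection steps of the sequential procedure can be sketched in base R. Both helpers below (`closest_weight` and `scree_ratio`) are hypothetical names written for illustration, not functions of the package. The scree ratio follows the usual CHull-style definition, and the 1% rule is read here as "improvement relative to the fit of the preceding model", which is an assumption about how the package applies it.

```r
# Hypothetical helper: pick the entered weighting value that is closest
# (in absolute difference) to the maximum likelihood value.
closest_weight <- function(weight, alpha_ml) {
  weight[which.min(abs(weight - alpha_ml))]
}

# Hypothetical helper: scree test ratio for the interior models,
# st_i = (f_i - f_{i-1}) / (f_{i+1} - f_i).
# Models that improve on the preceding model by less than 1% get NA.
scree_ratio <- function(fit) {
  n <- length(fit)
  st <- rep(NA_real_, n)
  for (i in seq_len(n)[-c(1, n)]) {
    gain <- fit[i] - fit[i - 1]
    if (gain >= 0.01 * fit[i - 1]) {
      st[i] <- gain / (fit[i + 1] - fit[i])
    }
  }
  st
}

# Illustrative inputs (made-up numbers).
chosen <- closest_weight(c(0.1, 0.5, 0.9), alpha_ml = 0.62)
fit <- c(0.40, 0.70, 0.75, 0.78)   # fit for 1 to 4 components
st <- scree_ratio(fit)
best <- which.max(st)              # position of the retained model

chosen
st
best
```

With these toy values the elbow lies at the second model, where the gain in fit drops off most sharply.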

The package also provides two sequential procedures that incorporate a cross-validation step (`modsel="seqRcv"` and `modsel="seqAcv"`). `seqRcv` also starts with the selection of the weighting value based on maximum likelihood principles, but in the next step, the number of components is determined using leave-one-out cross-validation. `seqAcv` is identical to the default procedure, but has an extra step: after the selection of the number of components, leave-one-out cross-validation is applied to choose the weighting value.

The simultaneous procedure (`modsel="sim"`) performs leave-one-out cross-validation for all considered weighting values (`weight`; by default, 100 values between .01 and 1) and all numbers of components between `Rmin` (default: 1) and `Rmax` (default: number of predictors divided by 3). The weighting parameter value and number of components that maximize the cross-validation fit are retained. Note that the parameter `fold` can be used to alter the number of roughly equal-sized parts into which the data are split for cross-validation (Hastie, Tibshirani, & Friedman, 2001). The default value of `fold` is `"LeaveOneOut"`, implying that the data are split into *N* parts, where *N* is the number of observations.
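How `fold` partitions the observations can be sketched as follows. `make_folds` is a hypothetical helper written for this illustration; the package's own splitting code may assign observations differently.

```r
# Hypothetical helper: assign N observations to `fold` roughly equal parts.
# fold = "LeaveOneOut" is treated as N parts of one observation each.
make_folds <- function(N, fold = "LeaveOneOut") {
  k <- if (identical(fold, "LeaveOneOut")) N else fold
  # Repeat the part labels until N are filled, then shuffle them.
  sample(rep(seq_len(k), length.out = N))
}

set.seed(1)
sizes <- table(make_folds(10, fold = 3))   # sizes of the three parts
loo <- make_folds(10)                      # leave-one-out: ten parts

sizes
length(unique(loo))
```

Each part is left out once; the model is fitted on the remaining parts and used to predict the criterion scores of the left-out part.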

The rotation criteria that are implemented in the PCovR package are `varimax`, `quartimin`, `targetT`, `targetQ`, `wvarim`, and `promin`. One can also request the original solution by typing `rot="none"`. Target rotation (Browne, 1972) orthogonally rotates the loading matrix towards a target matrix (`target`) that is specified by the user. Note that Simplimax requires the specification of a number of zero elements. By default, `zeroloads` equals the number of predictors.
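What an orthogonal rotation does to a loading matrix can be illustrated with `varimax()` from base R's stats package. This is only a sketch on made-up loadings, not the rotation code used by the package. Note that `stats::varimax` expects predictors in rows, i.e., the transpose of the components x predictors layout described for `Px`.

```r
# Illustrative loading matrix: 6 predictors x 2 components (made-up values).
L <- matrix(c(0.7, 0.6, 0.8, 0.3, 0.2, 0.1,
              0.4, 0.5, 0.3, 0.7, 0.8, 0.6), ncol = 2)

# Varimax rotation from the stats package (no Kaiser normalization).
vm <- varimax(L, normalize = FALSE)

# Apply the rotation matrix explicitly.
rotated <- L %*% vm$rotmat

# The rotation matrix is orthogonal ...
ortho_ok <- isTRUE(all.equal(crossprod(vm$rotmat), diag(2)))
# ... so each predictor's sum of squared loadings is unchanged.
comm_ok <- isTRUE(all.equal(rowSums(L^2), rowSums(rotated^2)))

round(rotated, 2)
c(ortho_ok, comm_ok)
```

Because the rotation only redistributes the loadings over the components, the fit of the solution is unaffected; only its interpretability changes.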

The interpretation of the obtained solution usually starts with the interpretation of the loading matrix. Specifically, the components are labeled by considering what the predictors with the highest loadings (in absolute sense) have in common. Given these labels, the regression weights can be interpreted.
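As a small aid to this labeling step, the predictors can be grouped by the component on which they load highest. The loading matrix below is made up for illustration and uses the components x predictors layout described for `Px`.

```r
# Illustrative loading matrix (components x predictors); values are made up.
Px <- matrix(c(0.8,  0.7, 0.1, -0.2,
               0.1, -0.2, 0.9, -0.6),
             nrow = 2, byrow = TRUE,
             dimnames = list(c("comp1", "comp2"),
                             c("x1", "x2", "x3", "x4")))

# For each predictor, find the component with the highest absolute loading.
primary <- rownames(Px)[apply(abs(Px), 2, which.max)]

# Group the predictor names by that component.
groups <- split(colnames(Px), primary)
groups
```

Inspecting what the predictors in each group have in common then suggests a label for the corresponding component.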

`pcovr` returns a list that contains the following objects (note that some objects can be empty, depending on the model selection settings used):

| Object | Description |
| --- | --- |
| `Px` | Loading matrix (components x predictor variables) |
| `Py` | Regression weights matrix (components x criterion variables) |
| `Te` | Component score matrix (observations x components) |
| `W` | Component weights matrix (predictor variables x components) |
| `Rx2` | Proportion of explained variance in X |
| `Ry2` | Proportion of explained variance in Y |
| `Qy2` | Cross-validation fit as a function of weighting parameter and number of components (weighting parameter x number of components) |
| `VAFsum` | Weighted sum of the variance accounted for in X and in Y as a function of number of components (1 x number of components) |
| `alpha` | Selected value of the weighting parameter |
| `R` | Selected number of components |
| `modsel` | Model selection procedure that was used |
| `rot` | Rotation criterion that was used |
| `prepX` | Method of preprocessing that was used for the predictor scores |
| `prepY` | Method of preprocessing that was used for the criterion scores |
| `Rvalues` | Numbers of components that were considered |
| `Alphavalues` | Weighting parameter values that were considered |
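The returned objects can be accessed with `$`. The sketch below assumes the PCovR package is installed and uses the `alexithymia` data that ship with it; the guard simply skips the example when the package is unavailable.

```r
# Runs only when the PCovR package is installed.
if (requireNamespace("PCovR", quietly = TRUE)) {
  data(alexithymia, package = "PCovR")
  results <- PCovR::pcovr(alexithymia$X, alexithymia$Y)

  print(results$R)         # selected number of components
  print(results$alpha)     # selected weighting value
  print(dim(results$Px))   # components x predictor variables
  print(dim(results$Te))   # observations x components
}
```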

Marlies Vervloet

Browne, M. W. (1972). Oblique rotation to a partially specified target. British Journal of Mathematical and Statistical Psychology, 25(2), 207-212.

Cattell, R. B. (1966). The scree test for the number of factors. Multivariate Behavioral Research, 1(2), 245-276.

De Jong, S., & Kiers, H. A. (1992). Principal covariates regression: Part I. Theory. Chemometrics and Intelligent Laboratory Systems, 155-164.

Hastie, T., Tibshirani, R., & Friedman, J. (2001). The elements of statistical learning: Data mining, inference and prediction. New York: Springer.

Vervloet, M., Van Deun, K., Van den Noortgate, W., & Ceulemans, E. (2013). On the selection of the weighting parameter value in Principal Covariates Regression. Chemometrics and Intelligent Laboratory Systems.

Vervloet, M., Kiers, H. A., Van den Noortgate, W., & Ceulemans, E. (2015). PCovR: An R package for principal covariates regression. Journal of Statistical Software, 65(8), 1-14. URL http://www.jstatsoft.org/v65/i08/.

Wilderjans, T. F., Ceulemans, E., & Meers, K. (2012). CHull: A generic convex-hull-based model selection method. Behavior Research Methods.

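```
library(PCovR)                                  # provides pcovr() and the example data
data(alexithymia)                               # list with predictor block X and criterion block Y
results <- pcovr(alexithymia$X, alexithymia$Y)  # default settings (modsel = "seq")
summary(results)                                # print a summary of the retained solution
plot(results)                                   # draw the model selection plots
```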
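A variation on the example above, switching to the simultaneous procedure with a small grid. This is a sketch that uses only arguments documented on this page; the particular values of `weight`, `Rmax`, and `fold` are arbitrary choices for illustration, and the guard skips the example when the package is not installed.

```r
# Runs only when the PCovR package is installed.
if (requireNamespace("PCovR", quietly = TRUE)) {
  data(alexithymia, package = "PCovR")

  # Simultaneous selection over three weighting values and one to three
  # components, using 10-fold instead of leave-one-out cross-validation.
  res_sim <- PCovR::pcovr(alexithymia$X, alexithymia$Y,
                          modsel = "sim",
                          weight = c(0.1, 0.5, 0.9),
                          Rmin = 1, Rmax = 3,
                          fold = 10)

  print(res_sim$alpha)   # retained weighting value
  print(res_sim$R)       # retained number of components
}
```

Restricting `weight` and `Rmax` keeps the number of cross-validated models small, which matters because the simultaneous procedure fits every combination.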
