simdat: Example Dataset

simdat    R Documentation

Example Dataset

Description

A simulated data set whose data matrix Y is generated from the latent factor model

Y = n^{1/2}U D V' + E Σ^{1/2}

where D and Σ are diagonal matrices, U and V are orthogonal, and V' denotes the transpose of V. The factors comprise one giant factor, five useful factors, one harmful factor and one undetectable factor. For details of the simulation method, see Appendix A.1 of Owen and Wang (2015), "Bi-cross-validation for factor analysis", http://arxiv.org/abs/1503.03515.
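
The model above can be sketched at a small scale in base R. This is only an illustration and not the procedure used to build simdat: the dimensions, the diagonal values d and sigma, the use of QR to obtain orthonormal columns, and the assumption that E has i.i.d. standard normal entries are all made up for the example.

```r
# Illustrative sketch of Y = n^{1/2} U D V' + E Sigma^{1/2}.
# All sizes and values below are hypothetical, not those of simdat.
set.seed(1)
n <- 20   # number of samples (rows of Y)
p <- 50   # number of variables (columns of Y)
k <- 3    # number of factors

# Matrices with orthonormal columns, obtained from the QR decomposition
# of Gaussian matrices (an assumed construction).
U <- qr.Q(qr(matrix(rnorm(n * k), n, k)))
V <- qr.Q(qr(matrix(rnorm(p * k), p, k)))

d     <- c(3, 2, 1)             # diagonal entries of D (made up)
sigma <- runif(p, 0.5, 1.5)     # diagonal entries of Sigma (made up)
E     <- matrix(rnorm(n * p), n, p)  # noise, assumed i.i.d. N(0, 1)

# Low-rank signal plus heteroscedastic noise
Y <- sqrt(n) * U %*% diag(d) %*% t(V) + E %*% diag(sqrt(sigma))
dim(Y)
```

Multiplying E on the right by diag(sqrt(sigma)) scales each column, so every variable gets its own noise variance, which is what the diagonal Σ expresses.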

Details

The dataset is a list with the following components:

  • Y a 200 by 1000 data matrix, where each row is a sample and each column is a variable.

  • U the orthogonal factor matrix U of size 200 by 8.

  • V the orthogonal factor matrix V of size 1000 by 8.

  • D the vector of diagonal entries of D.

  • Sigma the vector of diagonal entries of Σ.

  • oracle.r the oracle rank (the optimal number of factors that should be kept) of the factor matrix.
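
A minimal sketch of loading the dataset and inspecting the components listed above, assuming the esaBcv package is installed. The checks only restate the dimensions documented on this page.

```r
# Load the example dataset from the esaBcv package (must be installed).
data(simdat, package = "esaBcv")

# Overview of the list components: Y, U, V, D, Sigma, oracle.r
str(simdat, max.level = 1)

# Dimensions as documented: Y is 200 by 1000, U is 200 by 8, V is 1000 by 8
dim(simdat$Y)
dim(simdat$U)
dim(simdat$V)

# D and Sigma are stored as vectors of diagonal entries, not as matrices
length(simdat$D)
length(simdat$Sigma)

# Oracle rank of the factor matrix
simdat$oracle.r
```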


esaBcv documentation built on June 30, 2022, 5:05 p.m.
