#' Non-Graphical Cattell's Scree Test
#'
#' The \code{nScree} function returns an analysis of the number of components or
#' factors to retain in an exploratory principal component or factor analysis.
#' The function also returns information about the number of components/factors
#' to retain according to the Kaiser rule and to parallel analysis.
#'
#' The \code{nScree} function returns an analysis of the number of
#' components/factors to retain in an exploratory principal component or factor
#' analysis. Different solutions are given. The classical ones are the Kaiser
#' rule, parallel analysis, and the usual scree test
#' (\code{\link{plotuScree}}). Non-graphical solutions to the Cattell
#' subjective scree test are also proposed: an acceleration factor (\emph{af})
#' and the optimal coordinates index (\emph{oc}). The acceleration factor
#' indicates where the elbow of the scree plot appears: it corresponds to the
#' acceleration of the curve, i.e. its second derivative. The optimal
#' coordinates are the extrapolated values of each eigenvalue: a
#' component/factor is retained as long as the observed eigenvalue lies above
#' its extrapolation. The extrapolation of the \eqn{i^{th}} eigenvalue is
#' obtained from a linear regression through the coordinates of the
#' \eqn{(i+1)^{th}} eigenvalue and of the last eigenvalue, giving \eqn{k-2}
#' such regression lines for \eqn{k} eigenvalues. With both the acceleration
#' factor and the optimal coordinates, the Kaiser rule or a parallel analysis
#' criterion (\code{\link{parallel}}) must also be satisfied simultaneously for
#' a component/factor to be retained.
#'
#' Let \eqn{\lambda_i} be the \eqn{i^{th}} eigenvalue and \eqn{LS_i} a location
#' statistic such as the mean or a centile (generally the \eqn{1^{st}}, the
#' \eqn{5^{th}}, the \eqn{95^{th}}, or the \eqn{99^{th}}).
#'
#' The Kaiser rule is computed as: \deqn{ n_{Kaiser} = \sum_{i} (\lambda_{i}
#' \ge \bar{\lambda}).} Note that \eqn{\bar{\lambda}} is equal to 1 when a
#' correlation matrix is used.
#'
#' The parallel analysis is computed as: \deqn{n_{parallel} = \sum_{i}
#' (\lambda_{i} \ge LS_i).}
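#'
#' As an illustrative sketch (assuming \code{eig} is a vector of eigenvalues
#' and \code{aparallel} the vector of corresponding location statistics), these
#' two counts reduce to simple sums; \code{nScree} performs this bookkeeping
#' internally:
#' \preformatted{
#' nKaiser   <- sum(eig >= mean(eig)) # mean(eig) equals 1 for a correlation matrix
#' nParallel <- sum(eig >= aparallel) # eigenvalues above the random reference values
#' }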
#'
#' The acceleration factor (\eqn{AF}) corresponds to a numerical solution to
#' the elbow of the scree plot: \deqn{n_{AF} \equiv \ If \ \left[ (\lambda_{i}
#' \ge LS_i) \ and \ max(AF_i) \right].}
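#'
#' In R, this second derivative is simply the second-order difference of the
#' eigenvalues. A minimal sketch, assuming a numeric vector \code{eig} and
#' padding with \code{NA} so that \code{af[i]} matches the \eqn{i^{th}}
#' eigenvalue:
#' \preformatted{
#' af <- c(NA, diff(eig, differences = 2), NA) # af[i] = eig[i+1] - 2*eig[i] + eig[i-1]
#' }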
#'
#' The optimal coordinates (\eqn{OC}) correspond to an extrapolation of the
#' preceding eigenvalue by a regression line between the \eqn{(i+1)^{th}}
#' eigenvalue coordinates and the last eigenvalue coordinates:
#' \deqn{n_{OC} = \sum_i \left[ (\lambda_i \ge LS_i) \cap (\lambda_i \ge
#' \lambda_{i \ predicted}) \right].}
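#'
#' A minimal sketch of this extrapolation, assuming a numeric vector \code{eig}
#' of \code{nk} eigenvalues (the \eqn{i^{th}} prediction comes from the line
#' through the \eqn{(i+1)^{th}} and the last observed coordinates):
#' \preformatted{
#' nk   <- length(eig)
#' pred <- sapply(1:(nk - 2), function(i) {
#'   fit <- lm(eig[c(i + 1, nk)] ~ c(i + 1, nk)) # line through two points
#'   sum(coef(fit) * c(1, i))                    # predicted eigenvalue at position i
#' })
#' }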
#'
#'
#' @param eig deprecated parameter (use \code{x} instead): eigenvalues to analyse
#' @param x numeric: a \code{vector} of eigenvalues, a \code{matrix} of
#' correlations or of covariances or a \code{data.frame} of data
#' @param aparallel numeric: results of a parallel analysis. Defaults to
#' eigenvalues fixed at \eqn{\lambda \ge \bar{\lambda}} (Kaiser and related
#' rule) or at \eqn{\lambda \ge 0} (CFA analysis)
#' @param cor logical: if \code{TRUE} computes eigenvalues from a correlation
#' matrix, else from a covariance matrix
#' @param model character: \code{"components"} or \code{"factors"}
#' @param criteria numeric: by default fixed at \eqn{\bar{\lambda}}. When the
#' \eqn{\lambda}s are computed from a principal component analysis on a
#' correlation matrix, it corresponds to the usual Kaiser \eqn{\lambda \ge 1}
#' rule. On a covariance matrix or from a factor analysis, it is simply the
#' mean. To apply \eqn{\lambda \ge 0}, sometimes used with factor analysis,
#' set the criteria to \eqn{0}.
#' @param ... variable: additional parameters passed to the \code{cor} or
#' \code{cov} functions
#'
#'
#' @return
#' \item{Components}{Data frame for the number of components/factors according
#' to the different rules}
#' \item{Components$noc}{Number of components/factors to retain according to
#' the optimal coordinates \emph{oc}}
#' \item{Components$naf}{Number of components/factors to retain according to
#' the acceleration factor \emph{af}}
#' \item{Components$nparallel}{Number of components/factors to retain according
#' to parallel analysis}
#' \item{Components$nkaiser}{Number of components/factors to retain according
#' to the Kaiser rule}
#' \item{Analysis}{Data frame of vectors linked to the different rules}
#' \item{Analysis$Eigenvalues}{Eigenvalues}
#' \item{Analysis$Prop}{Proportion of variance accounted for by each eigenvalue}
#' \item{Analysis$Cumu}{Cumulative proportion of variance accounted for by the
#' eigenvalues}
#' \item{Analysis$Par.Analysis}{Centiles of the random eigenvalues generated by
#' the parallel analysis}
#' \item{Analysis$Pred.eig}{Eigenvalues predicted by each optimal coordinates
#' regression line}
#' \item{Analysis$OC}{Critical optimal coordinates \emph{oc}}
#' \item{Analysis$Acc.factor}{Acceleration factor \emph{af}}
#' \item{Analysis$AF}{Critical acceleration factor \emph{af}}
#' The \code{summary} method returns a summary of the analysis.
#'
#' @author Gilles Raiche \cr Centre sur les Applications des Modeles de
#' Reponses aux Items (CAMRI) \cr Universite du Quebec a Montreal\cr
#' \email{raiche.gilles@@uqam.ca}
#'
#' @seealso \code{\link{plotuScree}}, \code{\link{plotnScree}},
#' \code{\link{parallel}}, \code{\link{plotParallel}}
#'
#' @references
#' Cattell, R. B. (1966). The scree test for the number of factors.
#' \emph{Multivariate Behavioral Research, 1}, 245-276.
#'
#' Dinno, A. (2009). \emph{Gently clarifying the application of Horn's parallel
#' analysis to principal component analysis versus factor analysis}. Portland,
#' Oregon: Portland State University.
#'
#' Guttman, L. (1954). Some necessary conditions for common factor analysis.
#' \emph{Psychometrika, 19}, 149-162.
#'
#' Horn, J. L. (1965). A rationale for the number of factors in factor
#' analysis. \emph{Psychometrika, 30}, 179-185.
#'
#' Kaiser, H. F. (1960). The application of electronic computer to factor
#' analysis. \emph{Educational and Psychological Measurement, 20}, 141-151.
#'
#' Raiche, G., Walls, T. A., Magis, D., Riopel, M. and Blais, J.-G. (2013).
#' Non-graphical solutions for Cattell's scree test. \emph{Methodology, 9}(1),
#' 23-29.
#'
# #' @family nScree
#' @export
#' @importFrom stats lm coef
#' @keywords multivariate
#' @examples
#'
#' ## INITIALISATION
#' data(dFactors) # Load the nFactors dataset
#' attach(dFactors)
#' vect <- Raiche # Uses the example from Raiche
#' eigenvalues <- vect$eigenvalues # Extracts the observed eigenvalues
#' nsubjects <- vect$nsubjects # Extracts the number of subjects
#' variables <- length(eigenvalues) # Computes the number of variables
#' rep <- 100 # Number of replications for the parallel analysis
#' cent <- 0.95 # Centile value of the parallel analysis
#'
#' ## PARALLEL ANALYSIS (qevpea for the centile criterion, mevpea for the
#' ## mean criterion)
#' aparallel <- parallel(var = variables,
#'                       subject = nsubjects,
#'                       rep = rep,
#'                       cent = cent)$eigen$qevpea # The 95th centile
#'
#' ## NUMBER OF FACTORS RETAINED ACCORDING TO DIFFERENT RULES
#' results <- nScree(x=eigenvalues, aparallel=aparallel)
#' results
#' summary(results)
#'
#' ## PLOT ACCORDING TO THE nScree CLASS
#' plotnScree(results)
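#'
#' ## ACCESS THE RETAINED NUMBERS DIRECTLY
#' ## (columns of the Components data frame described in the Value section)
#' results$Components$noc # Optimal coordinates
#' results$Components$naf # Acceleration factor
#' results$Components$nkaiser # Kaiser rule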
#'
"nScree" <-
function(eig=NULL, x=eig, aparallel = NULL, cor=TRUE, model="components", criteria=NULL, ...) {
  # Initialisation
  eig <- eigenComputes(x, cor=cor, model=model, ...)
  if (is.null(aparallel)) aparallel <- rep(1, length(eig)) # default to 1 (diagonal of a correlation matrix)
  nk <- length(eig)
  k <- 1:nk
  proportion <- eig/sum(eig)
  cumulative <- proportion
  if (is.null(criteria)) criteria <- mean(eig)
  for (i in 2:nk) cumulative[i] <- cumulative[i-1] + proportion[i]
  proportion[proportion < 0] <- 0 # Constrain negative proportions to zero
  cond1 <- TRUE; cond2 <- TRUE; i <- 0; pred.eig <- af <- rep(NA, nk)
  while ((cond1 == TRUE) && (cond2 == TRUE) && (i < nk)) {
    i <- i + 1
    ind <- k[c(i+1, nk)]
    #### Optimal coordinate: predict eigenvalue i from the regression line through
    #### the (i+1)th and the last eigenvalue coordinates (scree)
    vp.p <- lm(eig[c(i+1, nk)] ~ ind)
    vp.prec <- pred.eig[i] <- sum(c(1, i) * coef(vp.p))
    cond1 <- (eig[i] >= vp.prec)      # Observed eigenvalue at least its prediction
    cond2 <- (eig[i] >= aparallel[i]) # and at least the parallel analysis criterion
    nc <- i - 1
  }
  # Second derivative at the i-th eigenvalue (acceleration factor, elbow)
  # See Yakowitz and Szidarovszky (1986, p. 84)
  tag <- 1
  for (j in 2:(nk-1)) {
    if (eig[j-1] >= aparallel[j-1]) {
      af[j] <- (eig[j+1] - 2*eig[j]) + eig[j-1]
    }
  }
  if (model == "components") p.vec <- which(eig >= aparallel, TRUE) else p.vec <- which((eig-aparallel) >= 0 & eig >= criteria)
  ###if (model == "components") p.vec <- which(eig >= aparallel,TRUE) else p.vec <- which((eig-aparallel)>=0 & eig > 0)
  # Count only the leading run of components/factors that satisfy the parallel
  # analysis criterion consecutively from the first one
  npar <- sum(p.vec == (1:length(p.vec)))
  nkaiser <- sum(eig >= rep(criteria, nk))
  #### if (model == "components") nkaiser <- sum(eig >= rep(criteria,nk)) else nkaiser <- sum(eig >= rep(0,nk))
  #### if (model == "components") nkaiser <- sum(eig >= rep(1,nk)) else nkaiser <- sum(eig >= rep(mean(eig),nk))
  # Retain one component/factor less than the position of maximum acceleration (the elbow)
  naf <- which(af == max(af, na.rm=TRUE), TRUE) - 1
  # Ensure that all the optimal coordinates are computed
  for (i in (nc+1):(nk-2)) {
    ind <- k[c(i+1, nk)]
    vp.p <- lm(eig[c(i+1, nk)] ~ ind)
    vp.prec <- pred.eig[i] <- sum(c(1, i) * coef(vp.p))
  }
  # Ensure that all the acceleration factors are computed
  for (j in 2:(nk-1)) af[j] <- (eig[j+1] - 2*eig[j]) + eig[j-1]
  # Values returned by the function
  coc <- rep("", nk); coc[nc]  <- "(< OC)"
  caf <- rep("", nk); caf[naf] <- "(< AF)"
  result <- list(Components = data.frame(noc = nc,
                                         naf = naf,
                                         nparallel = npar,
                                         nkaiser = nkaiser),
                 Analysis = data.frame(Eigenvalues = eig,
                                       Prop = proportion,
                                       Cumu = cumulative,
                                       Par.Analysis = aparallel,
                                       Pred.eig = pred.eig,
                                       OC = coc,
                                       Acc.factor = af,
                                       AF = caf),
                 Model = model)
  class(result) <- 'nScree'
  return(result)
}