R/BootFromCompromise.R

#' @title
#' \code{BootFromCompromise}: Computes Bootstrap replicates
#' of the (observation) factor scores by
#' creating bootstrapped compromises.
#'
#' @description
#' \code{BootFromCompromise} computes Bootstrap replicates
#' of the observation factor scores from
#' bootstrapped compromises.
#' \code{BootFromCompromise} is typically
#' used to create confidence intervals and to compute
#' Bootstrap ratios.
#'
#' @section Technicalities:
#' The input of \code{BootFromCompromise} is the original
#' \code{cubeOfData} used to compute the compromise
#' by the function \code{distatis}.
#' \code{BootFromCompromise} computes Bootstrap replicates
#' of the observation factor scores by randomly selecting
#' the distance matrices with replacement.
#' The output of \code{BootFromCompromise} is a 3-way
#' array of dimensions "number of observations by number of
#' factors by number
#' of replicates."  The output is typically used to plot
#' confidence intervals
#' (i.e., ellipsoids or convex hulls)
#' or to compute \eqn{t}-like statistics
#' called \emph{bootstrap ratios}.
#'
#' To compute a bootstrapped sample,
#' a set of \eqn{K} distance matrices is
#' selected with replacement from the original set of \eqn{K} distance
#' matrices.
#' A \code{distatis} compromise is then computed and projected on
#' the factor space of the original solution to obtain
#' the bootstrapped factor
#' scores.
#' This approach is also called \emph{total bootstrap}
#' by Lebart (2007;
#' see also Chateau and Lebart, 1996, and Abdi \emph{et al}.,
#' 2009, for an
#' example). Compared to the partial bootstrap (see help for
#' \code{BootFactorScores}),
#' this approach has the disadvantage of being slow, especially for
#' large data sets. However, recent work (Cadoret & Husson, 2012)
#' suggests that the
#' partial bootstrap (i.e., computed from the partial
#' factor scores) could lead
#' to optimistic bootstrap estimates
#' when the number of distance matrices is
#' large, and that it is preferable to use
#' the \emph{total bootstrap} instead.
#'
#' @param LeCube2Distance
#' The array of distances used to call \code{distatis}
#' @param niter The number of bootstrap iterations (default = 1000)
#' @param Norm should be the same as for the original call
#' to \code{distatis}
#' @param Distance should be the same as for the original call to
#' \code{distatis}
#' @param RV should be the same
#' as for the original call to \code{distatis}
#' @param nfact2keep number of factors to keep for the results
#' @return the output is a 3-way array of dimensions
#' "number of observations by
#' number of factors by number of replicates."
#' @author Herve Abdi
#' @seealso \code{\link{BootFactorScores}}
#' \code{\link{GraphDistatisBoot}}.
#' @references Abdi, H., & Valentin, D., (2007).  Some new and easy ways to
#' describe, compare, and evaluate products and assessors.  In D., Valentin,
#' D.Z. Nguyen, L. Pelletier (Eds) \emph{New trends in sensory evaluation of
#' food and non-food products}.  Ho Chi Minh (Vietnam): Vietnam National
#' University-Ho Chi Minh City Publishing House. pp. 5-18.
#'
#' Abdi, H., Dunlop, J.P., & Williams, L.J. (2009). How to compute reliability
#' estimates and display confidence and tolerance intervals for pattern
#' classifiers using the Bootstrap and 3-way multidimensional scaling
#' (DISTATIS). \emph{NeuroImage}, \bold{45}, 89--95.
#'
#' Abdi, H., Williams, L.J., Valentin, D., & Bennani-Dosse, M. (2012). STATIS
#' and DISTATIS: Optimum multi-table principal component analysis and three way
#' metric multidimensional scaling. \emph{Wiley Interdisciplinary Reviews:
#' Computational Statistics}, \bold{4}, 124--167.
#'
#' These papers are available from 
#' \url{https://personal.utdallas.edu/~herve/}
#'
#' Additional references:
#'
#' Cadoret, M., & Husson, F. (2012). Construction and evaluation of confidence
#' ellipses applied at sensory data. \emph{Food Quality and Preference},
#' \bold{28}, 106--115.
#'
#' Chateau, F., & Lebart, L. (1996). Assessing sample variability in the
#' visualization techniques related to principal component analysis: Bootstrap
#' and alternative simulation methods. In A. Prats (Ed.), \emph{Proceedings of
#' COMPSTAT 1996.} Heidelberg: Physica Verlag.
#'
#' Lebart, L. (2007). Which bootstrap for principal axes methods?  In
#' \emph{Selected contributions in data analysis and classification, COMPSTAT
#' 2006}.  Heidelberg: Springer Verlag.
#'
#' @keywords sample bootstrap
#' @examples
#' # 1. Load the Sort data set from the SortingBeer example
#' #    (available from the DistatisR package)
#' data(SortingBeer)
#' # Provide the "8 beers by 10 assessors" results of a sorting task
#' #-----------------------------------------------------------------------------
#' # 2. Create the set of distance matrices (one distance matrix per assessor)
#' #    (uses the function DistanceFromSort)
#' DistanceCube <- DistanceFromSort(Sort)
#'
#' #-----------------------------------------------------------------------------
#' # 3. Call the distatis function with the cube of distance as parameter
#' testDistatis <- distatis(DistanceCube)
#' # The factor scores for the beers are in
#' # testDistatis$res4Splus$F
#' # the partial factor scores for the beers for the assessors are in
#' #  testDistatis$res4Splus$PartialF
#' #
#' # 4. Get the bootstrapped factor scores (with default 1000 iterations)
#' #    Here we use the "total bootstrap"
#' F_fullBoot <- BootFromCompromise(DistanceCube, niter = 1000)
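#'
#' # 5. A minimal sketch (not a DistatisR function; the names BootMean,
#' #    BootSD, BootRatios, and BootCI are illustrative assumptions):
#' #    one common way to summarize the replicates is the bootstrap ratio,
#' #    taken here as the mean of the replicates divided by their
#' #    standard deviation, together with percentile intervals.
#' BootMean   <- apply(F_fullBoot, c(1, 2), mean)  # observations by factors
#' BootSD     <- apply(F_fullBoot, c(1, 2), sd)
#' BootRatios <- BootMean / BootSD
#' # 95% percentile interval per observation and factor
#' # (a "2 by observations by factors" array)
#' BootCI <- apply(F_fullBoot, c(1, 2), quantile, probs = c(0.025, 0.975))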
#'
#' @export
BootFromCompromise <-
function(LeCube2Distance, niter = 1000, Norm = 'MFA',
         Distance = TRUE, RV = TRUE, nfact2keep = 3){
  # Bootstrap confidence intervals for DISTATIS
  # computed with the "total bootstrap": the distance matrices
  # are resampled with replacement, a new compromise is computed,
  # and it is projected onto the original factor space.
  # LeCube2Distance is an nI * nI * nK array
  # with nI # of observations and nK # of distance matrices.
  # niter: how many iterations? default = 1000
  print(c('Starting Full Bootstrap. Iterations #: ', niter), quote = FALSE)

  # First call distatis to get the fixed effect model
  FixedDist <- distatis(LeCube2Distance, Norm = Norm, Distance = Distance,
                        RV = RV, nfact2keep = nfact2keep)
  # Projection matrix
  ProjMat <- FixedDist$res4Splus$ProjectionMatrix
  # Initialize the bootstrap table
  nI <- dim(LeCube2Distance)[1]
  nK <- dim(LeCube2Distance)[3]
  nF <- min(c(dim(ProjMat)[2], nfact2keep))
  FullBootF <- array(0, dim = c(nI, nF, niter))
  rownames(FullBootF) <- rownames(LeCube2Distance)
  colnames(FullBootF) <- paste('Factor', 1:nF)
  # Iterate the bootstrap
  for (n in seq_len(niter)){
    # Resample the nK distance matrices with replacement
    ResBoot_n <- distatis(LeCube2Distance[, , sample(nK, nK, TRUE)],
                          Norm = Norm, Distance = Distance, RV = RV,
                          nfact2keep = nfact2keep, compact = TRUE)
    # Project the bootstrapped compromise onto the original factor space
    FullBootF[, , n] <- ResBoot_n$res4Splus$Splus %*% ProjMat
  }
  return(FullBootF)
}

DistatisR documentation built on Dec. 5, 2022, 9:05 a.m.