gslcca: Perform a Generalised Semi Linear Canonical Correlation...
In gslcca: Generalized Semi-Linear Canonical Correlation Analysis


Performs (Generalised) Semi Linear Canonical Correlation Analysis, i.e. computes the canonical correlation between a data matrix and a nonlinear function of time. SLCCA is extended by allowing parameters to vary by a treatment factor and allowing adjustment for covariates. Shortcuts are provided for PK/PD models suitable for analysing EEG data.

gslcca(Y, formula = "Double Exponential", time, subject = NULL, global = FALSE,
    treatment = NULL, ref = 1, separate = FALSE, partial = ~1, data = NULL, 
    subset = NULL, global.smooth = FALSE, subject.smooth = TRUE, 
    pct.explained = 0.96, start = NULL, method = "L-BFGS-B", lower = 2,
    upper = 15, ...)

`Y`	a data matrix.
`formula`	either a nonlinear model formula specifying a function of `time` including parameters, or one of the character strings `"Double Exponential"` or `"Critical Exponential"` to specify the corresponding PK/PD model (see Details).
`time`	a vector of time values corresponding to the rows of `Y`.
`subject`	an optional factor grouping the rows of `Y`. If specified, a separate set of coefficients of `Y` will be estimated for each subject.
`global`	if `subject` is specified, whether to fit a global model for all subjects, i.e. whether the nonlinear model and associated coefficients should be the same for all subjects.
`treatment`	an optional factor (nested within `subject` if specified) identifying groups for which separate parameters should be estimated.
`ref`	the reference level of `treatment` for which `formula` is set to `~ 0` (may be `NULL`).
`separate`	if `treatment` is specified, whether to estimate separate parameters of `formula` and separate canonical coefficients for each treatment or just separate canonical coefficients.
`partial`	a linear model formula specifying covariates to partial out of the canonical correlation analysis (may be `NULL`).
`data`	an optional data frame in which to evaluate the variables in `time`, `subject`, `treatment` and `partial`.
`subset`	an optional logical or numeric vector specifying a subset of observations to be used in the fitting process.
`global.smooth`	controls the smoothing of `Y` via SVD. Must be one of `FALSE` for no smoothing, `TRUE` for smoothing with an automatically selected number of roots, or a scalar specifying the number of roots to use.
`subject.smooth`	controls the smoothing of `Y` within the levels of `subject`. Accepts the same values as `global.smooth`.
`pct.explained`	a scalar between 0 and 1 indicating the desired minimum proportion of variance explained by the SVD approximation when `subject.smooth` or `global.smooth` is `TRUE`.
`start`	a named list of starting values for the parameters. Each element must have the same length as the number of non-reference levels of `treatment` or have length one, in which case the same starting values are used for each non-reference treatment level. May be `NULL` when `formula` is a character string; however, it is recommended that starting values are always provided.
`method`	the method to be used by `optim` in finding the nonlinear parameters.
`lower, upper`	bounds on the nonlinear parameters.
`...`	arguments passed on to `optim`.

The function fits the following model:

Y(t) * A = X(t, theta) * B

where Y(t) is a data matrix with rows of observations recorded at times t, A is a vector of loadings, X(t, theta) is a matrix with columns containing a nonlinear function with unknown parameters for each non-reference treatment level, and B is a vector of coefficients.

The parameters A, theta and B are estimated to optimise the correlation between the left- and right-hand sides of the model.
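
The estimation idea can be illustrated with base R alone. The following is a minimal sketch on simulated data, assuming a single treatment level and no subjects; the helper names dexp_curve and cc are hypothetical, and the code uses cancor and optim directly rather than reproducing the internals of gslcca.

```r
# Sketch only: simulated data and a hand-rolled fit, not the gslcca implementation.
set.seed(1)
time <- seq(0.1, 10, length.out = 80)

# Hypothetical helper: double exponential curve, parameters on the log scale
dexp_curve <- function(theta, time) {
  exp(-time / exp(theta[1])) - exp(-time / exp(theta[2]))
}

# Simulate a 5-column data matrix whose columns carry the curve plus noise
signal <- dexp_curve(c(1, -0.5), time)
Y <- outer(signal, c(1, -0.5, 0.8, 0.2, 0)) +
  matrix(rnorm(length(time) * 5, sd = 0.05), ncol = 5)

# Hypothetical helper: canonical correlation between X(t, theta) and Y
# for a fixed theta (cancor centres both matrices by default)
cc <- function(theta) {
  X <- cbind(dexp_curve(theta, time))
  cancor(X, Y)$cor[1]
}

# Maximise the correlation over theta (optim minimises, hence the sign flip)
fit <- optim(c(0.5, -1), function(theta) -cc(theta), method = "L-BFGS-B",
             lower = c(-3, -3), upper = c(3, 3))
fit$par     # estimated theta
-fit$value  # maximised canonical correlation
```

For a fixed theta the loadings A and coefficients B come from an ordinary canonical correlation analysis, so only theta needs numerical optimisation in this sketch.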

If partial specifies a matrix of covariates, G, to be partialled out, then the canonical correlation analysis is based on the residuals from the multivariate linear models lm(Y ~ 0 + G) and lm(X ~ 0 + G). When partial = ~1 this is equivalent to centering the columns of Y and X.
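
The equivalence between partialling out an intercept and column centering can be checked with a small simulated matrix. This is an illustrative sketch only; the data and the object names Y_resid and Y_centred are made up for the example.

```r
set.seed(42)
Y <- matrix(rnorm(40, mean = 5), nrow = 10, ncol = 4)

# partial = ~1 corresponds to a covariate matrix G holding only an intercept column
G <- matrix(1, nrow = nrow(Y), ncol = 1)

# Residuals from the multivariate linear model lm(Y ~ 0 + G)
Y_resid <- residuals(lm(Y ~ 0 + G))

# Column-centred copy of Y for comparison
Y_centred <- scale(Y, center = TRUE, scale = FALSE)

# Compare the two matrices, ignoring dimnames and scaling attributes
same <- all.equal(unname(Y_resid), unname(Y_centred), check.attributes = FALSE)
same
```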

The nonlinear function defining X may be specified by a character string for two pharmacokinetic/pharmacodynamic models:

"Double Exponential": ~ exp(-time/exp(K1))-exp(-time/exp(K2))
"Critical Exponential": ~ time*exp(-time/exp(K1))
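
Written out as plain R functions, the two shortcuts look as follows. This is a sketch for exploring the curve shapes; the function names double_exponential and critical_exponential are hypothetical and are not exported by the package. Note that K1 and K2 enter on the log scale, so the time constants exp(K1) and exp(K2) are always positive.

```r
# Hypothetical helpers mirroring the two formula shortcuts
double_exponential <- function(time, K1, K2) {
  exp(-time / exp(K1)) - exp(-time / exp(K2))
}
critical_exponential <- function(time, K1) {
  time * exp(-time / exp(K1))
}

time <- seq(0, 10, by = 0.5)
de <- double_exponential(time, K1 = 1, K2 = -0.5)
ce <- critical_exponential(time, K1 = 0)

# Both curves are zero at time 0; the critical exponential peaks at time = exp(K1)
peak_time <- time[which.max(ce)]
peak_time
```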

The data matrix Y may be pre-smoothed to remove global artefacts. This is achieved by approximating the data from the number of SVD roots specified by global.smooth. If the number of roots is not specified explicitly, k roots are selected such that lambda_k/lambda_1 > 0.001 and (sum_{j=1}^k lambda_j)/(sum_{j=1}^r lambda_j) >= p, where k = 1, ..., r, r is the rank of Y, lambda_1, lambda_2, ... are the eigenvalues of the variance matrix and p is given by pct.explained.
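
The selection rule and the low-rank approximation can be sketched in base R. The functions select_roots and smooth_svd below are hypothetical, and the sketch makes two assumptions that the text above does not settle: the eigenvalues are taken as the squared singular values of Y as supplied (no centering), and k is the smallest number of roots reaching the target proportion, capped at the number of roots passing the 0.001 ratio test.

```r
# Hypothetical helper: choose the number of SVD roots
select_roots <- function(Y, pct.explained = 0.96, tol = 0.001) {
  lambda <- svd(Y)$d^2                      # assumed eigenvalues (uncentred Y)
  n_ratio <- sum(lambda / lambda[1] > tol)  # roots passing the ratio test
  cum_prop <- cumsum(lambda) / sum(lambda)  # cumulative proportion explained
  k_target <- which(cum_prop >= pct.explained)[1]
  min(k_target, n_ratio)                    # assumption: cap by the ratio test
}

# Hypothetical helper: rank-k approximation of Y from its leading SVD roots
smooth_svd <- function(Y, k) {
  s <- svd(Y)
  s$u[, 1:k, drop = FALSE] %*% diag(s$d[1:k], nrow = k) %*%
    t(s$v[, 1:k, drop = FALSE])
}

# Simulated example: a rank-2 signal plus a small amount of noise
set.seed(7)
Y <- matrix(rnorm(50 * 2), 50, 2) %*% matrix(rnorm(2 * 8), 2, 8) +
  matrix(rnorm(50 * 8, sd = 0.01), 50, 8)
k <- select_roots(Y, pct.explained = 0.96)
Y_smooth <- smooth_svd(Y, k)
```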

The data are then analysed for each level of subject separately and the data may again be smoothed at this level to remove local artefacts as specified by subject.smooth.

The coefficients of the left-hand side matrix are sometimes referred to as loadings and are described in Brain et al (2011) as signatures.

The projected values Y(t) * A and X(t, theta) * B are known as the y scores and x scores respectively. In this context, they can be thought of as the (projected) observed and fitted values.
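
As an illustration of how the scores relate to the canonical correlation, the following sketch computes them with cancor from base R on simulated data, with theta held fixed. It is not the package's own code; the object names are made up for the example.

```r
set.seed(3)
time <- seq(0.1, 10, length.out = 60)

# Right-hand side matrix: one double exponential column, theta fixed for illustration
X <- cbind(exp(-time / exp(1)) - exp(-time / exp(-0.5)))

# Simulated 3-column data matrix carrying the curve plus noise
Y <- X %*% t(c(1, -0.5, 0.3)) + matrix(rnorm(60 * 3, sd = 0.05), ncol = 3)

cc <- cancor(X, Y)  # centres both matrices by default

# Scores are the centred matrices projected onto the first pair of coefficients
x_scores <- scale(X, center = cc$xcenter, scale = FALSE) %*% cc$xcoef[, 1]
y_scores <- scale(Y, center = cc$ycenter, scale = FALSE) %*% cc$ycoef[, 1]

# The correlation between the two sets of scores is the first canonical correlation
score_cor <- cor(x_scores, y_scores)[1, 1]
```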

Signatures, observed and fitted values can be visualised using plot.gslcca.

An object of class "gslcca", which is a list with components

`call`	the call to `gslcca`.
`ycoef`	a matrix of the coefficients of the left-hand side matrix, with one column per subject.
`xcoef`	a matrix of the coefficients of the right-hand side matrix, with one column per subject if `global = FALSE`.
`yscores`	a vector of y scores for all subjects.
`xscores`	a vector of x scores for all subjects.
`subject`	the variable specified by `subject`.
`treatment`	the variable specified by `treatment`.
`time`	the variable specified by `time`.
`ref`	the reference level of `treatment`.
`nonlinear.parameters`	a matrix of the estimated nonlinear parameters, with one column per subject.
`y`	a list giving the (smoothed) left-hand side matrix for each subject, after partialling out any covariates specified by `partial`.
`x`	a list giving the right-hand side matrix for each subject, after partialling out any covariates specified by `partial`.
`global.smooth`	the number of roots used in global smoothing.
`subject.smooth`	the number of roots used for subject-level smoothing.
`pct.explained`	the percentage of variance explained by the subject-level approximation.
`opt`	a list of output from `optim`, with one element per subject if `global = FALSE`.

Foteini Strimenopoulou and Heather Turner

Brain, P., Strimenopoulou, F. and Ivarsson, M. (2011). Analysing electroencephalogram (EEG) data using Extended Semi-Linear Canonical Correlation Analysis. Submitted.

readSpectra and bandSpectra for reading in spectra and aggregating over frequency bands.

plot.gslcca, varySmooth and gslcca-misc for functions to plot gslcca results, check sensitivity with regards to smoothing and access components of "gslcca" objects.

data(clonidine)

### Fit separate Double Exponential models for each rat,
### with amplitude varying by treatment
result <- gslcca(spectra, "Double Exponential",
    time = Time, subject = Rat, treatment = Treatment, 
    subject.smooth = TRUE, pct.explained = 0.96, 
    data = clonidine)

## drug signature
plot(result, "signature")

## fitted values
plot(result, "fitted")

## projected values
plot(result, "scores")