Perform a Generalised Semi Linear Canonical Correlation Analysis

Description

Performs (Generalised) Semi Linear Canonical Correlation Analysis, i.e. computes the canonical correlation between a data matrix and a nonlinear function of time. SLCCA is extended by allowing parameters to vary by a treatment factor and allowing adjustment for covariates. Shortcuts are provided for PK/PD models suitable for analysing EEG data.

Usage

gslcca(Y, formula = "Double Exponential", time, subject = NULL, global = FALSE,
    treatment = NULL, ref = 1, separate = FALSE, partial = ~1, data = NULL, 
    subset = NULL, global.smooth = FALSE, subject.smooth = TRUE, 
    pct.explained = 0.96, start = NULL, method = "L-BFGS-B", lower = 2,
    upper = 15, ...)

Arguments

Y

a data matrix.

formula

either a nonlinear model formula specifying a function of time including parameters, or one of the character strings "Double Exponential" or "Critical Exponential" to specify the corresponding PK/PD model (see Details).

time

a vector of time values corresponding to the rows of Y.

subject

an optional factor grouping the rows of Y. If specified, a separate set of coefficients of Y will be estimated for each subject.

global

if subject is specified, whether to fit a global model for all subjects, i.e. whether the nonlinear model and associated coefficients should be the same for all subjects.

treatment

an optional factor (nested within subject if specified) identifying groups for which separate parameters should be estimated.

ref

the reference level of treatment for which formula is set to ~ 0 (may be NULL).

separate

if treatment is specified, whether to estimate separate parameters of formula and separate canonical coefficients for each treatment or just separate canonical coefficients.

partial

a linear model formula specifying covariates to partial out of the CCA analysis (may be NULL).

data

an optional data frame in which to evaluate the variables in time, subject, treatment and partial.

subset

an optional logical or numeric vector specifying a subset of observations to be used in the fitting process.

global.smooth

controls the smoothing of Y via SVD. Must be one of FALSE for no smoothing, TRUE for smoothing with an automatically selected number of roots, or a scalar specifying the number of roots to use.

subject.smooth

controls the smoothing of Y within the levels of subject. Accepts the same values as global.smooth.

pct.explained

a scalar between 0 and 1 giving the minimum proportion of variance that the SVD approximation must explain when subject.smooth or global.smooth is TRUE.

start

a named list of starting values for the parameters. Each element must have the same length as the number of non-reference levels of treatment or have length one, in which case the same starting values are used for each non-reference treatment level. May be NULL when formula is a character string, however it is recommended that starting values are always provided.

method

the method to be used by optim in finding the nonlinear parameters.

lower, upper

bounds on the nonlinear parameters.

...

arguments passed on to optim.

Details

The function fits the following model:

Y(t) * A = X(t, theta) * B

where Y(t) is a data matrix with rows of observations recorded at times t, A is a vector of loadings, X(t, theta) is a matrix with columns containing a nonlinear function with unknown parameters for each non-reference treatment level, and B is a vector of coefficients.

The parameters A, theta and B are estimated to optimise the correlation between the left- and right-hand sides of the model.
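The estimation idea can be sketched in base R. This is an illustrative sketch only, not the code used by gslcca: it assumes simulated data, a single treatment group (so X has one column), and a hypothetical helper f for the nonlinear function. For a fixed theta, A and B are obtained by ordinary canonical correlation analysis (cancor), and theta is then chosen by optim to maximise the first canonical correlation.

```r
# Sketch only: simulated data, one treatment group (assumptions)
set.seed(1)
time <- seq(0, 10, length.out = 200)

# Hypothetical nonlinear function of time with a single parameter
f <- function(theta, time) time * exp(-time / exp(theta))

# Simulated 5-column data matrix; the first column carries the signal
Y <- matrix(rnorm(200 * 5, sd = 0.1), nrow = 200)
Y[, 1] <- Y[, 1] + f(0.5, time)

# For fixed theta, the optimal A and B come from ordinary CCA
# (cancor centres the columns of both matrices by default)
neg_cor <- function(theta) {
  X <- matrix(f(theta, time), ncol = 1)
  -cancor(X, Y)$cor[1]
}

# Search over theta to maximise the first canonical correlation
opt <- optim(par = 0, fn = neg_cor, method = "L-BFGS-B",
             lower = -2, upper = 3)
theta_hat <- opt$par

# Refit the CCA at the estimated theta to recover A and B
X_hat <- matrix(f(theta_hat, time), ncol = 1)
fit <- cancor(X_hat, Y)
A <- fit$ycoef[, 1]   # loadings for the columns of Y
B <- fit$xcoef[, 1]   # coefficient for the single column of X

# y scores and x scores (projected observed and fitted values)
yscores <- drop(scale(Y, scale = FALSE) %*% A)
xscores <- drop(scale(X_hat, scale = FALSE) %*% B)
```

The correlation between yscores and xscores equals the first canonical correlation at theta_hat, which is the quantity being optimised.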

If partial specifies a matrix of covariates, G, to be partialled out, then the canonical correlation analysis is based on the residuals from the multivariate linear models lm(Y ~ 0 + G) and lm(X ~ 0 + G). When partial = ~1, this is equivalent to centering the columns of Y and X.
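The equivalence for an intercept-only covariate matrix can be checked directly. The following is a small sketch on simulated data (the matrix Y and the column of ones G are assumptions made for illustration); it compares the residuals from lm(Y ~ 0 + G) with the column-centred matrix.

```r
set.seed(42)
n <- 50
Y <- matrix(rnorm(n * 3, mean = 5), nrow = n)  # simulated data matrix
G <- matrix(1, nrow = n, ncol = 1)             # covariate matrix implied by ~1

# Residuals from the multivariate linear model lm(Y ~ 0 + G)
Y_resid <- residuals(lm(Y ~ 0 + G))

# Column-centred Y for comparison
Y_centred <- scale(Y, center = TRUE, scale = FALSE)

# Compare the values only, ignoring dimnames and scale() attributes
all.equal(as.vector(Y_resid), as.vector(Y_centred))
```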

The nonlinear function defining X may be specified by a character string for two pharmacokinetic/pharmacodynamic models:

"Double Exponential"

~ exp(-time/exp(K1))-exp(-time/exp(K2))

"Critical Exponential"

~ time*exp(-time/exp(K1))
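To see the shape of these curves, the two formulas can be written as plain R functions. The function names and parameter values below are hypothetical and chosen only for illustration; note that K1 and K2 enter through exp(), so they act as log time constants.

```r
# The two PK/PD curves as ordinary functions of time
double_exponential <- function(time, K1, K2) {
  exp(-time / exp(K1)) - exp(-time / exp(K2))
}
critical_exponential <- function(time, K1) {
  time * exp(-time / exp(K1))
}

time <- seq(0, 20, by = 0.5)
de <- double_exponential(time, K1 = 2, K2 = 0)  # illustrative values
ce <- critical_exponential(time, K1 = 1)        # illustrative value

# Both curves start at zero, rise to a single peak, then decay
time[which.max(de)]
time[which.max(ce)]
```

For the Critical Exponential model the peak occurs at time = exp(K1), which gives a quick way to pick a plausible starting value from a plot of the data.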

The data matrix Y may be pre-smoothed to remove global artefacts. This is achieved by approximating the data from the number of SVD roots specified by global.smooth. If the number of roots is not specified explicitly, k roots are selected such that lambda_k/lambda_1 > 0.001 and (sum_{j=1}^k lambda_j)/(sum_{j=1}^r lambda_j) >= p, where k = 1, ..., r, r is the rank of Y, lambda_1, lambda_2, ... are the eigenvalues of the variance matrix, and p is given by pct.explained.
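The automatic selection rule can be sketched as follows. This is an illustration under stated assumptions rather than the package's own code: the data are simulated, the columns are centred before the SVD so that the squared singular values are proportional to the eigenvalues of the variance matrix, and k is taken to be the smallest number of non-negligible roots that meets the cumulative criterion.

```r
set.seed(7)
n <- 100; p <- 8
# Simulated rank-2 signal plus small noise (assumption)
U <- matrix(rnorm(n * 2), n, 2)
V <- matrix(rnorm(p * 2), p, 2)
Y <- U %*% t(V) + matrix(rnorm(n * p, sd = 0.05), n, p)

pct_explained <- 0.96

# Centre the columns so the SVD relates to the variance matrix
centre <- colMeans(Y)
s <- svd(sweep(Y, 2, centre))
lambda <- s$d^2 / (n - 1)            # eigenvalues of var(Y)

keep <- lambda / lambda[1] > 0.001   # roots that are not negligible
cum <- cumsum(lambda) / sum(lambda)  # cumulative proportion explained

# Smallest k meeting both criteria; fall back to all retained roots
k <- min(which(keep & cum >= pct_explained), sum(keep))

# Rank-k approximation of Y, with the column means added back
idx <- seq_len(k)
Y_smooth <- s$u[, idx, drop = FALSE] %*% diag(s$d[idx], k) %*%
  t(s$v[, idx, drop = FALSE])
Y_smooth <- sweep(Y_smooth, 2, centre, "+")
```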

The data are then analysed separately for each level of subject, and may be smoothed again at this level to remove local artefacts, as specified by subject.smooth.

The coefficients of the left-hand side matrix are sometimes referred to as loadings and are described in Brain et al. (2011) as signatures.

The projected values Y(t) * A and X(t, theta) * B are known as the y scores and x scores respectively. In this context, they can be thought of as the (projected) observed and fitted values.

Signatures, observed and fitted values can be visualised using plot.gslcca.

Value

An object of class "gslcca", which is a list with components

call

the call to gslcca.

ycoef

a matrix of the coefficients of the left-hand side matrix, with one column per subject.

xcoef

a matrix of the coefficients of the right-hand side matrix, with one column per subject if global = FALSE.

yscores

a vector of y scores for all subjects.

xscores

a vector of x scores for all subjects.

subject

the variables specified by subject.

treatment

the variables specified by treatment.

time

the variables specified by time.

ref

the reference level of treatment.

nonlinear.parameters

a matrix of the estimated nonlinear parameters, with one column per subject.

y

a list giving the (smoothed) left-hand side matrix for each subject, after partialling out any covariates specified by partial.

x

a list giving the right-hand side matrix for each subject, after partialling out any covariates specified by partial.

global.smooth

the number of roots used in global smoothing.

subject.smooth

the number of roots used for subject-level smoothing.

pct.explained

the percentage of variance explained by the subject-level approximation.

opt

a list of output from optim, with one element per subject if global = FALSE.

Author(s)

Foteini Strimenopoulou and Heather Turner

References

Brain, P., Strimenopoulou, F. and Ivarsson, M. (2011). Analysing electroencephalogram (EEG) data using Extended Semi-Linear Canonical Correlation Analysis. Submitted.

See Also

readSpectra and bandSpectra for reading in spectra and aggregating over frequency bands.

plot.gslcca, varySmooth and gslcca-misc for functions to plot gslcca results, check sensitivity with regards to smoothing and access components of "gslcca" objects.

Examples

data(clonidine)

### Fit separate Double Exponential models for each rat,
### with amplitude varying by treatment
result <- gslcca(spectra, "Double Exponential",
    time = Time, subject = Rat, treatment = Treatment, 
    subject.smooth = TRUE, pct.explained = 0.96, 
    data = clonidine)

## drug signature
plot(result, "signature")

## projected values
plot(result, "fitted")

## y scores and x scores
plot(result, "scores")