# als: alternating least squares multivariate curve resolution

In package ALS: Multivariate Curve Resolution Alternating Least Squares (MCR-ALS)

## Description

This is an implementation of alternating least squares multivariate curve resolution (MCR-ALS). Given a dataset in matrix form `d1`, the dataset is decomposed as `d1=C %*% t(S)`, where the columns of `C` and `S` represent components contributing to the data in each of the two ways in which the matrix is resolved. In forming the decomposition, the components in each way may be constrained with, e.g., non-negativity, unimodality, selectivity, normalization of `S` and closure of `C`. Note that if more than one dataset is analyzed simultaneously, the matrix `S` is assumed to be the same for every dataset, so that each dataset has its own matrix `C` but all share one `S` in the bilinear decomposition.
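
The bilinear model and the alternating updates can be illustrated with a minimal base R sketch. This is not the package's implementation: the simulated data, the starting guess, and the use of clipping negatives to zero (a crude stand-in for a proper non-negative least squares solver) are all assumptions made for illustration.

```r
# Illustrative sketch only: bilinear model D = C %*% t(S) with alternating
# least-squares updates in base R. Not the code used inside als().
set.seed(1)
m <- 50; n <- 30; comp <- 2

# Simulated "true" factors (assumed for this example):
# Gaussian elution profiles and random non-negative spectra
tt <- seq_len(m)
C_true <- cbind(dnorm(tt, 20, 4), dnorm(tt, 30, 5))
S_true <- matrix(runif(n * comp), n, comp)
D <- C_true %*% t(S_true)            # noise-free m x n data matrix

# Starting guess for C: shifted, broadened profiles
C <- cbind(dnorm(tt, 18, 6), dnorm(tt, 33, 6))

rss <- numeric(10)
for (i in seq_along(rss)) {
  # Update S with C held fixed: least squares for C %*% t(S) ~ D,
  # then clip negative entries to zero
  S <- pmax(t(qr.solve(C, D)), 0)
  # Update C with S held fixed: least squares for S %*% t(C) ~ t(D),
  # then clip negative entries to zero
  C <- pmax(t(qr.solve(S, t(D))), 0)
  # Residual sum of squares after this pair of updates
  rss[i] <- sum((D - C %*% t(S))^2)
}
```

Each pass fixes one factor and solves a linear least-squares problem for the other, which is the sense in which the procedure "alternates". Clipping does not guarantee a monotone decrease of `rss`, which is one reason a dedicated constrained solver is preferable in practice.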

## Usage

```
als(CList, PsiList, S=matrix(), WList=list(),
    thresh =.001, maxiter=100, forcemaxiter = FALSE,
    optS1st=TRUE, x=1:nrow(CList[[1]]), x2=1:nrow(S),
    baseline=FALSE, fixed=vector("list", length(PsiList)),
    uniC=FALSE, uniS=FALSE, nonnegC = TRUE, nonnegS = TRUE,
    normS=0, closureC=list())
```

## Arguments

| Argument | Description |
| --- | --- |
| `CList` | list with the same length as `PsiList`, where each element is a matrix of dimension `m` by `comp` and represents the matrix `C` for each dataset |
| `PsiList` | list of datasets, where each dataset is a matrix of dimension `m` by `n` |
| `S` | matrix with `n` rows and `comp` columns, often representing (mass) spectra |
| `WList` | an optional list with the same length as `PsiList`, where each element is a matrix of dimension `m` by `n` giving the weight of each datapoint; note that if closure or normalization constraints are applied, then both are applied after the application of weights |
| `thresh` | numeric value that defaults to .001; the optimization stops if `((oldrss - rss) / oldrss) < thresh`, where `oldrss` is the residual sum of squares at the previous iteration and `rss` is the residual sum of squares at the current iteration |
| `maxiter` | the maximum number of iterations to perform (where an iteration is an optimization of either `AList` or `C`) |
| `forcemaxiter` | logical indicating whether `maxiter` iterations should be performed even if the residual difference drops below `thresh` |
| `optS1st` | logical indicating whether the first constrained least squares regression should estimate `S` or `CList` |
| `x` | optional vector of labels for the rows of `C`, which are used in the application of unimodality constraints |
| `x2` | optional vector of labels for the rows of `S`, which are used in the application of unimodality constraints |
| `baseline` | logical indicating whether a baseline component is present; if `baseline=TRUE` then this component is exempt from the unimodality and non-negativity constraints |
| `fixed` | list with the same length as `PsiList`, in which each element is a vector of the indices of the components to fix to zero in each dataset |
| `nonnegS` | logical indicating whether the components (columns) of the matrix `S` should be constrained to non-negative values |
| `nonnegC` | logical indicating whether the components (columns) of the matrix `C` should be constrained to non-negative values |
| `uniC` | logical indicating whether unimodality constraints should be applied to the columns of `C` |
| `uniS` | logical indicating whether unimodality constraints should be applied to the columns of `S` |
| `normS` | numeric indicating whether the spectra are normalized; if `normS>0`, the spectra are normalized. If `normS==1`, the maximum of the spectrum of each component is constrained to be equal to one; if `normS > 0 && normS!=1`, the norm of the spectrum of each component is constrained to be equal to one |
| `closureC` | list; if the length is zero, no closure constraints are applied. If the length is not zero, it should be equal to the number of datasets in the analysis and contain numeric vectors consisting of the desired value of the sum of each row of the concentration matrix |
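
What the `normS` and `closureC` options describe can be sketched in base R. The helpers `normalize_S` and `close_C` below are hypothetical names introduced for illustration and are not functions of the package; the sketch also assumes that "norm" means the Euclidean norm, which the description above does not state.

```r
# Hypothetical helper (not part of the package): scale the columns of S
# following the normS description.
normalize_S <- function(S, normS) {
  if (normS <= 0) return(S)            # normS = 0: leave the spectra as they are
  scale <- if (normS == 1) {
    apply(S, 2, max)                   # maximum of each spectrum becomes one
  } else {
    sqrt(colSums(S^2))                 # assumed: Euclidean norm becomes one
  }
  sweep(S, 2, scale, "/")              # divide each column by its scale factor
}

# Hypothetical helper (not part of the package): rescale each row of C so
# that it sums to the desired closure value (a scalar or one value per row).
# Rows that sum to zero would need separate handling.
close_C <- function(C, target) {
  C * (target / rowSums(C))            # recycling applies factor i to row i
}

S_demo <- matrix(c(1, 2, 4, 3, 6, 3), nrow = 3, ncol = 2)
C_demo <- matrix(c(1, 2, 3, 1, 2, 5), nrow = 3, ncol = 2)

S_max  <- normalize_S(S_demo, normS = 1)   # column maxima equal to one
S_unit <- normalize_S(S_demo, normS = 2)   # column norms equal to one
C_closed <- close_C(C_demo, target = 1)    # every row sums to one
```

Both operations only rescale columns of `S` or rows of `C`; they do not change the shape of the profiles.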

## Value

A list with components:

| Component | Description |
| --- | --- |
| `CList` | A list with the same length as the number of datasets, containing the optimized matrix `C` at termination scaled by the optimized amplitudes for that dataset from `AList`. |
| `S` | The matrix `S` given as input. |
| `rss` | The residual sum of squares at termination. |
| `resid` | A list with the same length as the number of datasets, containing the residual matrix for each dataset. |
| `iter` | The number of iterations performed before termination. |
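
The relation between `rss`, `iter` and the `thresh` argument can be checked by hand. The sketch below assumes the stopping rule is applied exactly as described under `thresh`, and uses the RSS values printed for iterations 2 to 4 in the example output on this page.

```r
# RSS values taken from the example output on this page (iterations 2 to 4)
rss_trace <- c(153488187, 102433454, 102351694)
thresh <- 0.001                        # default value of the thresh argument

# Relative decrease between successive iterations: (oldrss - rss) / oldrss
rd <- -diff(rss_trace) / head(rss_trace, -1)

# Index of the first relative decrease that falls below thresh
stop_at <- which(rd < thresh)[1]
```

The two relative decreases are roughly 0.333 and 0.0008, matching the `RD` values printed for iterations 3 and 4. The second is below `thresh`, so with `forcemaxiter = FALSE` the run would end at iteration 4.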

## Note

This function was used to solve problems described in

van Stokkum IHM, Mullen KM, Mihaleva VV. Global analysis of multiple gas chromatography-mass spectrometry (GC/MS) data sets: A method for resolution of co-eluting components with comparison to MCR-ALS. Chemometrics and Intelligent Laboratory Systems 2009; 95(2): 150-163.

in conjunction with the package TIMP. For the code to reproduce the examples in this paper, see examples_chemo.zip included in the `inst` directory of the package source code.

## References

Garrido M, Rius FX, Larrechi MS. Multivariate curve resolution alternating least squares (MCR-ALS) applied to spectroscopic data from monitoring chemical reactions processes. Analytical and Bioanalytical Chemistry 2008; 390:2059-2066.

Jonsson P, Johansson A, Gullberg J, Trygg J, A J, Grung B, Marklund S, Sjostrom M, Antti H, Moritz T. High-throughput data analysis for detecting and identifying differences between samples in GC/MS-based metabolomic analyses. Analytical Chemistry 2005; 77:5635-5642.

Tauler R. Multivariate curve resolution applied to second order data. Chemometrics and Intelligent Laboratory Systems 1995; 30:133-146.

Tauler R, Smilde A, Kowalski B. Selectivity, local rank, three-way data analysis and ambiguity in multivariate curve resolution. Journal of Chemometrics 1995; 9:31-58.

## See Also

`matchFactor`, `multiex`, `multiex1`, `plotS`

## Examples

```
## attach the package that provides als, plotS, matchFactor and multiex
library(ALS)

## load 2 matrix datasets into variables d1 and d2
## load starting values for elution profiles
## into variables Cstart1 and Cstart2
## load time labels as x, m/z values as x2
data(multiex)

## starting values for elution profiles
matplot(x,Cstart1,type="l")
matplot(x,Cstart2,type="l",add=TRUE)

## using MCR-ALS, improve estimates for mass spectra S and the two
## matrices of elution profiles
## apply unimodality constraints to the elution profile estimates
## note that the starting estimates for S just contain a dummy matrix

test0 <- als(CList=list(Cstart1,Cstart2),S=matrix(1,nrow=400,ncol=2),
             PsiList=list(d1,d2), x=x, x2=x2, uniC=TRUE, normS=0)

## plot the estimated mass spectra
plotS(test0$S,x2)

## the known mass spectra are contained in the variable S
## can compare the matching factor of each estimated spectrum to
## that in S
matchFactor(S[,1],test0$S[,1])
matchFactor(S[,2],test0$S[,2])

## plot the estimated elution profiles
## this shows the relative abundance of the 2nd component is low
matplot(x,test0$CList[[1]],type="l")
matplot(x,test0$CList[[2]],type="l",add=TRUE)
```

### Example output

```
Loading required package: nnls
Iso 0.0-17
Iteration (opt. S): 1, RSS: 1.330703e+12, RD: 0.9562264
Iteration (opt. C): 2, RSS: 153488187, RD: 0.9998847
Iteration (opt. S): 3, RSS: 102433454, RD: 0.3326297
Iteration (opt. C): 4, RSS: 102351694, RD: 0.0007981757