# da-methods: Discriminant Analysis of Interval Data

### Description

`lda` and `qda` perform linear and quadratic discriminant analysis of Interval Data based on classic estimates of a mixture of Gaussian models. `Roblda` and `Robqda` do the same using robust estimates of location and scatter. `snda` performs discriminant analysis of Interval Data based on estimates of mixtures of Skew-Normal models.

### Usage

```
## S4 method for signature 'IData'
lda(x, grouping, prior="proportions", CVtol=1.0e-5, egvtol=1.0e-10,
  subset=1:nrow(x), CovCase=1:4, SelCrit=c("BIC","AIC"), silent=FALSE, ... )

## S4 method for signature 'IdtMxtNDE'
lda(x, prior="proportions", selmodel=BestModel(x), egvtol=1.0e-10,
  silent=FALSE, ... )

## S4 method for signature 'IdtClMANOVA'
lda(x, prior="proportions", selmodel=BestModel(H1res(x)),
  egvtol=1.0e-10, silent=FALSE, ... )

## S4 method for signature 'IdtLocNSNMANOVA'
lda(x, prior="proportions", selmodel=BestModel(H1res(x)@NMod),
  egvtol=1.0e-10, silent=FALSE, ... )

## S4 method for signature 'IData'
qda(x, grouping, prior="proportions", CVtol=1.0e-5, subset=1:nrow(x),
  CovCase=1:4, SelCrit=c("BIC","AIC"), silent=FALSE, ... )

## S4 method for signature 'IdtMxtNDE'
qda(x, prior="proportions", selmodel=BestModel(x), silent=FALSE, ... )

## S4 method for signature 'IdtHetNMANOVA'
qda(x, prior="proportions", selmodel=BestModel(H1res(x)), silent=FALSE, ... )

## S4 method for signature 'IdtGenNSNMANOVA'
qda(x, prior="proportions", selmodel=BestModel(H1res(x)@NMod),
  silent=FALSE, ... )

## S4 method for signature 'IData'
Roblda(x, grouping, prior="proportions", CVtol=1.0e-5, egvtol=1.0e-10,
  subset=1:nrow(x), CovCase=1:4, SelCrit=c("BIC","AIC"), silent=FALSE,
  CovEstMet=c("Pooled","Globdev"), SngDMet=c("fasttle","fulltle"),
  Robcontrol=RobEstControl(), ... )

## S4 method for signature 'IData'
Robqda(x, grouping, prior="proportions", CVtol=1.0e-5, subset=1:nrow(x),
  CovCase=1:4, SelCrit=c("BIC","AIC"), silent=FALSE,
  SngDMet=c("fasttle","fulltle"), Robcontrol=RobEstControl(), ... )

## S4 method for signature 'IData'
snda(x, grouping, prior="proportions", CVtol=1.0e-5, subset=1:nrow(x),
  CovCase=1:4, SelCrit=c("BIC","AIC"), Mxt=c("Loc","Gen"), ... )

## S4 method for signature 'IdtLocSNMANOVA'
snda(x, prior="proportions", selmodel=BestModel(H1res(x)),
  egvtol=1.0e-10, silent=FALSE, ... )

## S4 method for signature 'IdtLocNSNMANOVA'
snda(x, prior="proportions", selmodel=BestModel(H1res(x)@SNMod),
  egvtol=1.0e-10, silent=FALSE, ... )

## S4 method for signature 'IdtGenSNMANOVA'
snda(x, prior="proportions", selmodel=BestModel(H1res(x)), silent=FALSE, ... )

## S4 method for signature 'IdtGenNSNMANOVA'
snda(x, prior="proportions", selmodel=BestModel(H1res(x)@SNMod),
  silent=FALSE, ... )
```

### Arguments

| Argument | Description |
|---|---|
| `x` | An object of class `IData`, `IdtLocSNMANOVA`, `IdtLocNSNMANOVA`, `IdtGenSNMANOVA` or `IdtGenNSNMANOVA` with either the original Interval Data, or the results of an Interval Data Skew-Normal MANOVA, on which the discriminant analysis will be based. |
| `grouping` | Factor specifying the class for each observation. |
| `prior` | The prior probabilities of class membership. If unspecified, the class proportions for the training set are used. If present, the probabilities should be specified in the order of the factor levels. |
| `CVtol` | Tolerance level for the absolute value of the coefficient of variation of non-constant variables. When a MidPoint or LogRange has an absolute within-groups coefficient of variation below `CVtol`, it is considered to be a constant. |
| `subset` | An index vector specifying the cases to be used in the analysis. |
| `CovCase` | Configuration of the variance-covariance matrix: a set of integers between 1 and 4. |
| `SelCrit` | The model selection criterion. |
| `silent` | A boolean flag indicating whether a warning message should be printed if the method fails. |
| `CovEstMet` | Method used to estimate the common covariance matrix in `Roblda` (robust linear discriminant analysis). Alternatives are “Pooled” (default) for a pooled average of the robust within-groups covariance estimates, and “Globdev” for a global estimate based on all deviations from the groups' multivariate l1 medians. See Todorov and Filzmoser (2009) and `pcaPP.l1median` for details. |
| `SngDMet` | Algorithm used to find the robust estimates of location and scatter. Alternatives are “fasttle” (default) and “fulltle”. |
| `Robcontrol` | A control object (S4) of class `RobEstControl-class` containing estimation options, the same as those provided in the function specification. If the control object is supplied, the parameters from it will be used. If parameters are also passed in the invocation statement, they will override the corresponding elements of the control object. |
| `Mxt` | Indicates the type of mixing distributions to be considered. Current alternatives are “Hom” (homoscedastic) and “Het” (heteroscedastic) for Gaussian models, and “Loc” (location model: groups differ only in their location parameters) and “Gen” (general model: groups differ in all parameters) for Skew-Normal models. |
| `selmodel` | Selected model from a list of candidate models saved in object `x`. |
| `egvtol` | Tolerance level for the eigenvalues of the product of the inverse within-groups covariance matrix by the between-groups covariance matrix. When an eigenvalue has an absolute value below `egvtol`, it is considered to be zero. |
| `...` | Other named arguments. |
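The roles of `prior`, `CVtol` and `egvtol` can be illustrated with a minimal base R sketch. This is not the MAINT.Data implementation: the built-in `iris` data stand in for the MidPoints and LogRanges of an `IData` object, and the within-groups coefficient of variation is assumed here to be the pooled within-groups standard deviation divided by the absolute overall mean, which may differ from the definition the package uses. The names `wgcv`, `constant`, `egv` and `nonnull` are hypothetical.

```r
# Stand-in data: four numeric variables and a three-level grouping factor.
X <- as.matrix(iris[, 1:4])
grouping <- iris$Species
n <- nrow(X)
k <- nlevels(grouping)

# prior = "proportions": class proportions of the training set,
# listed in the order of the factor levels.
prior <- prop.table(table(grouping))

# CVtol: flag variables whose absolute within-groups coefficient of
# variation falls below the tolerance (assumed definition, see lead-in).
CVtol <- 1.0e-5
grpmeans <- apply(X, 2, function(v) ave(v, grouping))  # n x p matrix of group means
W <- crossprod(X - grpmeans) / (n - k)                 # pooled within-groups covariance
wgcv <- sqrt(diag(W)) / abs(colMeans(X))
constant <- wgcv < CVtol                               # TRUE = treated as constant

# egvtol: eigenvalues of W^{-1} B with absolute value below the
# tolerance are treated as zero.
egvtol <- 1.0e-10
B <- crossprod(sweep(grpmeans, 2, colMeans(X))) / (k - 1)  # between-groups covariance
egv <- Re(eigen(solve(W, B), only.values = TRUE)$values)
nonnull <- sum(abs(egv) > egvtol)                      # eigenvalues kept as non-zero

print(prior)
print(constant)
print(nonnull)
```

With a user-supplied `prior`, the vector would replace `prop.table(table(grouping))` and would need to follow the order of `levels(grouping)`.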

### References

Duarte Silva, A.P. and Brito, P. (2015), Discriminant analysis of interval data: An assessment of parametric and distance-based approaches. Journal of Classification 32(3), 516–541.

Todorov V. and Filzmoser P. (2009), An Object Oriented Framework for Robust Multivariate Analysis. Journal of Statistical Software 32(3), 1–47.

### See Also

`IData`, `IdtLocSNMANOVA`, `IdtLocNSNMANOVA`, `IdtGenSNMANOVA`, `IdtGenNSNMANOVA`, `pcaPP.l1median`.

### Examples

```
library(MAINT.Data)

# Create an Interval-Data object containing the intervals for 899 observations
# on the temperatures by quarter in 60 Chinese meteorological stations.

ChinaT <- IData(ChinaTemp[1:8],VarNames=c("T1","T2","T3","T4"))

# Linear Discriminant Analysis

ChinaT.lda <- lda(ChinaT,ChinaTemp$GeoReg)
cat("Temperatures of China -- linear discriminant analysis results:\n")
print(ChinaT.lda)
cat("lda Prediction results:\n")
print(predict(ChinaT.lda,ChinaT)$class)

## Not run:

# Estimate error rates by ten-fold cross-validation replicated 20 times

CVlda <- DACrossVal(ChinaT,ChinaTemp$GeoReg,TrainAlg=lda,CovCase=CovCase(ChinaT.lda))
summary(CVlda[,,"Clerr"])
glberrors <- apply(CVlda[,,"Nk"]*CVlda[,,"Clerr"],1,sum)/apply(CVlda[,,"Nk"],1,sum)
cat("Average global classification error =",mean(glberrors),"\n")

## End(Not run)

# Quadratic Discriminant Analysis

ChinaT.qda <- qda(ChinaT,ChinaTemp$GeoReg)
cat("Temperatures of China -- qda discriminant analysis results:\n")
print(ChinaT.qda)

## Not run:

# Estimate error rates by ten-fold cross-validation replicated 20 times

CVqda <- DACrossVal(ChinaT,ChinaTemp$GeoReg,TrainAlg=qda,CovCase=CovCase(ChinaT.qda))
summary(CVqda[,,"Clerr"])
glberrors <- apply(CVqda[,,"Nk"]*CVqda[,,"Clerr"],1,sum)/apply(CVqda[,,"Nk"],1,sum)
cat("Average global classification error =",mean(glberrors),"\n")

# Skew-Normal based discriminant analysis, assuming that the different regions
# may differ in all SkewNormal parameters

cat("Temperatures of China -- SkewNormal general model discriminant analysis results:\n")
ChinaT.gensnda <- snda(ChinaT,ChinaTemp$GeoReg,Mxt="Gen")
print(ChinaT.gensnda)

# Estimate error rates by three-fold cross-validation without replication

CVgensnda <- DACrossVal(ChinaT,ChinaTemp$GeoReg,TrainAlg=snda,Mxt="Gen",
  CovCase=CovCase(ChinaT.gensnda),kfold=3,CVrep=1)
summary(CVgensnda[,,"Clerr"])
glberrors <- apply(CVgensnda[,,"Nk"]*CVgensnda[,,"Clerr"],1,sum)/apply(CVgensnda[,,"Nk"],1,sum)
cat("Average global classification error =",mean(glberrors),"\n")

# Robust Quadratic Discriminant Analysis

ChinaT.rqda <- Robqda(ChinaT,ChinaTemp$GeoReg)
cat("Temperatures of China -- robust qda discriminant analysis results:\n")
print(ChinaT.rqda)

# Estimate error rates by ten-fold cross-validation with 5 replications

CVrqda <- DACrossVal(ChinaT,ChinaTemp$GeoReg,TrainAlg=Robqda,CovCase=CovCase(ChinaT.rqda),
  CVrep=5)
summary(CVrqda[,,"Clerr"])
glberrors <- apply(CVrqda[,,"Nk"]*CVrqda[,,"Clerr"],1,sum)/apply(CVrqda[,,"Nk"],1,sum)
cat("Average global classification error =",mean(glberrors),"\n")

## End(Not run)
```
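The `glberrors` computation in the examples can be written as a formula. The indexing below is read off the example code rather than taken from the `DACrossVal` documentation, so it is an assumption: the first array index $r$ is the one kept by `apply(..., 1, sum)` (taken here to be the cross-validation runs), the second index $g$ runs over the groups, $N_{rg}$ is the `"Nk"` entry and $e_{rg}$ is the `"Clerr"` entry.

```latex
\[
  \mathrm{glberror}_r \;=\; \frac{\sum_{g} N_{rg}\, e_{rg}}{\sum_{g} N_{rg}},
  \qquad
  \overline{\mathrm{glberror}} \;=\; \frac{1}{R} \sum_{r=1}^{R} \mathrm{glberror}_r
\]
```

Each run's global error is therefore the group-wise error rates weighted by the number of observations per group, and the reported value is the plain average of these over the $R$ runs.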
