# CRITERIA: Optimality Criteria In STPGA: Selection of Training Populations by Genetic Algorithm

## Description

These are the default design criteria; each one is a quantity to be minimized. The table in the Details section gives the formula for each design criterion and describes its usage. The inputs for these functions come in three syntax flavors: Type-X, Type-D and Type-K. Users can define and use their own design criteria as long as they follow the Type-X syntax, as shown in the Examples.

## Usage

    AOPT(Train, Test, P, lambda = 1e-05, C=NULL)
    CDMAX(Train, Test, P, lambda = 1e-05, C=NULL)
    CDMAX0(Train, Test, P, lambda = 1e-05, C=NULL)
    CDMAX2(Train, Test, P, lambda = 1e-05, C=NULL)
    CDMEAN(Train, Test, P, lambda = 1e-05, C=NULL)
    CDMEAN0(Train, Test, P, lambda = 1e-05, C=NULL)
    CDMEAN2(Train, Test, P, lambda = 1e-05, C=NULL)
    CDMEANMM(Train, Test, Kinv,K, lambda = 1e-05, C=NULL, Vg=NULL, Ve=NULL)
    DOPT(Train, Test, P, lambda = 1e-05, C=NULL)
    EOPT(Train, Test, P, lambda = 1e-05, C=NULL)
    GAUSSMEANMM(Train, Test, Kinv, K, lambda = 1e-05, C=NULL, Vg=NULL, Ve=NULL)
    GOPTPEV(Train, Test, P, lambda = 1e-05, C=NULL)
    GOPTPEV2(Train, Test, P, lambda = 1e-05, C=NULL)
    PEVMAX(Train, Test, P, lambda = 1e-05, C=NULL)
    PEVMAX0(Train, Test, P, lambda = 1e-05, C=NULL)
    PEVMAX2(Train, Test, P, lambda = 1e-05, C=NULL)
    PEVMEAN(Train, Test, P, lambda = 1e-05, C=NULL)
    PEVMEAN0(Train, Test, P, lambda = 1e-05, C=NULL)
    PEVMEAN2(Train, Test, P, lambda = 1e-05, C=NULL)
    PEVMEANMM(Train, Test, Kinv,K, lambda = 1e-05, C=NULL, Vg=NULL, Ve=NULL)
    dist_to_test(Train, Test, Dst, lambda, C)
    dist_to_test2(Train, Test, Dst, lambda, C)
    neg_dist_in_train(Train, Test, Dst, lambda, C)
    neg_dist_in_train2(Train, Test, Dst, lambda, C)

## Arguments

| Argument | Description |
|---|---|
| `Train` | vector of identifiers for individuals in the training set |
| `Test` | vector of identifiers for individuals in the test set |
| `P` | (Only for Type-X) $n \times k$ matrix of the first PCs of the predictor variables. The matrix needs to have the union of the identifiers of the candidate and test individuals as rownames. |
| `Dst` | (Only for Type-D) $n \times n$ symmetric distance matrix with row and column names. |
| `Kinv` | (Only for Type-K) $n \times n$ symmetric matrix (inverse of the relationship matrix $K$ between $n$ individuals) with row and column names. |
| `K` | (Only for Type-K) $n \times n$ symmetric matrix (the relationship matrix $K$ between $n$ individuals). |
| `lambda` | scalar shrinkage parameter ($\lambda > 0$). |
| `C` | contrast matrix. |
| `Vg` | (Only for PEVMEANMM) covariance matrix between traits generated by the relationship $K$ (multi-trait version). |
| `Ve` | (Only for PEVMEANMM) residual covariance matrix for the traits (multi-trait version). |
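As an illustration of the Type-X input described above, the following base R sketch builds a `P` matrix from the `iris` measurements with `prcomp`. The number of components (`k = 3`), the scaling choice, and the split into candidates and test individuals are assumptions made for this example only; they are not prescribed by STPGA.

```r
# Sketch (assumptions: iris data, k = 3 PCs, centered and scaled predictors).
data(iris)
X <- as.matrix(iris[, 1:4])
ids <- paste(iris[, 5], rep(1:50, 3), sep = "_")
rownames(X) <- ids

# Candidates and test individuals, identified by name.
candidates <- ids[1:100]
set.seed(1)
test <- sample(ids[101:150], 25)

# First k principal component scores; prcomp keeps the rownames of X.
k <- 3
pca <- prcomp(X, center = TRUE, scale. = TRUE)
P <- pca$x[union(candidates, test), 1:k, drop = FALSE]

# The rownames of P are the union of candidate and test identifiers.
stopifnot(setequal(rownames(P), union(candidates, test)))
dim(P)
```

Keeping `drop = FALSE` preserves the matrix structure (and its rownames) even when a single component is requested.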

## Details

In the formulas below, $A = (P_{Train}^\top P_{Train} + \lambda I)^{-1}$, $K^{-1}$ corresponds to the `Kinv` argument, and `/` between two $\operatorname{diag}(\cdot)$ terms denotes elementwise division.

| Criterion name | Formula | Type |
|---|---|---|
| AOPT | $\operatorname{trace}[C A C^\top]$ | X |
| CDMAX | $\max[\operatorname{diag}(C P_{Test} A P_{Test}^\top C^\top) / \operatorname{diag}(C P_{Test} P_{Test}^\top C^\top)]$ | X |
| CDMAX0 | $\max[\operatorname{diag}(C P_{Train} A P_{Train}^\top C^\top) / \operatorname{diag}(C P_{Train} P_{Train}^\top C^\top)]$ | X |
| CDMAX2 | $\max[\operatorname{diag}(C P_{Test} A P_{Train}^\top P_{Train} A P_{Test}^\top C^\top) / \operatorname{diag}(C P_{Test} P_{Test}^\top C^\top)]$ | X |
| CDMEAN | $\operatorname{mean}[\operatorname{diag}(C P_{Test} A P_{Test}^\top C^\top) / \operatorname{diag}(C P_{Test} P_{Test}^\top C^\top)]$ | X |
| CDMEAN0 | $\operatorname{mean}[\operatorname{diag}(C P_{Train} A P_{Train}^\top C^\top) / \operatorname{diag}(C P_{Train} P_{Train}^\top C^\top)]$ | X |
| CDMEAN2 | $\operatorname{mean}[\operatorname{diag}(C P_{Test} A P_{Train}^\top P_{Train} A P_{Test}^\top C^\top) / \operatorname{diag}(C P_{Test} P_{Test}^\top C^\top)]$ | X |
| CDMEANMM | $-\operatorname{mean}[\operatorname{diag}(C Z_{Test}(K - \lambda (Z_{Train}^\top M Z_{Train} + \lambda K^{-1})^{-1}) Z_{Test}^\top C^\top) / \operatorname{diag}(C Z_{Test} K Z_{Test}^\top C^\top)]$ | K |
| DOPT | $\log\det(C A C^\top)$ | X |
| EOPT | $\max(\operatorname{eigenval}(C A C^\top))$ | X |
| GAUSSMEANMM | $-\operatorname{mean}(\operatorname{diag}(Z_{Test} K Z_{Test}^\top - Z_{Test} K Z_{Train}^\top (Z_{Train} K Z_{Train}^\top + \lambda I_{ntrain})^{-1} Z_{Train} K Z_{Test}^\top))$ | K |
| GOPTPEV | $\max(\operatorname{eigenval}(C P_{Test} A P_{Test}^\top C^\top))$ | X |
| GOPTPEV2 | $\operatorname{mean}(\operatorname{eigenval}(C P_{Test} A P_{Test}^\top C^\top))$ | X |
| PEVMAX | $\max(\operatorname{diag}(C P_{Test} A P_{Test}^\top C^\top))$ | X |
| PEVMAX0 | $\max(\operatorname{diag}(C P_{Train} A P_{Train}^\top C^\top))$ | X |
| PEVMAX2 | $\max(\operatorname{diag}(C P_{Test} A P_{Train}^\top P_{Train} A P_{Test}^\top C^\top))$ | X |
| PEVMEAN | $\operatorname{mean}(\operatorname{diag}(C P_{Test} A P_{Test}^\top C^\top))$ | X |
| PEVMEAN0 | $\operatorname{mean}(\operatorname{diag}(C P_{Train} A P_{Train}^\top C^\top))$ | X |
| PEVMEAN2 | $\operatorname{mean}(\operatorname{diag}(C P_{Test} A P_{Train}^\top P_{Train} A P_{Test}^\top C^\top))$ | X |
| PEVMEANMM | $\operatorname{mean}(\operatorname{diag}(C Z_{Test}(Z_{Train}^\top M Z_{Train} + \lambda K^{-1})^{-1} Z_{Test}^\top C^\top))$ | K |
| dist_to_test | maximum distance from training set to test set based on `Dst` | D |
| dist_to_test2 | mean distance from training set to test set based on `Dst` | D |
| neg_dist_in_train | negative of minimum distance between pairs in the training set based on `Dst` | D |
| neg_dist_in_train2 | negative of mean distance between distinct pairs in the training set based on `Dst` | D |
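To make the Type-X formulas concrete, here is a minimal base R sketch that transcribes the PEVMEAN and PEVMAX rows of the table directly. The helper names `pev_matrix`, `pevmean_sketch` and `pevmax_sketch` are hypothetical and are not part of STPGA; the package's own implementations may differ in details such as numerical handling, and the random `P` matrix is toy data.

```r
# Hypothetical helper (not an STPGA function): the matrix
# C P_Test (P_Train' P_Train + lambda I)^{-1} P_Test' C'
pev_matrix <- function(Train, Test, P, lambda = 1e-5, C = NULL) {
  PTrain <- P[Train, , drop = FALSE]
  PTest  <- P[Test, , drop = FALSE]
  # A NULL contrast matrix is treated as "no contrast" in this sketch.
  if (!is.null(C)) PTest <- C %*% PTest
  A <- solve(crossprod(PTrain) + lambda * diag(ncol(P)))
  PTest %*% A %*% t(PTest)
}

# Mean and maximum of the diagonal, as in the PEVMEAN and PEVMAX rows.
pevmean_sketch <- function(...) mean(diag(pev_matrix(...)))
pevmax_sketch  <- function(...) max(diag(pev_matrix(...)))

# Toy data: 20 individuals, 3 components, named rows.
set.seed(42)
P <- matrix(rnorm(20 * 3), nrow = 20,
            dimnames = list(paste0("id", 1:20), NULL))
Train <- paste0("id", 1:10)
Test  <- paste0("id", 16:20)

m <- pevmean_sketch(Train, Test, P)
M <- pevmax_sketch(Train, Test, P)
c(PEVMEAN = m, PEVMAX = M)
```

Because the diagonal entries are all positive here, the mean can never exceed the maximum, which is a quick sanity check on any reimplementation.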

## Value

value of the criterion.

## Author(s)

Deniz Akdemir

## Examples

    ## Not run:
    library(STPGA)

    # Example of a new (user-defined) criterion:
    # 1- PEVmax
    STPGAUSERFUNC <- function(Train, Test, P, lambda = 1e-6, C = NULL) {
      PTrain <- P[rownames(P) %in% Train, ]
      PTest <- P[rownames(P) %in% Test, ]
      if (length(Test) == 1) { PTest <- matrix(PTest, nrow = 1) }
      if (!is.null(C)) { PTest <- C %*% PTest }
      PEV <- PTest %*% solve(crossprod(PTrain) + lambda * diag(ncol(PTrain)), t(PTrain))
      PEVmax <- max(diag(tcrossprod(PEV)))
      return(PEVmax)
    }

    ###### Here is an example of usage
    data(iris)
    # We will try to estimate petal width from
    # variables sepal length and width and petal length.
    X <- as.matrix(iris[, 1:4])
    distX <- as.matrix(dist(X))
    rownames(distX) <- colnames(distX) <- rownames(X) <- paste(iris[, 5], rep(1:50, 3), sep = "_")

    # Test data: 25 iris plants selected at random from the virginica family;
    # candidates are the plants in the setosa and versicolor families.
    candidates <- rownames(X)[1:100]
    test <- sample(setdiff(rownames(X), candidates), 25)

    # We want to select 25 examples using the criterion defined in STPGAUSERFUNC.
    # Increase niterations and npop substantially for better convergence.
    ListTrain <- GenAlgForSubsetSelection(P = distX, Candidates = candidates,
      Test = test, ntoselect = 25, npop = 50,
      nelite = 5, mutprob = .8, niterations = 30,
      lambda = 1e-5, errorstat = "STPGAUSERFUNC", plotiters = TRUE)

    ## End(Not run)
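Since every criterion is a quantity to be minimized, a Type-X function can also be scored directly on arbitrary training sets. The following base R sketch does this with a plain random search, which is only a baseline for intuition and not how `GenAlgForSubsetSelection` works. The function name `crit`, the use of the first two principal components as `P`, and the 200 random draws are all assumptions made for this illustration.

```r
# Hypothetical Type-X criterion for illustration, modeled on the
# user-defined PEVmax example (signature: Train, Test, P, lambda, C).
crit <- function(Train, Test, P, lambda = 1e-5, C = NULL) {
  PTrain <- P[rownames(P) %in% Train, , drop = FALSE]
  PTest  <- P[rownames(P) %in% Test, , drop = FALSE]
  if (!is.null(C)) PTest <- C %*% PTest
  PEV <- PTest %*% solve(crossprod(PTrain) + lambda * diag(ncol(PTrain)),
                         t(PTrain))
  max(diag(tcrossprod(PEV)))
}

data(iris)
X <- as.matrix(iris[, 1:4])
rownames(X) <- paste(iris[, 5], rep(1:50, 3), sep = "_")
P <- prcomp(X)$x[, 1:2, drop = FALSE]  # assumption: first two PCs

candidates <- rownames(X)[1:100]
set.seed(123)
test <- sample(setdiff(rownames(X), candidates), 25)

# Random-search baseline: draw 200 training sets of size 25 and score each.
draws <- replicate(200, sample(candidates, 25), simplify = FALSE)
scores <- vapply(draws, crit, numeric(1), Test = test, P = P)

# Smaller is better, so keep the draw with the lowest criterion value.
best_train <- draws[[which.min(scores)]]
summary(scores)
```

Comparing `min(scores)` with the criterion value of a set returned by the genetic algorithm gives a rough sense of how much the search improves on random sampling.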

