CorReg-package: Quick tutorial for CorReg package


Description

Sequential linear regression based on a structural equation model (explicit correlations). It is designed to handle highly correlated datasets. We first search for an explicit model of the correlations among the covariates by linear regression; this structure is then interpreted and used to reduce both the dimension and the correlations in the main regression on the response variable.
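The two-step idea described above can be illustrated with a minimal sketch in base R. It does not call CorReg and is not the package's algorithm: the simulated data, the variable names and the choice of dropping the sub-regressed covariate by hand are all assumptions made for illustration.

```r
# Illustrative sketch in base R only; none of this uses the CorReg API.
set.seed(1)
n  <- 200
x1 <- rnorm(n)
x2 <- rnorm(n)
x3 <- x1 + x2 + rnorm(n, sd = 0.1)        # x3 is almost a linear function of x1 and x2
y  <- 1 + 2 * x1 - x2 + 0.5 * x3 + rnorm(n)
dat <- data.frame(y, x1, x2, x3)

# Step 1: explicit model of the correlations (a sub-regression among covariates)
sub_fit <- lm(x3 ~ x1 + x2, data = dat)
sub_r2  <- summary(sub_fit)$r.squared     # close to 1: x3 is redundant given x1, x2

# Step 2: main regression on the response, with and without the redundant covariate
full_fit    <- lm(y ~ x1 + x2 + x3, data = dat)
reduced_fit <- lm(y ~ x1 + x2, data = dat)

# Standard error of the x1 coefficient in each model
se_full    <- summary(full_fit)$coefficients["x1", "Std. Error"]
se_reduced <- summary(reduced_fit)$coefficients["x1", "Std. Error"]
```

In this toy setting the reduced model estimates the coefficient of x1 with a much smaller standard error than the full model, because the near-collinear covariate x3 has been removed from the main regression.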

Details

CorReg: see www.correg.org for the article and the PhD thesis about CorReg.

Author(s)

Maintainer: Clement THERY <clement.thery@arcelormittal.com>

References

Thery, C. (2015). Model-based covariable decorrelation in linear regression (CorReg): application to missing data and to steel industry. PhD thesis, available at http://www.theses.fr/2015LIL10060.

Examples

   ## Not run: 
   library(CorReg)
   #dataset generation
   base=mixture_generator(n=15,p=10,ratio=0.4,tp1=1,tp2=1,tp3=1,positive=0.5,
                          R2Y=0.8,R2=0.9,scale=TRUE,max_compl=3,lambda=1)
   X_appr=base$X_appr #learning sample
   Y_appr=base$Y_appr #response variable for the learning sample
   Y_test=base$Y_test #response variable for the validation sample
   X_test=base$X_test #validation sample
   
   TrueZ=base$Z#True generative structure (binary adjacency matrix)
   #Z_i,j=1 means that Xj linearly depends on Xi
   
   #density estimation for the MCMC (with Gaussian Mixtures)
   density=density_estimation(X=X_appr,nbclustmax=10,detailed=TRUE)
   Bic_null_vect=density$BIC_vect# vector of the BIC found (1 value per covariate)
   
   #MCMC to find the structure
   res=structureFinder(X=X_appr,verbose=0,reject=0,Maxiter=900,
               nbini=20,candidates=-1,Bic_null_vect=Bic_null_vect,star=TRUE,p1max=15,clean=TRUE)
   hatZ=res$Z_opt #found structure (adjacency matrix)
   hatBic=res$bic_opt #associated BIC
   
   #BIC comparison between true and found structure
   bicopt_vect=BicZ(X=X_appr,Z=hatZ,Bic_null_vect=Bic_null_vect)
   bicopt_vrai=BicZ(X=X_appr,Z=TrueZ,Bic_null_vect=Bic_null_vect)
   sum(bicopt_vect);sum(bicopt_vrai)
   
   #Structure comparison
   compZ=compare_struct(trueZ=TrueZ,Zalgo=hatZ)#qualitative comparison
   
   #interpretation of found and true structure ordered by increasing R2
   readZ(Z=hatZ,crit="R2",X=X_appr,output="all",order=1)# row with <NA> coef: name of the sub-regressed covariate
   readZ(Z=TrueZ,crit="R2",X=X_appr,output="all",order=1)# row with <NA> coef: name of the sub-regressed covariate
   
   #Regression coefficients estimation
   select="NULL" #without variable selection (otherwise, choose "lar" for example)
   resY=correg(X=X_appr,Y=Y_appr,Z=hatZ,compl=TRUE,expl=TRUE,pred=TRUE,
               select=select,K=10)
   
   #MSE computation
   MSE_complete=MSE_loc(Y=Y_test,X=X_test,A=resY$compl$A)#classical model on X
   MSE_marginal=MSE_loc(Y=Y_test,X=X_test,A=resY$expl$A)#reduced model without correlations
   MSE_plugin=MSE_loc(Y=Y_test,X=X_test,A=resY$pred$A)#plug-in model
   MSE_true=MSE_loc(Y=Y_test,X=X_test,A=base$A)# True model
   
   
   #MSE comparison
   MSE=data.frame(MSE_complete,MSE_marginal,MSE_plugin,MSE_true)
   MSE#estimated structure
   compZ$true_left;compZ$false_left
   barplot(as.matrix(MSE),main="MSE on validation dataset", sub=paste("select=",select))
   abline(h=MSE_complete,col="red")
   
## End(Not run)
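The adjacency-matrix convention noted in the example (Z[i,j]=1 means that Xj linearly depends on Xi) can be made concrete with a small hand-built matrix. This is a sketch in base R under that stated convention; the 5-covariate structure and the object names are hypothetical and nothing here calls CorReg.

```r
# Hypothetical structure on 5 covariates, built by hand for illustration
p <- 5
Z <- matrix(0, nrow = p, ncol = p)
Z[c(1, 2), 4] <- 1   # X4 depends linearly on X1 and X2
Z[3, 5]       <- 1   # X5 depends linearly on X3

# Columns containing at least one 1 are the sub-regressed covariates
subregressed <- which(colSums(Z) > 0)

# For each sub-regressed covariate, the rows holding a 1 are its predictors
predictors <- lapply(subregressed, function(j) which(Z[, j] == 1))
names(predictors) <- paste0("X", subregressed)
```

Here `subregressed` lists columns 4 and 5, and `predictors` maps X4 to covariates 1 and 2 and X5 to covariate 3, which is the same reading that the lists printed by readZ give for the found and true structures.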

Example output

                                                                                                                    
                                                                                                                    
          CCCCCCCCCCCCC                                         RRRRRRRRRRRRRRRRR                                         
        CCC::::::::::::C                                        R::::::::::::::::R                                          
       CC:::::::::::::::C                                       R::::::RRRRRR:::::R                                         
      C:::::CCCCCCCC::::C                                       RR:::::R     R:::::R                                        
      C:::::C       CCCCCC   ooooooooooo      rrrrr   rrrrrrrrr   R::::R     R:::::R     eeeeeeeeeeee       ggggggggg   ggggg
      C:::::C               oo:::::::::::oo   r::::rrr:::::::::r  R::::R     R:::::R   ee::::::::::::ee    g:::::::::ggg::::g
      C:::::C              o:::::::::::::::o  r:::::::::::::::::r R::::RRRRRR:::::R   e::::::eeeee:::::ee g:::::::::::::::::g
      C:::::C              o:::::ooooo:::::or r::::::rrrrr::::::r R:::::::::::::RR   e::::::e     e:::::eg::::::ggggg::::::gg
      C:::::C              o::::o     o::::o  r:::::r     r:::::r R::::RRRRRR:::::R  e:::::::eeeee::::::eg:::::g     g:::::g 
      C:::::C              o::::o     o::::o  r:::::r     rrrrrrr R::::R     R:::::R e:::::::::::::::::e g:::::g     g:::::g 
      C:::::C              o::::o     o::::o  r:::::r             R::::R     R:::::R e::::::eeeeeeeeeee  g:::::g     g:::::g 
      C:::::C       CCCCCC o::::o     o::::o  r:::::r             R::::R     R:::::R e:::::::e           g::::::g    g:::::g 
      C:::::CCCCCCCC::::C  o:::::ooooo:::::o  r:::::r           RR:::::R     R:::::R e::::::::e          g:::::::ggggg:::::g 
       CC:::::::::::::::C  o:::::::::::::::o  r:::::r           R::::::R     R:::::R  e::::::::eeeeeeee   g::::::::::::::::g 
        CCC::::::::::::C   oo:::::::::::oo    r:::::r           R::::::R     R:::::R   ee:::::::::::::e    gg::::::::::::::g 
          CCCCCCCCCCCCC      ooooooooooo      rrrrrrr           RRRRRRRR     RRRRRRR     eeeeeeeeeeeeee      gggggggg::::::g 
                                                                                                                     g:::::g 
                                                                                                         gggggg      g:::::g 
                                                                                                         g:::::gg   gg:::::g 
                                                                                                         g::::::ggg:::::::g 
                                                                                                          gg:::::::::::::g  
                                                                                                             ggg::::::ggg    
                                                                                                                gggggg       
          
 The Concept, the Method, the Power
[1] 304.0965
[1] 309.8015
[[1]]
                coefs       var
1   0.860796064728701        R2
2                <NA>         4
3     0.6689332903619 intercept
4  -0.110153788136904         1
5 -0.0701316998801885         2
6 -0.0947649813258923         3
7   0.546137123454117         5
8  -0.306034391822208         6
9   0.499799163689978        10

[[2]]
                coefs       var
1   0.906694566324102        R2
2                <NA>         8
3   0.662721471933004 intercept
4  -0.350154869623452         1
5  -0.609547545858945         2
6   0.303872429803662         3
7   0.122547651878245         5
8 -0.0907174627644957         6
9 -0.0852283244538394        10

[[3]]
               coefs       var
1  0.947612765244848        R2
2               <NA>         7
3 -0.834996674689482 intercept
4   0.28704785644588         2
5  0.130256671379444         3
6 -0.492989687235268         6
7  0.189741588018843        10

[[4]]
                coefs       var
1   0.968872850107611        R2
2                <NA>         9
3  -0.904092630593816 intercept
4  0.0350337710754986         1
5  0.0296776573723411         2
6 -0.0935515819021165         3
7   0.237736279084439         5
8  -0.228564405596821         6

[[1]]
               coefs       var
1  0.849541328696498        R2
2               <NA>         4
3  0.687068057281914 intercept
4  0.489960095634011         5
5 -0.320208553629766         6
6   0.46618613829287        10

[[2]]
               coefs       var
1   0.88690253151389        R2
2               <NA>         8
3  0.652080922923104 intercept
4 -0.333852189881848         1
5 -0.587227722879119         2
6  0.307561471711988         3

[[3]]
               coefs       var
1  0.912457752498476        R2
2               <NA>         7
3 -0.849879599145825 intercept
4  0.304525304143961         2
5 -0.477494586113415         6
6  0.199889055208016        10

[[4]]
               coefs       var
1  0.952057272825782        R2
2               <NA>         9
3 -0.903578686360595 intercept
4 -0.100809529628965         3
5  0.252299016357404         5
6 -0.229812761317557         6

  MSE_complete MSE_marginal MSE_plugin MSE_true
1     628.4427     404.3436   612.0733 267.9941
[1] 4
[1] 0
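The MSE values above are prediction errors on the validation sample. As a rough sketch of what such a quantity measures, the helper below computes a mean squared error in base R. It is a hypothetical stand-in, not the package's MSE_loc, and it assumes the coefficient vector A stores the intercept first, followed by one slope per column of X.

```r
# Hypothetical helper (not CorReg's MSE_loc); assumes A = c(intercept, slopes)
mse_sketch <- function(Y, X, A) {
  pred <- cbind(1, as.matrix(X)) %*% A   # prepend a column of ones for the intercept
  mean((Y - pred)^2)
}

# Small simulated validation sample with a noiseless response
set.seed(2)
X_val  <- matrix(rnorm(40), nrow = 20, ncol = 2)
A_true <- c(1, 2, -1)
Y_val  <- as.vector(cbind(1, X_val) %*% A_true)

mse_exact <- mse_sketch(Y_val, X_val, A_true)      # coefficients that generated Y_val
mse_null  <- mse_sketch(Y_val, X_val, c(0, 0, 0))  # all-zero coefficients
```

With the generating coefficients the error is zero, while the all-zero coefficient vector gives a strictly positive error; comparing several coefficient vectors on the same validation data is what the MSE table above does for the complete, marginal, plug-in and true models.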

CorReg documentation built on Sept. 6, 2019, 3 a.m.