lessSEM can be used for regularized SEM and for general-purpose optimization. That is, you can use all optimizers and penalty functions implemented in lessSEM for your own models. To this end, you must define a fitting function, i.e., a function that takes the parameters and returns a single value: the unregularized fit. lessSEM adds the penalty terms to this fitting function and then optimizes the combined function. Currently, there are four ways to use the optimizers in lessSEM:

1. The R interface, which works with fitting functions written in R (see ?lessSEM::gpLasso).
2. The Rcpp interface, which works with pointers to C++ functions (see ?lessSEM::gpLassoCpp).
3. The optimizer interface, used from C++ in your own package (see the vignette The-optimizer-interface and the lessLM package).
4. The lesstimate C++ library, which provides the optimizers without a dependency on lessSEM.

In general, the approaches get faster as you move from 1 to 4. However, you will see the largest performance gains from implementing a gradient function and not just a fitting function. As a rule of thumb:

Use approach 1 if you intend to run your model only a few times, don't want to create a new package, and your model runs fairly fast.
Use approach 2 if you want to increase the speed a bit while keeping the necessary changes to your files manageable.
Use approach 3 or 4 if you create a new package, have some experience with RcppArmadillo, and want to get the best performance.
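To make the idea of "fitting function plus penalty" concrete, the following base-R sketch builds a lasso-penalized objective by hand. The names fitFun and penalizedFit are hypothetical, and the penalty form lambda * sum(|par|) is an assumption for illustration; the scaling that lessSEM uses internally may differ.

```r
# Hypothetical unregularized fit: a simple quadratic with its minimum at (1, 0)
fitFun <- function(par) {
  sum((par - c(a = 1, b = 0))^2)
}

# Combined objective (assumed form): fit + lambda * sum of absolute values
# of the regularized parameters only
penalizedFit <- function(par, fitFun, regularized, lambda) {
  fitFun(par) + lambda * sum(abs(par[regularized]))
}

par <- c(a = 0.5, b = 0.2)
unpenalized <- fitFun(par)
penalized <- penalizedFit(par, fitFun, regularized = "b", lambda = 0.1)
```

Only the parameters named in regularized contribute to the penalty, which mirrors how the intercept is left unpenalized in the regression example that follows.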
In the following, we will demonstrate these approaches using a linear regression model as an example.
Let's start by setting up our linear regression model. To this end, we will simulate a data set:
set.seed(123)
# first, we simulate data for our
# linear regression.
N <- 100 # number of persons
p <- 10 # number of predictors
X <- matrix(rnorm(N*p), nrow = N, ncol = p) # design matrix
b <- c(rep(1,4), rep(0,6)) # true regression weights
y <- X%*%matrix(b,ncol = 1) + rnorm(N,0,.2)
We will now implement a lasso regularized linear regression using the gpLasso interface. This interface is very similar to optim. To use it, we must define our fitting function in R:
# defining the sum-squared-errors:
sseFun <- function(par, y, X, N){
  # par is the parameter vector
  # y is the observed dependent variable
  # X is the design matrix
  # N is the sample size
  pred <- X %*% matrix(par, ncol = 1) # be explicit here:
  # we need par to be a column vector
  sse <- sum((y - pred)^2)
  # we scale with .5/N to get the same results as glmnet
  return((.5/N)*sse)
}
Additionally, we need a labeled vector with starting values:
par <- rep(0, p+1)
names(par) <- paste0("b", 0:p)
print(par)
#>  b0  b1  b2  b3  b4  b5  b6  b7  b8  b9 b10
#>   0   0   0   0   0   0   0   0   0   0   0
Note that we defined one more parameter than there are variables in X. This is because we also want to estimate the intercept. To this end, we extend X:
Xext <- cbind(1,X)
head(Xext)
#>      [,1]        [,2]        [,3]       [,4]       [,5]        [,6]        [,7]        [,8]       [,9]      [,10]      [,11]
#> [1,]    1 -0.56047565 -0.71040656  2.1988103 -0.7152422 -0.07355602 -0.60189285  1.07401226 -0.7282191  0.3562833 -1.0141142
#> [2,]    1 -0.23017749  0.25688371  1.3124130 -0.7526890 -1.16865142 -0.99369859 -0.02734697 -1.5404424 -0.6580102 -0.7913139
#> [3,]    1  1.55870831 -0.24669188 -0.2651451 -0.9385387 -0.63474826  1.02678506 -0.03333034 -0.6930946  0.8552022  0.2995937
#> [4,]    1  0.07050839 -0.34754260  0.5431941 -1.0525133 -0.02884155  0.75106130 -1.51606762  0.1188494  1.1529362  1.6390519
#> [5,]    1  0.12928774 -0.95161857 -0.4143399 -0.4371595  0.67069597 -1.50916654  0.79038534 -1.3647095  0.2762746  1.0846170
#> [6,]    1  1.71506499 -0.04502772 -0.4762469  0.3311792 -1.65054654 -0.09514745 -0.21073418  0.5899827  0.1441047 -0.6245675
Finally, we need to decide which parameters should be regularized and the values for lambda. We want to regularize everything except for the intercept:
(regularized <- paste0("b", 1:p))
#>  [1] "b1"  "b2"  "b3"  "b4"  "b5"  "b6"  "b7"  "b8"  "b9"  "b10"
lambdas <- seq(0,.1,length.out = 20)
Now, we are ready to estimate the model:
library(lessSEM)
l1 <- gpLasso(par = par,
              regularized = regularized,
              fn = sseFun,
              lambdas = lambdas,
              X = Xext,
              y = y,
              N = length(y)
)
head(l1@parameters)
#>        lambda alpha theta         b0        b1        b2        b3       b4          b5           b6           b7          b8
#> 1 0.000000000     1     0 0.02738472 1.0129194 0.9991454 0.9705725 1.027626 0.014036009 -0.007460964 0.0185899238 0.021930771
#> 2 0.005263158     1     0 0.02935302 1.0043737 0.9908934 0.9626258 1.025138 0.003365832  0.000000000 0.0143411319 0.015434822
#> 3 0.010526316     1     0 0.02995132 0.9967095 0.9846674 0.9552789 1.021891 0.000000000  0.000000000 0.0096220740 0.010707768
#> 4 0.015789474     1     0 0.03010607 0.9897339 0.9789423 0.9481496 1.018672 0.000000000  0.000000000 0.0049333012 0.006364327
#> 5 0.021052632     1     0 0.03029739 0.9827288 0.9732058 0.9409861 1.015363 0.000000000  0.000000000 0.0001782005 0.002036124
#> 6 0.026315789     1     0 0.03112551 0.9753354 0.9670622 0.9338614 1.011552 0.000000000  0.000000000 0.0000000000 0.000000000
#>             b9         b10
#> 1 -0.009900077 0.027401044
#> 2 -0.007939443 0.022297575
#> 3 -0.005256845 0.017465446
#> 4 -0.002392921 0.012713123
#> 5  0.000000000 0.007969754
#> 6  0.000000000 0.003304080
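With lambda = 0 the penalty vanishes, so the first row of the result should agree with the ordinary least-squares solution up to the tolerance of the optimizer. The base-R sketch below regenerates the simulated data and solves the normal equations directly; the object name ols is an assumption for illustration and is not part of lessSEM.

```r
set.seed(123)
N <- 100 # number of persons
p <- 10  # number of predictors
X <- matrix(rnorm(N * p), nrow = N, ncol = p)
b <- c(rep(1, 4), rep(0, 6))
y <- X %*% matrix(b, ncol = 1) + rnorm(N, 0, .2)
Xext <- cbind(1, X) # add the intercept column

# closed-form least-squares solution of the normal equations
ols <- solve(crossprod(Xext), crossprod(Xext, y))
rownames(ols) <- paste0("b", 0:p)
round(t(ols), 4)
```

Comparing these values with the lambda = 0 row is a cheap way to confirm that the fitting function and the extended design matrix are set up as intended.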
Note that we did not specify the gradients of our function. In this case, lessSEM will use numDeriv to compute the gradients. However, if you know how to specify the gradients, this can result in faster estimation:
sseGrad <- function(par, y, X, N){
  gradients <- (-2.0*t(X) %*% y + 2.0*t(X)%*%X%*%matrix(par,ncol = 1))
  gradients <- (.5/length(y))*gradients
  return(t(gradients))
}
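Before handing a hand-written gradient to the optimizer, it can be worth checking it against a numerical approximation. The sketch below does this with central finite differences in base R. The helper numGrad and the small test data (Xc, yc, parc) are hypothetical and not part of lessSEM; the two functions are restated so that the block runs on its own.

```r
sseFun <- function(par, y, X, N) {
  (.5 / N) * sum((y - X %*% matrix(par, ncol = 1))^2)
}
sseGrad <- function(par, y, X, N) {
  t((.5 / N) * (-2 * t(X) %*% y + 2 * t(X) %*% X %*% matrix(par, ncol = 1)))
}

# central finite differences (hypothetical helper)
numGrad <- function(fn, par, ..., eps = 1e-6) {
  vapply(seq_along(par), function(i) {
    step <- replace(numeric(length(par)), i, eps)
    (fn(par + step, ...) - fn(par - step, ...)) / (2 * eps)
  }, numeric(1))
}

set.seed(1)
Xc <- cbind(1, matrix(rnorm(50 * 3), nrow = 50))
yc <- rnorm(50)
parc <- c(0.1, -0.2, 0.3, 0.4)

# largest absolute deviation between numerical and analytic gradient
maxDiff <- max(abs(numGrad(sseFun, parc, y = yc, X = Xc, N = 50) -
                   as.vector(sseGrad(parc, y = yc, X = Xc, N = 50))))
maxDiff
```

A deviation far above the finite-difference error would point to a mistake in the analytic gradient, for example a missing scaling factor.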
l1 <- gpLasso(par = par,
              regularized = regularized,
              fn = sseFun,
              gr = sseGrad,
              lambdas = lambdas,
              X = Xext,
              y = y,
              N = length(y)
)
head(l1@parameters)
#>        lambda alpha theta         b0        b1        b2        b3       b4          b5           b6           b7          b8
#> 1 0.000000000     1     0 0.02738485 1.0129200 0.9991452 0.9705725 1.027626 0.014034994 -0.007460252 0.0185901898 0.021930702
#> 2 0.005263158     1     0 0.02935325 1.0043732 0.9908928 0.9626258 1.025139 0.003364951  0.000000000 0.0143418480 0.015434447
#> 3 0.010526316     1     0 0.02995023 0.9967094 0.9846669 0.9552792 1.021892 0.000000000  0.000000000 0.0096217574 0.010707383
#> 4 0.015789474     1     0 0.03010649 0.9897330 0.9789426 0.9481493 1.018672 0.000000000  0.000000000 0.0049332838 0.006363868
#> 5 0.021052632     1     0 0.03029729 0.9827286 0.9732062 0.9409869 1.015362 0.000000000  0.000000000 0.0001772169 0.002036300
#> 6 0.026315789     1     0 0.03112481 0.9753368 0.9670620 0.9338616 1.011553 0.000000000  0.000000000 0.0000000000 0.000000000
#>             b9         b10
#> 1 -0.009900699 0.027400748
#> 2 -0.007939640 0.022297136
#> 3 -0.005257048 0.017465134
#> 4 -0.002393522 0.012713116
#> 5  0.000000000 0.007969996
#> 6  0.000000000 0.003303777
Here is a short comparison of running both models 5 times each:
Runtime in seconds without gradients:
#> [1] 0.2155101 0.2029781 0.1960189 0.2026141 0.2020009
Runtime in seconds with gradients:
#> [1] 0.02016687 0.02007699 0.02003384 0.01954794 0.02730393
That's quite a speedup!
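The code used to measure these runtimes is not shown. One possible way to collect such timings in base R is sketched below; the helper timeRuns is hypothetical, and the workload is only a stand-in so that the block runs without lessSEM installed. It is not necessarily how the numbers above were obtained.

```r
# hypothetical helper: run `f` several times and return elapsed seconds
timeRuns <- function(f, times = 5) {
  vapply(seq_len(times), function(i) {
    unname(system.time(f())["elapsed"])
  }, numeric(1))
}

# stand-in workload; with lessSEM installed, one would instead wrap the
# gpLasso calls from above in a function and pass that function as `f`
runtimes <- timeRuns(function() sum(sqrt(seq_len(1e5))), times = 5)
runtimes
```

Repeating the measurement several times, as done here, helps to separate systematic differences between approaches from run-to-run noise.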
Note that you can also pass a C++ function to gpLasso similar to the approach above:
library(RcppArmadillo)
library(Rcpp)

linreg <- '
// [[Rcpp::depends(RcppArmadillo)]]
#include <RcppArmadillo.h>

// [[Rcpp::export]]
double fitfunction(const arma::colvec parameters, const arma::mat X, const arma::colvec y, const int N){

  // compute the sum of squared errors:
  arma::mat sse = arma::trans(y-X*parameters)*(y-X*parameters);

  // other packages, such as glmnet, scale the sse with
  // 1/(2*N), where N is the sample size. We will do that here as well
  sse *= 1.0/(2.0 * N);

  // note: We must return a double, but the sse is a matrix
  // To get a double, just return the single value that is in
  // this matrix:
  return(sse(0,0));
}

// [[Rcpp::export]]
arma::rowvec gradientfunction(const arma::colvec parameters, const arma::mat X, const arma::colvec y, const int N){

  // note: we want to return our gradients as row-vector; therefore,
  // we have to transpose the resulting column-vector:
  arma::rowvec gradients = arma::trans(-2.0*X.t() * y + 2.0*X.t()*X*parameters);

  // other packages, such as glmnet, scale the sse with
  // 1/(2*N), where N is the sample size. We will do that here as well
  gradients *= (.5/N);

  return(gradients);
}'

Rcpp::sourceCpp(code = linreg)
Run the model as before:
l1 <- gpLasso(par = par,
              regularized = regularized,
              fn = fitfunction,
              gr = gradientfunction,
              lambdas = lambdas,
              X = Xext,
              y = y,
              N = length(y)
)
head(l1@parameters)
#>        lambda alpha theta         b0        b1        b2        b3       b4          b5           b6           b7          b8
#> 1 0.000000000     1     0 0.02738527 1.0129194 0.9991448 0.9705722 1.027624 0.014035744 -0.007459583 0.0185892822 0.021930374
#> 2 0.005263158     1     0 0.02935316 1.0043741 0.9908936 0.9626259 1.025139 0.003366124  0.000000000 0.0143421453 0.015435255
#> 3 0.010526316     1     0 0.02995019 0.9967089 0.9846670 0.9552796 1.021892 0.000000000  0.000000000 0.0096214361 0.010707812
#> 4 0.015789474     1     0 0.03010669 0.9897326 0.9789426 0.9481493 1.018672 0.000000000  0.000000000 0.0049330908 0.006364232
#> 5 0.021052632     1     0 0.03029700 0.9827285 0.9732063 0.9409868 1.015362 0.000000000  0.000000000 0.0001764845 0.002037008
#> 6 0.026315789     1     0 0.03112464 0.9753365 0.9670623 0.9338615 1.011553 0.000000000  0.000000000 0.0000000000 0.000000000
#>             b9         b10
#> 1 -0.009900549 0.027401230
#> 2 -0.007938658 0.022297962
#> 3 -0.005256606 0.017465214
#> 4 -0.002393757 0.012713307
#> 5  0.000000000 0.007971003
#> 6  0.000000000 0.003304894
The runtime in seconds with C++ is:
#> [1] 0.01715302 0.01618695 0.01546001 0.01603198 0.01513100
This is even lower than what we had before!
While using the Rcpp functions defined above was quite fast for our linear regression, it can still be fairly slow for more involved models (e.g., SEM). This is because the optimizer has to go back and forth between R and C++. To reduce this overhead, we can use the second approach. Here, instead of passing an Rcpp function which is then executed in R, we pass a pointer to the underlying C++ function. This approach is more constrained than the one presented above: both the fitting function and the gradient function must take exactly two arguments, a const Rcpp::NumericVector& (the parameters) and an Rcpp::List& (everything else). While this seems restrictive, note that we can pass virtually anything we want in a list.

This may be a bit overwhelming at first, so we will go through it step by step.
We already defined a fitting function and a gradient function for our linear regression model in the example above. However, we often do not know the gradients in closed form. If you don't have a gradient function, you can try a numerical approximation. More details can be found here.
Note that our fitting function and our gradient function do not comply with the constraints mentioned above. That is, they take more than two arguments (const arma::colvec parameters, const arma::mat X, const arma::colvec y, const int N), and these arguments are not a const Rcpp::NumericVector& and an Rcpp::List&.

How can we make this work? The parameter vector const Rcpp::NumericVector& will hold all elements of the arma::colvec parameters of our old function. The Rcpp::List& must contain all of the other elements (X, y, N). Let's start by creating this list, which we will call data:
data <- list("X" = Xext, "y" = y, "N" = length(y))
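The repacking idea can be previewed in plain R before moving to C++. The sketch below wraps the four-argument R fitting function in a two-argument version that takes the parameters and a single list. The names sseFunTwoArgs, dataList, Xd, and yd are hypothetical; gpLassoCpp itself requires C++ functions, so this only illustrates the change in signature.

```r
# the four-argument fitting function from the R example
sseFun <- function(par, y, X, N) {
  (.5 / N) * sum((y - X %*% matrix(par, ncol = 1))^2)
}

# hypothetical two-argument version: parameters plus one list
sseFunTwoArgs <- function(parameters, data) {
  sseFun(par = parameters, y = data$y, X = data$X, N = data$N)
}

set.seed(2)
Xd <- cbind(1, matrix(rnorm(20 * 2), nrow = 20))
yd <- rnorm(20)
dataList <- list("X" = Xd, "y" = yd, "N" = length(yd))
parameters <- c(b0 = 0, b1 = 0.5, b2 = -0.5)

# both calls evaluate the same fit
fitDirect <- sseFun(parameters, y = yd, X = Xd, N = length(yd))
fitViaList <- sseFunTwoArgs(parameters, dataList)
```

Everything the function needs besides the parameters travels inside the list, which is why the two-argument restriction costs little flexibility.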
Next, we have to change our functions to make things work:
linreg <- '
// [[Rcpp::depends(RcppArmadillo)]]
#include <RcppArmadillo.h>

// [[Rcpp::export]]
double fitfunction(const Rcpp::NumericVector& parameters, Rcpp::List& data){

  // our function now only takes the two specified arguments: a
  // const Rcpp::NumericVector& and an Rcpp::List&.
  // We have to extract all elements from the list:
  arma::colvec y = Rcpp::as<arma::colvec>(data["y"]); // the dependent variable
  arma::mat X = Rcpp::as<arma::mat>(data["X"]); // the design matrix
  int N = Rcpp::as<int>(data["N"]); // the sample size

  // Next, we want to get the parameters as a column-vector:
  arma::colvec b = Rcpp::as<arma::colvec>(parameters);

  // compute the sum of squared errors:
  arma::mat sse = arma::trans(y-X*b)*(y-X*b);

  // other packages, such as glmnet, scale the sse with
  // 1/(2*N), where N is the sample size. We will do that here as well
  sse *= 1.0/(2.0 * N);

  // note: We must return a double, but the sse is a matrix
  // To get a double, just return the single value that is in
  // this matrix:
  return(sse(0,0));
}

// [[Rcpp::export]]
arma::rowvec gradientfunction(const Rcpp::NumericVector& parameters, Rcpp::List& data){

  // our function now only takes the two specified arguments: a
  // const Rcpp::NumericVector& and an Rcpp::List&.
  // We have to extract all elements from the list:
  arma::colvec y = Rcpp::as<arma::colvec>(data["y"]); // the dependent variable
  arma::mat X = Rcpp::as<arma::mat>(data["X"]); // the design matrix
  int N = Rcpp::as<int>(data["N"]); // the sample size

  // Next, we want to get the parameters as a column-vector:
  arma::colvec b = Rcpp::as<arma::colvec>(parameters);

  // note: we want to return our gradients as row-vector; therefore,
  // we have to transpose the resulting column-vector:
  arma::rowvec gradients = arma::trans(-2.0*X.t() * y + 2.0*X.t()*X*b);

  // other packages, such as glmnet, scale the sse with
  // 1/(2*N), where N is the sample size. We will do that here as well
  gradients *= (.5/N);

  return(gradients);
}
'
That's it, our functions have been transformed!
This is where it gets really tricky! We can't just pass our functions to C++. However, we can create pointers. These have to be generated in C++, and this can be difficult to get right. To simplify the process, we have created a function which helps with setting things up:
cat(lessSEM::makePtrs(fitFunName = "fitfunction", # name of the function in C++
                      gradFunName = "gradientfunction" # name of the function in C++
)
)
#>
#> // INSTRUCTIONS: ADD THE FOLLOWING LINES TO YOUR C++ FUNCTIONS
#>
#> // IF RCPPARMADILLO IS NOT IMPORTED YET, UNCOMMENT THE FOLLOWING TWO LINES
#> // // [[Rcpp::depends(RcppArmadillo)]]
#> // #include <RcppArmadillo.h>
#>
#> // Dirk Eddelbuettel at
#> // https://gallery.rcpp.org/articles/passing-cpp-function-pointers/
#>
#> typedef double (*fitFunPtr)(const Rcpp::NumericVector&, //parameters
#>                             Rcpp::List& //additional elements
#> );
#> typedef Rcpp::XPtr<fitFunPtr> fitFunPtr_t;
#>
#> typedef arma::rowvec (*gradientFunPtr)(const Rcpp::NumericVector&, //parameters
#>                                        Rcpp::List& //additional elements
#> );
#> typedef Rcpp::XPtr<gradientFunPtr> gradientFunPtr_t;
#>
#> // [[Rcpp::export]]
#> fitFunPtr_t fitfunctionPtr() {
#>   return(fitFunPtr_t(new fitFunPtr(&fitfunction)));
#> }
#>
#> // [[Rcpp::export]]
#> gradientFunPtr_t gradientfunctionPtr() {
#>   return(gradientFunPtr_t(new gradientFunPtr(&gradientfunction)));
#> }
Let's follow the instructions and add the lines to our C++ functions:
linreg <- '
// [[Rcpp::depends(RcppArmadillo)]]
#include <RcppArmadillo.h>

// [[Rcpp::export]]
double fitfunction(const Rcpp::NumericVector& parameters, Rcpp::List& data){

  // our function now only takes the two specified arguments: a
  // const Rcpp::NumericVector& and an Rcpp::List&.
  // We have to extract all elements from the list:
  arma::colvec y = Rcpp::as<arma::colvec>(data["y"]); // the dependent variable
  arma::mat X = Rcpp::as<arma::mat>(data["X"]); // the design matrix
  int N = Rcpp::as<int>(data["N"]); // the sample size

  // Next, we want to get the parameters as a column-vector:
  arma::colvec b = Rcpp::as<arma::colvec>(parameters);

  // compute the sum of squared errors:
  arma::mat sse = arma::trans(y-X*b)*(y-X*b);

  // other packages, such as glmnet, scale the sse with
  // 1/(2*N), where N is the sample size. We will do that here as well
  sse *= 1.0/(2.0 * N);

  // note: We must return a double, but the sse is a matrix
  // To get a double, just return the single value that is in
  // this matrix:
  return(sse(0,0));
}

// [[Rcpp::export]]
arma::rowvec gradientfunction(const Rcpp::NumericVector& parameters, Rcpp::List& data){

  // our function now only takes the two specified arguments: a
  // const Rcpp::NumericVector& and an Rcpp::List&.
  // We have to extract all elements from the list:
  arma::colvec y = Rcpp::as<arma::colvec>(data["y"]); // the dependent variable
  arma::mat X = Rcpp::as<arma::mat>(data["X"]); // the design matrix
  int N = Rcpp::as<int>(data["N"]); // the sample size

  // Next, we want to get the parameters as a column-vector:
  arma::colvec b = Rcpp::as<arma::colvec>(parameters);

  // note: we want to return our gradients as row-vector; therefore,
  // we have to transpose the resulting column-vector:
  arma::rowvec gradients = arma::trans(-2.0*X.t() * y + 2.0*X.t()*X*b);

  // other packages, such as glmnet, scale the sse with
  // 1/(2*N), where N is the sample size. We will do that here as well
  gradients *= (.5/N);

  return(gradients);
}

/// THE FOLLOWING PART IS NEW:

// INSTRUCTIONS: ADD THE FOLLOWING LINES TO YOUR C++ FUNCTIONS

// IF RCPPARMADILLO IS NOT IMPORTED YET, UNCOMMENT THE FOLLOWING TWO LINES
// // [[Rcpp::depends(RcppArmadillo)]]
// #include <RcppArmadillo.h>

// Dirk Eddelbuettel at
// https://gallery.rcpp.org/articles/passing-cpp-function-pointers/

typedef double (*fitFunPtr)(const Rcpp::NumericVector&, //parameters
                            Rcpp::List& //additional elements
);
typedef Rcpp::XPtr<fitFunPtr> fitFunPtr_t;

typedef arma::rowvec (*gradientFunPtr)(const Rcpp::NumericVector&, //parameters
                                       Rcpp::List& //additional elements
);
typedef Rcpp::XPtr<gradientFunPtr> gradientFunPtr_t;

// [[Rcpp::export]]
fitFunPtr_t fitfunctionPtr() {
  return(fitFunPtr_t(new fitFunPtr(&fitfunction)));
}

// [[Rcpp::export]]
gradientFunPtr_t gradientfunctionPtr() {
  return(gradientFunPtr_t(new gradientFunPtr(&gradientfunction)));
}
'
Compile the functions using Rcpp:
Rcpp::sourceCpp(code = linreg)
Great! Now that this is out of the way, we can create the pointers to our functions:
ffp <- fitfunctionPtr() # create the pointer to the fitting function
# Note that the name of this function will depend on the name of your fitting function.
# For instance, if your fitting function is called sse, then the pointer will be created
# with ffp <- ssePtr()
gfp <- gradientfunctionPtr() # create the pointer to the gradient function
# Note that the name of this function will depend on the name of your gradient function.
# For instance, if your gradient function is called sseGradient, then the pointer will be created
# with gfp <- sseGradientPtr()
The last step is to call the general purpose optimization. To this end, use the gpLassoCpp function:
l1 <- gpLassoCpp(par = par,
                 regularized = regularized,
                 # important: pass the pointers!
                 fn = ffp,
                 gr = gfp,
                 lambdas = lambdas,
                 # finally, pass the list which the fitting function and the
                 # gradient function need:
                 additionalArguments = data
)
head(l1@parameters)
#>        lambda alpha theta         b0        b1        b2        b3       b4          b5           b6           b7          b8
#> 1 0.000000000     1     0 0.02738542 1.0129198 0.9991455 0.9705733 1.027625 0.014037188 -0.007460885 0.0185907073 0.021930984
#> 2 0.005263158     1     0 0.02935271 1.0043736 0.9908928 0.9626259 1.025139 0.003365862  0.000000000 0.0143413234 0.015434697
#> 3 0.010526316     1     0 0.02995027 0.9967094 0.9846668 0.9552792 1.021892 0.000000000  0.000000000 0.0096220212 0.010707439
#> 4 0.015789474     1     0 0.03010668 0.9897329 0.9789425 0.9481493 1.018672 0.000000000  0.000000000 0.0049333261 0.006364019
#> 5 0.021052632     1     0 0.03029739 0.9827288 0.9732059 0.9409868 1.015363 0.000000000  0.000000000 0.0001773121 0.002036026
#> 6 0.026315789     1     0 0.03112461 0.9753368 0.9670620 0.9338617 1.011553 0.000000000  0.000000000 0.0000000000 0.000000000
#>             b9         b10
#> 1 -0.009900023 0.027401255
#> 2 -0.007939417 0.022297387
#> 3 -0.005256688 0.017464767
#> 4 -0.002393495 0.012713141
#> 5  0.000000000 0.007969742
#> 6  0.000000000 0.003303734
Benchmarking this approach results in:
#> [1] 0.01474500 0.01400304 0.01196003 0.01131010 0.01107502
So, we have reduced our runtime even more!
The third approach requires a more elaborate setup, which is why we have created a whole package to demonstrate it. You will find more information in the vignette The-optimizer-interface and in the lessLM package. If you just want the optimizers and don't want to depend on the lessSEM package, we recommend that you copy the lesstimate C++ library into your package's inst/include folder.

This approach arrives at the same parameter estimates:
#>              b0        b1        b2        b3       b4          b5           b6           b7          b8           b9
#> [1,] 0.02734701 1.0129361 0.9991629 0.9705501 1.027728 0.013993181 -0.007491533 0.0186210155 0.021974963 -0.009975776
#> [2,] 0.02939675 1.0043635 0.9908681 0.9626493 1.025035 0.003400965  0.000000000 0.0143089363 0.015388336 -0.007860992
#> [3,] 0.02998680 0.9967117 0.9846504 0.9552960 1.021803 0.000000000  0.000000000 0.0095927088 0.010679371 -0.005183552
#> [4,] 0.03006777 0.9897315 0.9789595 0.9481301 1.018774 0.000000000  0.000000000 0.0049636831 0.006397664 -0.002474334
#> [5,] 0.03032345 0.9827556 0.9731799 0.9409662 1.015441 0.000000000  0.000000000 0.0001374354 0.002100692  0.000000000
#> [6,] 0.03111085 0.9753325 0.9670615 0.9338506 1.011607 0.000000000  0.000000000 0.0000000000 0.000000000  0.000000000
#>              b10
#> [1,] 0.027466466
#> [2,] 0.022231196
#> [3,] 0.017406513
#> [4,] 0.012779758
#> [5,] 0.008033475
#> [6,] 0.003352431
And the run times are even lower:
#> [1] 0.002497911 0.001735926 0.001685858 0.001821041 0.001929998