Gauss-Seidel and Jacobi iterations to solve Ax=b

To explore this functionality, let's first construct a diagonally dominant matrix A, so that the resulting linear system is solvable with these iterative methods.

library(stsci6520hw2)
n <- 1e2
A <- matrix(rnorm(n * n), nrow = n)
diag(A) <- 0
# set each diagonal entry to the sum of the absolute off-diagonal entries
# in its row, which makes A (weakly) diagonally dominant
diag(A) <- apply(A, 1, FUN = function(x) sum(abs(x)))
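As a quick sanity check, every row should now be (weakly) diagonally dominant:

all(abs(diag(A)) >= rowSums(abs(A)) - abs(diag(A)))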

Then we can pick a random x, compute b, and our task is to recover x from A and b:

x <- rnorm(n)
b <- A %*% x
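For comparison, base R's direct solver gives a reference solution:

x_direct <- solve(A, b)
max(abs(x - x_direct))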

To solve for x using the Gauss-Seidel method, we can run:

x_gs <- solve_ols(A, b, method = "Gauss-Seidel", niter = 1e2)
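For reference, a single Gauss-Seidel sweep looks roughly like the following. This is a minimal sketch of the textbook update, not necessarily the package's internal code, and gs_sweep is a hypothetical helper:

# one Gauss-Seidel sweep: x[i] is overwritten as we go,
# so later coordinates use the already-updated values
gs_sweep <- function(A, b, x) {
  for (i in seq_along(x)) {
    x[i] <- (b[i] - sum(A[i, -i] * x[-i])) / A[i, i]
  }
  x
}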

The iteration output from solve_ols can be silenced with verbose = FALSE:

x_gs <- solve_ols(A, b, method = "Gauss-Seidel", niter = 1e2, verbose = FALSE)

To use the Jacobi method, we just change method = "Gauss-Seidel" to method = "Jacobi":

x_j <- solve_ols(A,b,method="Jacobi",niter=1e2,verbose=F)

By default, the Jacobi method uses one core, but it can be run in parallel by providing ncores=4 (or whatever number of cores suits your machine). In this application, however, the parallelization costs more in communication overhead than it saves in parallel execution, leading to a longer runtime:

x_j <- solve_ols(A,b,method="Jacobi",niter=1e2,verbose=F,ncores=4)

Now we can check whether we recovered x by looking at the mean and maximum absolute deviation over all entries. The deviations are very small. For the vignette, the parallel Jacobi call is not evaluated, so the deviations below come from the ncores=1 call.

mean(abs(x - x_gs))
max(abs(x - x_gs))
mean(abs(x - x_j))
max(abs(x - x_j))

Algorithmic Leveraging

For algorithmic leveraging, let's construct the same data as in the paper and in HW1:

library(dplyr)
set.seed(1)
n <- 500
# data as described in Figure 1 of the paper
d <- data.frame(x = rt(n, df = 6)) %>%
  mutate(y = -x + rnorm(n))
m <- lm(y ~ x, d)
plot(d$x, d$y, col = "gray")
abline(m)

We can look at the conventional, full-data model summary to see what we're shooting for:

summary(m)

To produce subsample estimates of the intercept and slope, we provide the x and y vectors, along with r, the size of the subsample. We can also specify whether we want a uniform subsample or a leverage-based subsample. The default is leverage:

lev <- algo_leverage(d$x, d$y, r = 5, method = "leverage")

The first two items returned are the estimated intercept and slope based on a subsample:

lev[c(1, 2)]

The third element holds the indices of the rows used in the subsample:

lev[3]

We can also use a uniform subsample by specifying method="uniform":

uni <- algo_leverage(d$x, d$y, r = 5, method = "uniform")
uni
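For intuition, leverage-based subsampling draws rows with probability proportional to the hat values of the design matrix. A minimal sketch of those probabilities using the full-data fit m (this is the standard definition, not necessarily the package's exact implementation, and idx is a hypothetical name):

h <- hatvalues(m)                                  # leverage scores from the full-data fit
idx <- sample(seq_len(n), size = 5, prob = h / sum(h))  # sample rows proportional to leverage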

We can plot the subsamples and the resulting estimated lines: blue is the leverage-based subsample, red is the uniform subsample, and black is the full-data estimate.

plot(d$x, d$y, col = "gray")
abline(m)  # full-data fit
points(d$x[uni[[3]]], d$y[uni[[3]]], col = "red", pch = 4)
points(d$x[lev[[3]]], d$y[lev[[3]]], col = "blue", pch = 2)
abline(a = lev[[1]], b = lev[[2]], col = "blue")
abline(a = uni[[1]], b = uni[[2]], col = "red")

Elastic net

To explore the elastic net function, we can generate some correlated data as in HW 1. We scale the design matrix X, since the implementation assumes that the columns are standardized.

set.seed(12345)
n <- 100
p <- 20
# identity covariance with correlated pairs (1,2) and (5,6)
Sig <- diag(p)
Sig[1, 2] <- Sig[2, 1] <- Sig[5, 6] <- Sig[6, 5] <- 0.8
L <- chol(Sig)  # upper-triangular factor with Sig = t(L) %*% L
beta <- c(2, 0, -2, 0, 1, 0, -1, 0, rep(0, 12))
# rows of X are draws from N(0, Sig); scale() standardizes the columns
X <- scale(t(t(L) %*% matrix(rnorm(n * p), nrow = p)))
y <- X %*% beta + rnorm(n)
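We can spot-check that the intended correlations are present in the simulated design:

round(cor(X[, 1], X[, 2]), 2)
round(cor(X[, 5], X[, 6]), 2)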

To compute the elastic net estimates, we provide y, the design matrix X, and our choices of alpha and lambda. We hope to see non-zero estimates for the first, third, fifth, and seventh coefficients, since those are non-zero in the generative model. We also hope to see non-zero estimates for the second and sixth: although they are zero in the generative model, their predictors are correlated with non-zero ones (pairs 1-2 and 5-6):

elnet_ests <- elnet_coord(y, X, alpha = 0.5, lambda = 1)
which(elnet_ests != 0)
barplot(elnet_ests)
elnet_ests
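For reference, elastic net coordinate descent typically cycles through the coordinates, applying a soft-thresholding update to each. The sketch below shows one sweep for standardized columns; it is the textbook update (soft and coord_sweep are hypothetical helpers), not necessarily what elnet_coord does internally:

# soft-thresholding operator
soft <- function(z, g) sign(z) * pmax(abs(z) - g, 0)

# one coordinate-descent sweep for the elastic net objective
# (1/(2n)) ||y - X beta||^2 + lambda (alpha ||beta||_1 + (1 - alpha)/2 ||beta||_2^2)
coord_sweep <- function(X, y, beta, alpha, lambda) {
  n <- nrow(X)
  for (j in seq_len(ncol(X))) {
    r_j <- y - X[, -j, drop = FALSE] %*% beta[-j]  # partial residual excluding j
    z <- drop(crossprod(X[, j], r_j)) / n          # univariate least-squares coefficient
    beta[j] <- soft(z, lambda * alpha) / (1 + lambda * (1 - alpha))
  }
  beta
}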

