knitr::opts_chunk$set( collapse = TRUE, comment = "#>", cache=TRUE, fig.height=6, fig.width=8 )
Here, we use the MNIST package developed by @stillmatic as sample data.
You can install it as follows:
devtools::install_github("stillmatic/MNIST")
Once stillmatic/MNIST is installed, the MNIST training data is available as MNIST::mnist_train.
For example, the following record shows the digit 8:
MNIST::show_digit(MNIST::mnist_train[770,])
The data contains 60,000 records, which is a bit too much for an ordinary SVD on a typical PC.
We therefore draw a random sample of 10,000 records here.
df <- MNIST::mnist_train[sample(seq_len(nrow(MNIST::mnist_train)), size=10^4), ]
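Because `sample()` draws rows at random, the sampled subset (and every plot derived from it) changes from run to run. A minimal sketch of a reproducible version of the same sampling step is shown below. It uses a small toy data frame (`toy`, an assumption made for illustration) instead of `MNIST::mnist_train`, and the seed value is arbitrary.

```r
# Toy stand-in for MNIST::mnist_train (hypothetical data):
# 100 rows, 4 feature columns plus a label column y
set.seed(42)  # fix the RNG so the same rows are drawn on every run
toy <- data.frame(matrix(runif(400), nrow = 100),
                  y = sample(0:9, 100, replace = TRUE))

n_sample <- 20
idx <- sample(seq_len(nrow(toy)), size = n_sample)  # sampling without replacement
toy_sub <- toy[idx, ]

dim(toy_sub)
```

Calling `set.seed()` once before the sampling line in the vignette would have the same effect on `df`.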
Plot the original data on the plane spanned by the first and second singular vectors.
# The last column is the label column y
x <- as.matrix(df[, -ncol(df)]) / 255
y <- df$y
frequentdirections::plot_svd(x, y)
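To make "the plane spanned by the first and second singular vectors" concrete, here is a rough base-R sketch of such a projection. It is an illustration only: whether `plot_svd` centers or scales the data, or computes the coordinates exactly this way, is an assumption, and `x_toy` and `y_toy` are toy stand-ins for `x` and `y`.

```r
set.seed(1)
# Toy stand-ins (hypothetical data): 200 rows, 10 columns, labels 0..9
x_toy <- matrix(runif(200 * 10), nrow = 200)
y_toy <- sample(0:9, 200, replace = TRUE)

s <- svd(x_toy)                 # x_toy = U diag(d) V^T
coords <- x_toy %*% s$v[, 1:2]  # project each row onto the first two right singular vectors

# Equivalent form: the first two left singular vectors scaled by their singular values
stopifnot(isTRUE(all.equal(coords, s$u[, 1:2] %*% diag(s$d[1:2]))))

# Scatter plot of the projected rows, colored by label
plot(coords, col = y_toy + 1, pch = 19,
     xlab = "1st singular vector", ylab = "2nd singular vector")
```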
eps <- 10^(-8)
# 10000 x ncol(x) -> 8 x ncol(x) matrix
b <- frequentdirections::sketching(x, 8, eps)
frequentdirections::plot_svd(x, y, b)

# 10000 x ncol(x) -> 32 x ncol(x) matrix
b <- frequentdirections::sketching(x, 32, eps)
frequentdirections::plot_svd(x, y, b)

# 10000 x ncol(x) -> 128 x ncol(x) matrix
b <- frequentdirections::sketching(x, 128, eps)
frequentdirections::plot_svd(x, y, b)
With 128 rows, the result is almost the same as the SVD plot of the original data.
This suggests that a sketch with only 128 rows is enough to represent the original data.
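For readers who want to see why such a small sketch can stand in for the full matrix, below is a minimal base-R sketch of the basic Frequent Directions procedure (Liberty, 2013). It is not the implementation behind `frequentdirections::sketching`: `fd_sketch` is a hypothetical helper written for illustration, it ignores the `eps` argument, and it runs on toy Gaussian data rather than MNIST. The basic procedure guarantees that the spectral norm of `t(a) %*% a - t(b) %*% b` is at most `2 * ||a||_F^2 / l`, so the dominant right singular vectors of `b` tend to stay close to those of `a`.

```r
# Hypothetical helper (not the package's function): basic Frequent Directions.
# Compresses an n x d matrix `a` into an l x d sketch `b`, with l <= d.
fd_sketch <- function(a, l) {
  d <- ncol(a)
  stopifnot(l <= d)
  b <- matrix(0, nrow = l, ncol = d)
  for (i in seq_len(nrow(a))) {
    zero_rows <- which(rowSums(b^2) == 0)
    if (length(zero_rows) == 0) {
      # Sketch is full: shrink all squared singular values by the median one,
      # which zeroes out roughly the bottom half of the rows
      s <- svd(b)
      delta <- s$d[ceiling(l / 2)]^2
      shrunk <- sqrt(pmax(s$d^2 - delta, 0))
      b <- diag(shrunk, nrow = l) %*% t(s$v)
      zero_rows <- which(rowSums(b^2) == 0)
    }
    b[zero_rows[1], ] <- a[i, ]  # insert the next data row into a free slot
  }
  b
}

set.seed(123)
a <- matrix(rnorm(500 * 20), nrow = 500)  # toy data, 500 x 20
b <- fd_sketch(a, 8)                      # 8 x 20 sketch

# Compare the covariance error with the theoretical bound
err <- norm(crossprod(a) - crossprod(b), type = "2")
bound <- 2 * norm(a, type = "F")^2 / 8
err <= bound
```

Increasing `l` tightens the bound, which is consistent with the plots above getting closer to the original as the sketch grows from 8 to 128 rows.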