knitr::opts_chunk$set( collapse = TRUE, comment = "#>" )
To use the REigenfaces package, you first need to install and (optionally) load it via:
devtools::install("path/to/REigenfaces")
library(REigenfaces)
For the remainder of this example, we will use the lfwcrop_grey training data set that can be found at https://towardsdatascience.com/eigenfaces-recovering-humans-from-ghosts-17606c328184 (21.07.2020). Feel free to use a different data set or your own training data. Note that a controlled and uniform lighting environment is important in order for the eigenface method to work properly, as will be explained in section 3.
A subset of the lfwcrop_grey data set is included in the package as images, and the computed eigenfaces as dataset (for more information, see ?images and ?dataset). To load a different training data set and determine the eigenfaces, use:
my_images <- load_pgm_images("path/to/data", pattern="(0001)|(0002)|(0003)|(0004)", max_images=1000L)
to load the images in pgm format and
my_dataset <- load_dataset(my_images, max_eigenfaces=100L)
to compute the eigenfaces. "path/to/data" is the path to the data as a character vector. pattern is an optional regular expression; only file names that match it will be loaded. In our example data set, there are multiple images of the same person, and by utilizing the pattern, at most four images of the same person are used. Alternatively, you can restrict the number of images considered for eigenface computation via max_images. The number of eigenfaces that will be determined can be adjusted via max_eigenfaces. Since all major computations are performed here, this can take quite some time, depending on the size of your data.
To display the loaded images use
show_images(images, dataset$image_size)
You can now display the 16 most important eigenfaces of the data set with:
faces <- most_important_eigenfaces(dataset, 16L)
show_images(faces, dataset$image_size)
The mean face of the training data set can be displayed via
show_images(dataset$mean, dataset$image_size)
We now want to search our training data set for nine faces that are similar to Billy Joel's face. To do so, we first need to load the image:
billy <- load_pgm_images("path/to/image", pattern="Billy_Joel")
show_images(billy, dataset$image_size)
Afterwards, we can obtain the indices of the similar faces using
print(similar_faces_indices(dataset, billy, max_count=9L))
The output comprises the names of the similar images and their indices in the dataset object, sorted by descending similarity, as well as coefficients indicating how similar the faces are. As we can see, Billy Joel is already in our data set, because he is the first name in the list, with a coefficient of 0.
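If you want to look at the matching faces themselves, the returned indices can be fed back into show_images(). The following sketch is only an illustration: it assumes the result exposes its indices in a column named index (a hypothetical name) and that images supports [ subsetting; inspect the actual return value with str() first.
matches <- similar_faces_indices(dataset, billy, max_count=9L)
str(matches)  # inspect the returned names, indices and coefficients
# Hypothetical: assumes a column named `index` and that `images` supports [ subsetting.
show_images(images[matches$index], dataset$image_size)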
To display the eigenface reconstruction of the original training data images, simply do:
faces <- reconstructed_dataset_images(dataset, 1:16)
show_images(faces, dataset$image_size)
The REigenfaces package provides an interactive Shiny application for exploring the functionality described in Section 1. To start the app, simply execute REigenfaces::runShiny().
You can select one of the 100 faces in the example dataset with the index selection in the top left. In the top right, the eigenface reconstruction of the selected face using up to 64 eigenfaces is displayed. In the bottom left, four similar faces in the data set are displayed, and the plot in the bottom right shows the contribution coefficient of each eigenface to the reconstruction.
Eigenfaces are an approach to recognizing faces using principal component analysis (PCA). The concept was introduced by Sirovich and Kirby in 1987 and adopted and refined by Turk and Pentland in 1991. Even though the technique has lost popularity in recent years, it is still widely used to visualize the concept and results of PCA and to introduce beginners to machine learning.
The basic idea is to find a low-dimensional subspace of the high-dimensional image space. Consider a set of $d$ images. Using the eigenfaces approach, it is possible to find a basis set of $k \ll d$ images, determined by the $k$ directions of maximum variance. PCA is used to find the $k$ vectors $\epsilon_{1},...,\epsilon_{k}$ spanning the low-dimensional space; Turk and Pentland called them eigenfaces. All face images of the training data set can now be approximately reconstructed as linear combinations of the eigenfaces. As a result, data and statistical complexity can be reduced significantly.
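Concretely, writing $\bar{I}$ for the mean face and $\Phi := I - \bar{I}$ for a mean-subtracted face (both defined formally in the steps below), and assuming the eigenfaces are normalised to unit length, a face is projected onto the subspace and reconstructed via
$$w_{j} := \epsilon_{j}^{T}\,\Phi, \qquad I \approx \bar{I} + \sum_{j=1}^{k}{w_{j}\,\epsilon_{j}}.$$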
Consider a training data set of $d$ face images $I_{1},...,I_{d}$ with $I_{i} \in \mathbb{R}^{m\times n}$. The eigenface generation can be summarised as follows:
Convert the face images $I_{i} \in \mathbb{R}^{m\times n}$ to face vectors $I_{i} \in \mathbb{R}^{mn}$ by concatenating the pixel rows.
Compute the mean image: $\bar{I} := \frac{1}{d} \sum_{i=1}^{d}{I_i}$.
Since we are interested in variance, normalize the face vectors by subtracting the mean face: $\Phi_i := I_i - \bar{I}$.
Let $A := [\Phi_1...\Phi_d] \in \mathbb{R}^{mn \times d}$.
Compute the covariance matrix $C = AA^T \in \mathbb{R}^{mn \times mn}$ (or $C = A^TA \in \mathbb{R}^{d \times d}$ to reduce dimensionality, since $d < mn$ in many cases; an eigenvector $v$ of $A^TA$ then yields the eigenvector $Av$ of $AA^T$ with the same eigenvalue).
Compute and sort the eigenvalues $|\lambda_1| \geq ... \geq |\lambda_d|$ and corresponding eigenvectors $\epsilon_1,...,\epsilon_d$ of the covariance matrix $C$. The first $k$ eigenvectors $\epsilon_1,...,\epsilon_k$, for a chosen $k \in \mathbb{N}$ with $k \ll d$, are the eigenfaces.
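As a concrete illustration of these steps, here is a minimal base-R sketch on random toy data. It mirrors the recipe above (including the $A^TA$ trick), but it is not the code used inside REigenfaces.
# Toy data: 20 grayscale images of size 32 x 32 with values in [0, 1].
imgs <- replicate(20, matrix(runif(32 * 32), 32, 32), simplify = FALSE)
A <- sapply(imgs, function(img) as.vector(t(img)))   # step 1: face vectors as columns (mn x d)
mean_face <- rowMeans(A)                             # step 2: mean image
A <- A - mean_face                                   # steps 3-4: centred matrix A
# Steps 5-6: eigen-decompose the small d x d matrix t(A) %*% A; an eigenvector v
# of t(A) %*% A yields the eigenvector A %*% v of A %*% t(A) with the same eigenvalue.
eig <- eigen(t(A) %*% A)
k <- 10
eigenfaces <- A %*% eig$vectors[, seq_len(k)]
eigenfaces <- apply(eigenfaces, 2, function(e) e / sqrt(sum(e^2)))  # unit-length columns
# Reconstruction of the first training face from k eigenfaces (see the formula above):
w <- t(eigenfaces) %*% A[, 1]
recon <- mean_face + eigenfaces %*% w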
Pros:
Cons: