\newtheorem{question}{Question}

source("../../setup.R")

A <- matrix(1:6, nrow = 2, ncol = 3)
B <- matrix(1:9, nrow = 3, ncol = 3, byrow = TRUE)
C <- matrix(1:3, nrow = 2, ncol = 3)
M <- matrix(c(1, 4, 2, 1), nrow = 2, ncol = 2)
park_mat <- cbind(c(62, 71, 66), c(115, 201, 119), c(4000, NA, 2000))
parks_mat <- cbind(c(62, 71, 66), c(115, 201, 119), c(4000, NA, 2000))
dimnames(parks_mat) <- list(c("Leslie", "Ron", "April"), c("Height", "Weight", "Income"))

```{js, echo=FALSE} $(function() { $('.ace_editor').each(function( index ) { ace.edit(this).setFontSize("20px"); }); })

## Learning Objectives {-}

After studying this chapter, you should be able to:

* Create matrices with `matrix()`, `rbind()`, and `cbind()`.

* Understand how R stores and performs operations on matrices.

* Add row and column names to two-dimensional objects.

* Extract and assign values to two-dimensional objects.

* Differentiate between `*` and `%*%` for matrices.

* Compute the inverse of a square matrix.


## Basic Definitions and Functions

Matrices play an important role in statistics, especially in multivariate analyses, such as multiple regression or estimating the covariance matrix for a collection of random variables (i.e., a random vector). We will not focus on linear algebra or matrix theory in this course, but we will introduce the `matrix` object in R and some important concepts related to them. Matrices provide intuition and a natural segue into working with data frames, possibly the most commonly used object in R to store rectangular data. Most of the syntax and functions that apply to matrices will also apply to data frames.

### The `matrix()` Function

A **matrix** is a two-dimensional (rectangular) array of values. In R, *every value in a matrix must be of the same type* (i.e., all numeric, character, logical). The **`matrix()`** function takes in a vector of values (called `data`) and arranges them into a matrix according to the desired number of rows (`nrow`) and number of columns (`ncol`).

```r
A <- matrix(1:6, nrow = 2, ncol = 3)
A

Note: By default, the matrix() function reads the input vector and fills in the matrix by column. For example, in matrix A, the first two elements fill in the first column, the next two elements in the second column, and the last two elements fill in the last column. To fill the matrix in by row, set the optional byrow argument to TRUE (the default value is byrow=FALSE, which means R will fill the matrix by column). Try this below to get the following output:

B <- matrix(1:9, nrow = 3, ncol = 3, byrow = TRUE)
B
B <- matrix(1:9, nrow = 3, ncol = 3)
B
B <- matrix(1:9, nrow = 3, ncol = 3, byrow = TRUE)
B

Note: The default values are nrow=1 and ncol=1. Leaving both blank will produce a matrix with a single column. If only the nrow or ncol argument is defined, the argument left blank will be automatically computed based on the length of the data vector.

Caution: If the length of the data vector is too short to fill the entire matrix, the values of data will be recycled. The way R behaves when recycling in this context is similar to the vector recycling from Chapter 2. If the vector is recycled a whole number of times, R will fill the matrix with no warning that it has recycled the data values. If the vector is not recycled completely, the matrix will still be filled, but R will also throw a warning.

matrix(1:4, nrow = 2, ncol = 6) # Complete recycling: no warning thrown

matrix(1:4, nrow = 2, ncol = 5) # Incomplete recycling: warning thrown

The Dimension of a Matrix

The dimension of a matrix (or any two-dimensional object) is the number of rows and the number of columns, often written as $\hbox{nrow} \times \hbox{ncol}$, read aloud as "nrow by ncol". For example, matrix $A$ is a $2 \times 3$ (2 by 3) matrix. A matrix is called square if the number of rows equals the number of columns.

The dim() function returns a numeric vector specifying the dimension of the input object.

dim(A) # dimension of A

dim(B) # dimension of B

Caution: The order matters in the dimension of a matrix. A $2 \times 3$ matrix has a different dimension from a $3 \times 2$ matrix. The number of rows is always listed first, and the number of columns is always listed second.

The functions nrow() and ncol() input a two-dimensional object and output the number of rows or columns, respectively. These will produce the individual entries from the dim() function.

Find the number of rows in matrix A and the number of columns in matrix B:


nrow(A)
nrow(A)
ncol(B)
nrow(A)
ncol(B)
question("How would you extract the number of rows from the output of `dim(A)`?",
         answer("Not possible"),
         answer("dim(A[1])"),
         answer("dim(A[2])"),
         answer("dim(A)[1]", correct = TRUE),
         answer("dim(A)[2]"),
         allow_retry = TRUE,
         random_answer_order = TRUE)

The cbind() and rbind() Functions

Another way to create a matrix is to use the cbind() and rbind() functions, which combine ("bind") columns and rows (respectively) into a matrix. Input each column or row values as a separate argument, as shown below:

## An alternative way to create the matrix A
cbind(c(1, 2), c(3, 4), c(5, 6))
## An alternative way to create the matrix B
rbind(c(1, 2, 3), c(4, 5, 6), c(7, 8, 9))

If the number of columns or rows match in a way that makes sense, i.e., they are conformable, then rbind() and cbind() can also input two-dimensional objects and bind them together. Trying to combine matrices that are not conformable will produce an error.

# Bind the rows of A and B together (i.e., stack A on top of B)
# The below will give an error, because the number
# of rows are not conformable
cbind(A, B)
## Bind the rows of A and B together (i.e., stack A on top of B)
rbind(A, B)

Side Note: The rbind() and cbind() functions also use recycling when necessary. This is particularly useful when inserting a column or row of repeated values. For an application to statistics: In linear regression (especially multiple regression), the observed values of the predictor variable(s) are often organized into a matrix, called the design matrix. The design matrix typically has a column of 1's to account for the intercept term in the linear model. An example of appending a column of 1's is given below.

cbind() # Append a column of 1's to matrix B

Matrices Are Two-Dimensional Vectors

The reason why every value in a matrix must be of the same type is because all matrix objects are actually internally stored as vectors in R. This can be verified by using the mode() function on a matrix.

## Find out how matrix objects (i.e. matrix A) are stored in R with the above function

Matrices are just vectors with an additional attribute called dimension (dim). Vectors have no dimension attribute (i.e., vectors are not just one-dimensional matrices).

The attributes() function allows us to see the attributes of an R object.

attributes(A)
attributes(1:6)

We could strip the A matrix of its dim attribute by assigning NULL to its attributes(A) object. The matrix object will revert back to a vector.

 # Remove all of A's attributes
A # A is now a vector
attributes(A) <- NULL # Remove all of A's attributes
A # A is now a vector
attributes(A) <- NULL # Remove all of A's attributes
A # A is now a vector

We can also give a vector the dim attribute in a similar way to convert a vector into a matrix.

## Assign the dim attribute to A (with 2 rows and 3 columns)
attributes(A) <- list(dim = )
A # A is a matrix again
## Assign the dim attribute to A (with 2 rows and 3 columns)
attributes(A) <- list(dim = c(2, 3))
A # A is a matrix again
## Assign the dim attribute to A (with 2 rows and 3 columns)
attributes(A) <- list(dim = c(2, 3))
A # A is a matrix again

Naming Rows and Columns of Two-Dimensional Objects

Suppose we have data on a few employees at the Pawnee Parks and Recreation Department, shown in a data table below.

\begin{table}[htbp!] \centering \begin{tabular}{cccc} \hline Name & Height (inches) & Weight (pounds) & Income (\$/month) \ \hline Leslie & 62 & 115 & 4000 \ Ron & 71 & 201 & (Redacted) \ April & 66 & 119 & 2000 \ \hline \end{tabular} \end{table}

We can input the numeric data into a matrix in R.

parks_mat <- cbind(c(62, 71, 66), c(115, 201, 119), c(4000, NA, 2000))
parks_mat # Make sure the data was entered correctly
question("How else can we make the parks_mat matrix? Choose all that apply.",
         answer("rbind(c(62, 71, 66), c(115, 201, 119), c(4000, NA, 2000))"),
         answer("rbind(c(62, 115, 4000), c(71, 201, NA), c(66, 119, 2000))",
                correct = TRUE),
         answer("matrix(62, 71, 66, 115, 201, 119, 4000, NA, 2000, byrow = FALSE)"),
         answer("matrix(62, 71, 66, 115, 201, 119, 4000, NA, 2000, nrow = 3)",
                correct = TRUE),
         random_answer_order = TRUE,
         allow_retry = TRUE)

While the numeric values from the dataset have been entered, the names of the employees and the names of the variables have been lost. To add names to the rows and columns of a two-dimensional object, we can use the rownames() and colnames() functions.

Applying the rownames() and colnames() functions to a two-dimensional object will retrieve (get), respectively, the current row and column names. If there are no names, the functions will output NULL.

See what the row and column names of park_mat are:


rownames(park_mat)
colnames(park_mat)

To set the names, we can create a character vector of names and use the assignment <- operator on the row and column names.

rownames(parks_mat) <- c("Leslie", "Ron", "April")
colnames(parks_mat) <- c("Height", "Weight", "Income")
parks_mat

Technically, these functions access the row and column name attributes of the input object. Setting names will add the dimnames attribute to the object.

## Check the attributes of parks_mat

The attributes(parks_mat) object is an example of a list object in R. We will go over the syntax and functions for lists in a later chapter. For now, notice that the dimnames attribute has two components. The first component is the vector of row names, and the second component is the vector of column names.

Side Note: The dimnames() function can get and set both the row and column name attributes at once. The assignment input using dimnames() needs to be a list with two vector components.

dimnames(parks_mat) <- list(NULL, NULL) # Remove the row and column names
parks_mat
## Add the same names as before
dimnames(parks_mat) <- list(c("Leslie", "Ron", "April"), c("Height", "Weight", "Income"))
dimnames(parks_mat)

Side Note: There are a few other ways to add names to rows and columns. The matrix() function has an optional dimnames argument that allows us to add names directly when creating a matrix object. The syntax is the same as the dimnames() function.

matrix(1:9, nrow = 3, ncol = 3, dimnames = list(c("a", "b", "c"), c("A", "B", "C")))

The rbind() and cbind() functions allow us to name, respectively, each row or column by just typing the name of the row or column in quotation marks, as shown below. Setting names for the unnamed dimension would still need to be done separately.

rbind("a" = 1:3, "b" = 4:6, "c" = 7:9)
cbind("A" = 1:3, "B" = 4:6, "C" = 7:9)

Extracting Data From Two-Dimensional Objects

Recall that square brackets are used to extract specific parts of data from objects in R. Vectors are one-dimensional objects, so a single number inside of square brackets extracts that element from a vector.

For two-dimensional objects, there are two index coordinates, corresponding to the row index and the column index. The index inside the square brackets needs to be an ordered pair, separated by a comma. For example, an index of [i,j] means to extract the entry in the ith row and jth column, also called the $(i,j)$th entry.

Numeric Indices

Leaving one value blank means to extract all the values in that dimension. Positive, negative, and fractional indices work the same way as they do for vectors. The row and column indices are independent of each other, so a positive index for one can be mixed with a negative index for the other.

B # Notice the row and column indices in the output
B[2, 1] # Extract the (2,1) element
B[2, ] # Extract the second row
B[, 3] # Extract the third column
B[, -2] # Remove the second column
B[-1, c(2, 3)] # Remove the first row and extract the second and third columns

Note: Notice that when the resulting output is one-dimensional (i.e., a single row or a single column), the output object is a vector, not a one-dimensional matrix.

Caution: Using a single index [] instead of an index pair [,] will not throw a warning or an error for matrix objects. Since a matrix is stored as one long (column) vector, R will interpret the single index as a vector index and extract the values from the stored vector version of the matrix. The matrix values are stored in column-major storage order, meaning the values are stored in order from top to bottom down the first column, then top to bottom down the second column, etc. Using a single index for matrices is not recommended, since it can be hard to keep track of which entry in a two-dimensional array corresponds to a single index.

B[8] # Do not extract elements of matrices this way
A
question("Although not how we should index elements, what does A[4] return? (A is the above matrix)",
         answer("4", correct = TRUE),
         answer("It throws an error"),
         answer("2"),
         random_answer_order = TRUE,
         allow_retry = TRUE)

Logical Indices

Logical vectors can also be used for subsetting two-dimensional objects. The logical indices work the same way as they do for vectors. This can be a useful way to extract rows or columns that satisfy certain conditions.

## Which heights are above 65 inches?
tall_index <- parks_mat[, 1] > 65
## Extract only the rows/observations for people who are taller than 65 inches
parks_mat[tall_index, ]

Question: How can we use the tall_index vector to extract only the rows/observations for people in the data who are at most 65 inches tall?

tall_index <- parks_mat[, 1] > 65
tall_index <- parks_mat[, 1] > 65
parks_mat[!tall_index, ]
tall_index <- parks_mat[, 1] > 65
parks_mat[!tall_index, ]

Named (Character) Indices

If the rows or columns of a two-dimensional object are named, we can use the name as an index. Row names can only be used as a row index, and column names can only be used as a column index.

parks_mat["Leslie", ] # Extract the row of data for Leslie
parks_mat[, "Income"] # Extract the column of data for Income
parks_mat["Ron", "Height"] # Extract the height of Ron

\newpage

parks_mat[] # Extract the weight of Leslie and April
parks_mat[c("Leslie", "April"), "Weight"] # Extract the weight of Leslie and April

Note: Notice that we do not need to know the numeric index for the observations or variables. Using names as indices can be useful if the object has many rows or columns and the numeric index is not as easy to use.

Matrix Operations

Entrywise Arithmetic Operations

Because matrices are stored as vectors, the usual arithmetic operations (+, -, *, /, ^, %%, %/%) that apply to numeric vectors will also work for numeric matrices. Just like for vectors, arithmetic operations will be applied entrywise. However, for arithmetic operations between matrices to make sense, the matrices need to be conformable. We can only apply the usual arithmetic operators to matrices with the same dimensions (same number of rows and columns). The corresponding entries of the two matrices will be operated together.

A + 10 # Add 10 to every entry in A

A^2 # Square each entry in A

C <- matrix(1:3, nrow = 2, ncol = 3) # Construct a matrix C
C

A + C # Add A and C
A^C # Exponentiate A by C
A * C # Multiply A and C (entrywise)

If we try to apply the operators to matrices that are not conformable, R will throw an error.

A * B # A and B are not conformable

Matrix Multiplication

Caution: The entrywise multiplication performed using the regular multiplication * operator is not the way matrices are multiplied together in linear algebra.

For matrix multiplication, two matrices are conformable if the number of columns in the left matrix has to be the same as the number of rows in the right matrix. If $A$ is an $m \times n$ matrix and $B$ is an $n \times p$ matrix, then $A$ and $B$ are conformable, and $AB$ will be an $m \times p$ matrix.

For example, let $A$ be a $2 \times 3$ matrix and $B$ be a $3 \times 2$ matrix, denoted by $$ A = \begin{bmatrix} a_{11} & a_{12} & a_{13} \ a_{21} & a_{22} & a_{23} \end{bmatrix} \hspace{0.1in} \hbox{and} \hspace{0.1in} B = \begin{bmatrix} b_{11} & b_{12} \ b_{21} & b_{22} \ b_{31} & b_{32} \end{bmatrix}. $$ The matrix product $AB$ is the $2 \times 2$ matrix given by $$ AB = \begin{bmatrix} a_{11} b_{11} + a_{12} b_{21} + a_{13} b_{31} & a_{11} b_{12} + a_{12} b_{22} + a_{13} b_{32} \ a_{21} b_{11} + a_{22} b_{21} + a_{23} b_{31} & a_{21} b_{12} + a_{22} b_{22} + a_{23} b_{32} \ \end{bmatrix}. $$ Written generally, if $A = \left[a_{ik}\right]{m \times n}$ and $B = \left[b{kj}\right]{n \times p}$, then $\displaystyle AB = \left[\sum{k=1}^n a_{ik}b_{kj} \right]_{m \times p}$. The $(i,j)$th entry of the product $AB$ is the dot product of the $i$th row of $A$ with the $j$th column of $B$.

Matrix multiplication is performed using the %*% operator.

A %*% C # Non-conformable error
A %*% B # Matrix multiplication of A and B

Notice that matrix A and C are conformable for entrywise multiplication but are not conformable for matrix multiplication, while A and B are conformable for matrix multiplication but not for entrywise.

question("Which of the following operations will NOT return an error? Choose all that apply.",
         answer("A + C", correct = TRUE),
         answer("A + B"),
         answer("A *%* B"),
         answer("A %*% C"),
         answer("A ^ C", correct = TRUE),
         answer("A ^ B"),
         allow_retry = TRUE,
         random_answer_order = TRUE)

The Identity Matrix

The identity matrix of size $n$, denoted by $I_n$ (or simply $I$ if the dimension is implicit) is an $n \times n$ square matrix with 1's in the main diagonal entries and 0 everywhere else. Formally, the $(i,j)$th entry of $I_n$ is 1 if $i = j$ and $0$ if $i \neq j$, written mathematically as $$ \begin{bmatrix}I_n\end{bmatrix}_{ij} = \begin{cases} 1 & \hbox{if $i = j$} \ 0 & \hbox{if $i \neq j$}. \end{cases} $$ For example: $$I_1 = \begin{bmatrix} 1 \end{bmatrix}, \hspace{0.1in} I_2 = \begin{bmatrix} 1 & 0 \ 0 & 1 \end{bmatrix}, \hspace{0.1in} I_3 = \begin{bmatrix} 1 & 0 & 0 \ 0 & 1 & 0 \ 0 & 0 & 1 \end{bmatrix}, \ldots$$ The identity matrix essentially plays the role of the number 1 for matrix multiplication. That is, for any $n \times n$ matrix $A$, a property of matrix multiplication is that $A I_n = I_n A = A$.

The diag() function has two main functionalities. By inputting a number, the diag() function will generate an identity matrix of that size.

diag(4) # Create a 4x4 identity matrix

By inputting a vector, the diag() function will generate a diagonal matrix (the only nonzero entries are along the diagonal) with the vector values along the diagonal.

diag(c(1, 2, 3)) # Create a diagonal matrix with 1, 2, 3 along the diagonal

The nrow and ncol arguments can also be used to specify the dimensions for a rectangular (non-square) matrix.

diag(c(1, 2, 3), nrow = 3, ncol = 4)

The Inverse of a Matrix

The inverse of an $n \times n$ square matrix $A$ is the unique $n \times n$ matrix, denoted by $A^{-1}$, such that $$A A^{-1} = A^{-1} A = I_n.$$ The inverse of a matrix is analogous to the reciprocal (also called the multiplicative inverse) for numbers. For example, the reciprocal of $2$ is $2^{-1} = \frac{1}{2}$, since $2 \times \frac{1}{2} = \frac{1}{2} \times 2 = 1$.

The function solve() computes the inverse of the inputted matrix. For example:

M <- matrix(c(1, 4, 2, 1), nrow = 2, ncol = 2) # Create a matrix M
M
M_inv <- solve(M) # Compute the inverse of M with the solve() function
M_inv

## Verify that M_inv is the inverse of M (BOTH WAYS)
M <- matrix(c(1, 4, 2, 1), nrow = 2, ncol = 2) # Create a matrix M
M
M_inv <- solve(M) # Compute the inverse of M with the solve() function
M_inv

## Verify that M_inv is the inverse of M (BOTH WAYS)
M %*% M_inv
M_inv %*% M

Caution: The inverse of a (square) matrix does not always exist. A matrix that has an inverse is called invertible or nonsingular. There are conditions in linear algebra for when an inverse exists. A matrix that does not have an inverse is called singular. If we try to invert a singular matrix, R will throw an error.

solve(B)

## Error in solve.default(B): system is computationally singular: reciprocal condition\ ## number = 2.59052e-18

Operations on Matrix Columns and Rows

The apply() Function

Suppose we want to compute the mean of each variable in the parks_mat matrix.

parks_mat

We could compute these individually, but it would require repetitive code (or a for loop).

mean(parks_mat[, "Height"], na.rm = TRUE) # Or mean(parks_mat[, 1], na.rm = TRUE)
mean(parks_mat[, "Weight"], na.rm = TRUE) # Or mean(parks_mat[, 2], na.rm = TRUE)
mean(parks_mat[, "Income"], na.rm = TRUE) # Or mean(parks_mat[, 3], na.rm = TRUE)

For large matrices (or other data objects you will see later), using repetitive code is inefficient and cumbersome.

The apply() function is used to apply a function to the rows or columns (the margins) of matrices, arrays (higher dimension matrices), and data frames (which you will see soon).

Similar to vapply(), the syntax of apply() is apply(X, MARGIN, FUN, ...), where the arguments are:

Using apply(), we can apply the mean() function to each column in parks_mat simultaneously with a single command.

## Compute the mean of every column of the parks_mat matrix
apply(parks_mat, , ,na.rm = TRUE)
## Compute the mean of every column of the parks_mat matrix
apply(parks_mat, 2, , na.rm = TRUE)
## Compute the mean of every column of the parks_mat matrix
apply(parks_mat, 2, mean, na.rm = TRUE)

To compute the mean of each row, we can change the margin argument MARGIN from 2 (columns) to 1 (rows).

## Compute the mean of every row of the parks_mat matrix
apply(parks_mat, , , na.rm = TRUE)
## Think about what we did for column means. What do we need to change?
## Compute the mean of every row of the parks_mat matrix
apply(parks_mat, 1, mean, na.rm = TRUE)
## Compute the mean of every row of the parks_mat matrix
apply(parks_mat, 1, mean, na.rm = TRUE)

Side Note: Technically, the mean of every row or column in a matrix (or data frame) can also be computed using the rowMeans() and colMeans() functions, but apply() works more generally, since apply() allows us to apply any function, not just mean().

Note: If the applied function in the FUN argument of apply() returns a single value, the output of the apply() function will be a vector. If the applied function returns a vector (with more than one element), then the output of apply() will be a matrix.

## Compute the range (min and max) of every column of the parks_mat matrix
apply(parks_mat, , , na.rm = TRUE)
## Compute the range (min and max) of every column of the parks_mat matrix
apply(parks_mat, 2, range, na.rm = TRUE)

Note: The FUN argument does not have to be a built-in function. We can also write our own function and apply it to each row or column.

As an example, suppose we want to compute the squared deviations from the mean for each variable in the parks_mat matrix.

squared_devs <- function(x, na.rm = FALSE) {
  # This function inputs a vector and computes the squared deviations away from the mean.
  (x - mean(x, na.rm = na.rm))^2
}
## Apply the squared_devs() function to every column of the parks_mat matrix
apply(parks_mat, 2, squared_devs, na.rm = TRUE)

The function can also be written directly into the FUN argument without having to save it as a separate object.

## Creates the same object as apply(parks_mat, 2, squared_devs, na.rm = TRUE)
apply(parks_mat, 2, function(x) {
  (x - mean(x, na.rm = TRUE))^2
})

Final Quiz

matrix(1:10, nrow = 3, byrow = TRUE)
question("Which of the following lines of code will construct the above matrix? Choose all applicable options.",
         answer("matrix(1:10, nrow = 3, byrow = TRUE)", correct = TRUE),
         answer("matrix(1:10, nrow = 3)"),
         answer("matrix(c(1:10, 1, 2), ncol = 4, byrow = TRUE)", correct = TRUE),
         answer("c(1:10, 1, 2, attributes = matrix)"),
         random_answer_order = TRUE,
         allow_retry = TRUE)
question("What is the mode of the output of apply(parks_mat, 2, range, na.rm = TRUE)?",
         answer("numeric", correct = TRUE),
         answer("matrix"),
         answer("data.frame"),
         answer("vector"),
         answer("character"),
         random_answer_order = TRUE,
         allow_retry = TRUE)
E <- matrix(1:8, nrow = 2, dimnames = list(c("A", "B"), c("a", "b", "c", "d")))
E
question("The above matrix is matrix E. What is the output of the code E['a', 'B']?",
         answer("An error", correct = TRUE),
         answer("2"),
         answer("3"),
         answer("1"),
         answer("4"),
         random_answer_order = TRUE,
         allow_retry = TRUE)
question("What code would produce the same result as diag(1:4, nrow = 3)? Select all that apply.",
         answer("diag(1:3)", correct = TRUE),
         answer("matrix(1:3, nrow = 3)"),
         answer("matrix(c(1, 0, 0, 0, 2, 0, 0, 0, 3), nrow = 3, byrow = TRUE)",
                correct = TRUE),
         answer("diag(4)"),
         answer("matrix(c(1, rep(0, 4), 2, rep(0, 4), 3, rep(0, 4), 4), nrow = 3)"),
         answer("An error. 1:4 is has length 4, but nrow is only 3"),
         random_answer_order = TRUE,
         allow_retry = TRUE)


elmstedt/UCLAstats20 documentation built on Oct. 24, 2020, 8:55 p.m.