This appendix houses many of the standard definitions and theorems used at some point during the narrative. It is intended for a reader who forgets the precise definition of something and would like a quick reminder of an exact statement. No proofs are given, and the interested reader should consult a good text on Calculus (say, Stewart [@Stewart2008] or Apostol [@Apostol1967], [@ApostolI1967]), Linear Algebra (say, Strang [@Strang1988] and Magnus [@Magnus1999]), Real Analysis (say, Folland [@Folland1999], or Carothers [@Carothers2000]), or Measure Theory (Billingsley [@Billingsley1995], Ash [@Ash2000], Resnick [@Resnick1999]) for details.
We denote sets by capital letters, \(A\), \(B\), \(C\), etc. The letter \(S\) is reserved for the sample space, also known as the universe or universal set, the set which contains all possible elements. The symbol \(\emptyset\) represents the empty set, the set with no elements.
Given subsets \(A\) and \(B\), we may manipulate them in an algebraic fashion. To this end, we have four set operations at our disposal: union, intersection, difference, and complement. Below is a table summarizing the pertinent information about these operations.
(ref:tab-set-operations)
Table: Set operations.
| Name         | Denoted            | Defined by elements       | \(\mathsf{R}\) syntax |
|--------------|--------------------|---------------------------|-----------------------|
| Union        | \(A\cup B\)        | in \(A\) or \(B\) or both | `union(A, B)`         |
| Intersection | \(A\cap B\)        | in both \(A\) and \(B\)   | `intersect(A, B)`     |
| Difference   | \(A\backslash B\)  | in \(A\) but not in \(B\) | `setdiff(A, B)`       |
| Complement   | \(A^{c}\)          | in \(S\) but not in \(A\) | `setdiff(S, A)`       |
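As a quick illustration (a minimal sketch; the sets `A`, `B`, and `S` below are hypothetical), base \(\mathsf{R}\) performs these operations on vectors:

```r
A <- c(1, 2, 3, 4)
B <- c(3, 4, 5, 6)
S <- 1:10                # a sample space containing both sets

union(A, B)              # 1 2 3 4 5 6
intersect(A, B)          # 3 4
setdiff(A, B)            # 1 2
setdiff(S, A)            # complement of A relative to S: 5 6 7 8 9 10
```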
A function \(f\) of one variable is said to be one-to-one if no two distinct \(x\) values are mapped to the same \(y=f(x)\) value. To show that a function is one-to-one we may either use the horizontal line test or start with the equation \(f(x_{1}) = f(x_{2})\) and use algebra to show that it implies \(x_{1} = x_{2}\).
Let \(f\) be a function defined on some open interval that contains the number \(a\), except possibly at \(a\) itself. Then we say the *limit of* \(f(x)\) *as* \(x\) *approaches* \(a\) *is* \(L\), and we write \begin{equation} \lim_{x \to a}f(x) = L, \end{equation} if for every \(\epsilon > 0\) there exists a number \(\delta > 0\) such that \(0 < |x-a| < \delta\) implies \(|f(x) - L| < \epsilon\).
\bigskip
A function \(f\) is *continuous at a number* \(a\) if \begin{equation} \lim_{x \to a} f(x) = f(a). \end{equation} The function \(f\) is *right-continuous at the number* \(a\) if \(\lim_{x\to a^{+}}f(x)=f(a)\), and *left-continuous* at \(a\) if \(\lim_{x\to a^{-}}f(x)=f(a)\). Finally, the function \(f\) is *continuous on an interval* \(I\) if it is continuous at every number in the interval.
The *derivative of a function* \(f\) *at a number* \(a\), denoted by \(f'(a)\), is \begin{equation} f'(a)=\lim_{h\to0}\frac{f(a+h)-f(a)}{h}, \end{equation} provided this limit exists. A function is *differentiable at* \(a\) if \(f'(a)\) exists. It is *differentiable on an open interval* \((a,b)\) if it is differentiable at every number in the interval.
In the table that follows, \(f\) and \(g\) are differentiable functions and \(c\) is a constant.
(ref:tab-differentiation-rules)
Table: Differentiation rules.
|   |   |   |
|---|---|---|
| \(\frac{\mathrm{d}}{\mathrm{d} x}c=0\) | \(\frac{\mathrm{d}}{\mathrm{d} x}x^{n}=nx^{n-1}\) | \((cf)'=cf'\) |
| \((f\pm g)'=f'\pm g'\) | \((fg)'=f'g+fg'\) | \(\left(\frac{f}{g}\right)'=\frac{f'g-fg'}{g^{2}}\) |
\bigskip
```{theorem, name="Chain Rule"}
If \(f\) and \(g\) are both differentiable and \(F=f\circ g\) is the composite function defined by \(F(x)=f[g(x)]\), then \(F\) is differentiable and \(F'(x) = f'[ g(x) ] \cdot g'(x)\).
```
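Base \(\mathsf{R}\) can differentiate simple expressions symbolically with `D()`; here is a small sketch applying it to the (hypothetical) composite \(F(x)=\sin(x^{2})\), where `D()` carries out the chain rule automatically:

```r
# Symbolic derivative of the composite sin(x^2); D() applies the chain rule
D(expression(sin(x^2)), "x")
# cos(x^2) * (2 * x)
```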
#### Useful Derivatives

(ref:tab-useful-derivatives)

Table: Some derivatives.

|   |   |   |
|---|---|---|
| \(\frac{\mathrm{d}}{\mathrm{d} x}\mathrm{e}^{x}=\mathrm{e}^{x}\) | \(\frac{\mathrm{d}}{\mathrm{d} x}\ln x=x^{-1}\) | \(\frac{\mathrm{d}}{\mathrm{d} x}\sin x=\cos x\) |
| \(\frac{\mathrm{d}}{\mathrm{d} x}\cos x=-\sin x\) | \(\frac{\mathrm{d}}{\mathrm{d} x}\tan x=\sec^{2}x\) | \(\frac{\mathrm{d}}{\mathrm{d} x}\tan^{-1}x=(1+x^{2})^{-1}\) |

### Optimization

```{definition}
A *critical number* of the function \(f\) is a value \(x^{\ast}\) for which \(f'(x^{\ast})=0\) or for which \(f'(x^{\ast})\) does not exist.
```
\bigskip
```{theorem, label="first-derivative-test", name="First Derivative Test"}
If \(f\) is differentiable and if \(x^{\ast}\) is a critical number of \(f\) and if \(f'(x)\geq0\) for \(x\leq x^{\ast}\) and \(f'(x)\leq0\) for \(x\geq x^{\ast}\), then \(x^{\ast}\) is a local maximum of \(f\). If \(f'(x)\leq0\) for \(x\leq x^{\ast}\) and \(f'(x)\geq0\) for \(x\geq x^{\ast}\), then \(x^{\ast}\) is a local minimum of \(f\).
```
\bigskip

```{theorem, name="Second Derivative Test"}
If \(f\) is twice differentiable and if \(x^{\ast}\) is a critical number of \(f\), then \(x^{\ast}\) is a local maximum of \(f\) if \(f''(x^{\ast})<0\) and \(x^{\ast}\) is a local minimum of \(f\) if \(f''(x^{\ast})>0\).
```
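In practice one often locates extrema numerically; here is a sketch using base \(\mathsf{R}\)'s `optimize()` on the hypothetical function \(f(x)=x\mathrm{e}^{-x}\), whose only critical number is \(x^{\ast}=1\):

```r
f <- function(x) x * exp(-x)     # f'(x) = (1 - x)e^(-x), critical number x = 1
optimize(f, interval = c(0, 5), maximum = TRUE)
# $maximum is approximately 1, $objective approximately exp(-1)
```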
As it turns out, there are all sorts of things called "integrals", each defined in its own idiosyncratic way. There are Riemann integrals, Lebesgue integrals, variants of these called Stieltjes integrals, Daniell integrals, Ito integrals, and the list continues. Given that this is an introductory book, we will use the Riemann integral, with the caveat that the Riemann integral is not the integral that will be used in more advanced study.
\bigskip
Let \(f\) be defined on \([a,b]\), a closed interval of the real line. For each \(n\), divide \([a,b]\) into subintervals \([x_{i},x_{i+1}]\), \(i=0,1,\ldots,n-1\), of length \(\Delta x_{i}=(b-a)/n\) where \(x_{0}=a\) and \(x_{n}=b\), and let \(x_{i}^{\ast}\) be any points chosen from the respective subintervals. Then the *definite integral* of \(f\) from \(a\) to \(b\) is defined by \begin{equation} \int_{a}^{b}f(x)\,\mathrm{d} x=\lim_{n\to\infty}\sum_{i=0}^{n-1}f(x_{i}^{\ast})\,\Delta x_{i}, \end{equation} provided the limit exists, and in that case, we say that \(f\) is *integrable* from \(a\) to \(b\).
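The defining limit may be approximated directly; below is a minimal sketch of a left-endpoint Riemann sum in \(\mathsf{R}\) (the choice \(f(x)=x^{2}\) on \([0,1]\) is hypothetical, picked because the exact answer \(1/3\) is known):

```r
riemann <- function(f, a, b, n) {
  x <- a + (0:(n - 1)) * (b - a) / n   # left endpoints x_i*
  sum(f(x)) * (b - a) / n              # sum of f(x_i*) * Delta x
}
riemann(function(x) x^2, 0, 1, 1e5)    # approximately 1/3
```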
\bigskip
```{theorem, name="The Fundamental Theorem of Calculus"}
Suppose \(f\) is continuous on \([a,b]\). Then

1. the function \(g\) defined by \(g(x)=\int_{a}^{x}f(t)\:\mathrm{d} t\), \(a\leq x\leq b\), is continuous on \([a,b]\) and differentiable on \((a,b)\) with \(g'(x)=f(x)\).
2. \(\int_{a}^{b}f(x)\,\mathrm{d} x=F(b)-F(a)\), where \(F\) is any antiderivative of \(f\), that is, any function \(F\) satisfying \(F'=f\).
```
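A quick numerical check of part 2 using base \(\mathsf{R}\)'s `integrate()`, with the hypothetical choice \(f(x)=\cos x\) and antiderivative \(F(x)=\sin x\):

```r
integrate(cos, 0, pi/2)$value   # approximately 1
sin(pi/2) - sin(0)              # F(b) - F(a) = 1
```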
#### Change of Variables

```{theorem}
If \(g\) is a differentiable function on \([a,b]\) whose derivative \(g'\) is continuous there, and if \(f\) is continuous on the range of \(u = g(x)\), then \begin{equation} \int_{g(a)}^{g(b)}f(u)\:\mathrm{d} u=\int_{a}^{b}f[g(x)]\: g'(x)\:\mathrm{d} x. \end{equation}
```
(ref:tab-useful-integrals)
Table: Some integrals (constants of integration omitted).
|   |   |   |
|---|---|---|
| \(\int x^{n}\,\mathrm{d} x=x^{n+1}/(n+1),\ n \neq -1\) | \(\int\mathrm{e}^{x}\,\mathrm{d} x=\mathrm{e}^{x}\) | \(\int x^{-1}\,\mathrm{d} x=\ln\lvert x\rvert\) |
| \(\int\tan x\:\mathrm{d} x=\ln\lvert\sec x\rvert\) | \(\int a^{x}\,\mathrm{d} x=a^{x}/\ln a\) | \(\int(x^{2}+1)^{-1}\,\mathrm{d} x=\tan^{-1}x\) |
#### Integration by Parts

\begin{equation} \int u\:\mathrm{d} v=uv-\int v\:\mathrm{d} u \end{equation}
\bigskip
```{theorem, name="L'Hôpital's Rule"}
Suppose \(f\) and \(g\) are differentiable and \(g'(x)\neq0\) near \(a\), except possibly at \(a\). Suppose that the limit \begin{equation} \lim_{x\to a}\frac{f(x)}{g(x)} \end{equation} is an indeterminate form of type \(\frac{0}{0}\) or \(\infty/\infty\). Then \begin{equation} \lim_{x\to a}\frac{f(x)}{g(x)}=\lim_{x\to a}\frac{f'(x)}{g'(x)}, \end{equation} provided the limit on the right-hand side exists or is infinite.
```
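A numerical sanity check (the values below are chosen only for illustration) on the classic \(0/0\) form \(\sin x/x\) as \(x\to0\):

```r
x <- 10^(-(1:6))   # values approaching 0
sin(x) / x         # approaches 1
cos(x) / 1         # f'(x)/g'(x) also approaches 1
```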
#### Improper Integrals

If \(\int_{a}^{t}f(x)\,\mathrm{d} x\) exists for every number \(t\geq a\), then we define \begin{equation} \int_{a}^{\infty}f(x)\,\mathrm{d} x=\lim_{t\to\infty}\int_{a}^{t}f(x)\,\mathrm{d} x, \end{equation} provided this limit exists as a finite number, and in that case we say that \(\int_{a}^{\infty}f(x)\,\mathrm{d} x\) is *convergent*. Otherwise, we say that the improper integral is *divergent*.

If \(\int_{t}^{b}f(x)\,\mathrm{d} x\) exists for every number \(t\leq b\), then we define \begin{equation} \int_{-\infty}^{b}f(x)\,\mathrm{d} x=\lim_{t\to-\infty}\int_{t}^{b}f(x)\,\mathrm{d} x, \end{equation} provided this limit exists as a finite number, and in that case we say that \(\int_{-\infty}^{b}f(x)\,\mathrm{d} x\) is *convergent*. Otherwise, we say that the improper integral is *divergent*.

If both \(\int_{a}^{\infty}f(x)\,\mathrm{d} x\) and \(\int_{-\infty}^{a}f(x)\,\mathrm{d} x\) are convergent, then we define \begin{equation} \int_{-\infty}^{\infty}f(x)\,\mathrm{d} x=\int_{-\infty}^{a}f(x)\,\mathrm{d} x+\int_{a}^{\infty}f(x)\,\mathrm{d} x, \end{equation} and we say that \(\int_{-\infty}^{\infty}f(x)\,\mathrm{d} x\) is *convergent*. Otherwise, we say that the improper integral is *divergent*.

## Sequences and Series {#sec-sequences-and-series}

A *sequence* is an ordered list of numbers, \(a_{1}, a_{2}, a_{3}, \ldots, a_{n} = \left(a_{k}\right)_{k=1}^{n}\). A sequence may be finite or infinite. In the latter case we write \(a_{1}, a_{2}, a_{3}, \ldots =\left(a_{k}\right)_{k=1}^{\infty}\). We say that *the infinite sequence* \(\left(a_{k}\right)_{k=1}^{\infty}\) *converges to the finite limit* \(L\), and we write \begin{equation} \lim_{k\to\infty}a_{k} = L, \end{equation} if for every \(\epsilon > 0\) there exists an integer \(N \geq 1\) such that \(|a_{k} - L| < \epsilon\) for all \(k \geq N\). We say that *the infinite sequence* \(\left(a_{k}\right)_{k=1}^{\infty}\) *diverges to* \(+\infty\) (or \(-\infty\)) if for every \(M\geq0\) there exists an integer \(N\geq1\) such that \(a_{k} \geq M\) for all \(k \geq N\) (or \(a_{k} \leq -M\) for all \(k \geq N\)).

### Finite Series

\begin{equation} \label{eq-gauss-series} \sum_{k=1}^{n}k=1+2+\cdots+n=\frac{n(n+1)}{2} \end{equation}

\begin{equation} \label{eq-gauss-series-sq} \sum_{k=1}^{n}k^{2}=1^{2}+2^{2}+\cdots+n^{2}=\frac{n(n+1)(2n+1)}{6} \end{equation}

#### The Binomial Series

\begin{equation} \label{eq-binom-series} \sum_{k=0}^{n}{n \choose k}\, a^{n-k}b^{k}=(a+b)^{n} \end{equation}

### Infinite Series

Given an infinite sequence of numbers \(a_{1}, a_{2},a_{3},\ldots =\left(a_{k}\right)_{k=1}^{\infty}\), let \(s_{n}\) denote the *partial sum* of the first \(n\) terms: \begin{equation} s_{n}=\sum_{k=1}^{n}a_{k}=a_{1}+a_{2}+\cdots+a_{n}. \end{equation} If the sequence \(\left(s_{n}\right)_{n=1}^{\infty}\) converges to a finite number \(S\) then we say that the infinite series \(\sum_{k}a_{k}\) is *convergent* and write \begin{equation} \sum_{k=1}^{\infty}a_{k}=S. \end{equation} Otherwise we say the infinite series is *divergent*.

### Rules for Series

Let \(\left(a_{k}\right)_{k=1}^{\infty}\) and \(\left(b_{k}\right)_{k=1}^{\infty}\) be infinite sequences and let \(c\) be a constant. \begin{equation} \sum_{k=1}^{\infty}ca_{k}=c\sum_{k=1}^{\infty}a_{k} \end{equation} \begin{equation} \sum_{k=1}^{\infty}(a_{k}\pm b_{k})=\sum_{k=1}^{\infty}a_{k}\pm\sum_{k=1}^{\infty}b_{k} \end{equation} In both of the above, the series on the left is convergent if the series on the right is (are) convergent.
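As a numerical illustration (the particular series is chosen only for familiarity), the partial sums of the convergent series \(\sum_{k=1}^{\infty}1/k^{2}=\pi^{2}/6\) can be inspected in \(\mathsf{R}\):

```r
n <- 1e6
s_n <- sum(1 / (1:n)^2)   # partial sum of the first n terms
c(s_n, pi^2 / 6)          # both approximately 1.644934
```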
#### The Geometric Series

\begin{equation} \label{eq-geom-series} \sum_{k=0}^{\infty} x^{k} = \frac{1}{1 - x},\quad |x| < 1. \end{equation}

#### The Exponential Series

\begin{equation} \label{eq-exp-series} \sum_{k=0}^{\infty}\frac{x^{k}}{k!} = \mathrm{e}^{x},\quad -\infty < x < \infty. \end{equation}

#### Other Series

\begin{equation} \label{eq-negbin-series} \sum_{k=0}^{\infty}{m+k-1 \choose m-1}x^{k}=\frac{1}{(1-x)^{m}},\quad |x|<1. \end{equation}

\begin{equation} \label{eq-log-series} -\sum_{k=1}^{\infty}\frac{x^{k}}{k}=\ln(1-x),\quad |x| < 1. \end{equation}

\begin{equation} \label{eq-binom-series-infinite} \sum_{k=0}^{\infty}{n \choose k}x^{k}=(1+x)^{n},\quad |x| < 1. \end{equation}

### Taylor Series

If the function \(f\) has a *power series* representation at the point \(a\) with radius of convergence \(R>0\), that is, if \begin{equation} f(x)=\sum_{k=0}^{\infty}c_{k}(x-a)^{k},\quad |x - a| < R, \end{equation} for some constants \(\left(c_{k}\right)_{k=0}^{\infty}\), then \(c_{k}\) must be \begin{equation} c_{k}=\frac{f^{(k)}(a)}{k!},\quad k=0,1,2,\ldots \end{equation} Furthermore, the function \(f\) is differentiable on the open interval \((a-R,\, a+R)\) with \begin{equation} f'(x)=\sum_{k=1}^{\infty}kc_{k}(x-a)^{k-1},\quad |x-a| < R, \end{equation} \begin{equation} \int f(x)\,\mathrm{d} x=C+\sum_{k=0}^{\infty}c_{k}\frac{(x-a)^{k+1}}{k+1},\quad |x-a| < R, \end{equation} in which case both of the above series have radius of convergence \(R\).

## The Gamma Function {#sec-the-gamma-function}

The *Gamma function* \(\Gamma\) will be defined in this book according to the formula \begin{equation} \Gamma(\alpha)=\int_{0}^{\infty}x^{\alpha-1}\mathrm{e}^{-x}\:\mathrm{d} x,\quad \mbox{for }\alpha > 0. \end{equation}

\bigskip

```{block, type="fact"}
Properties of the Gamma Function:

* \(\Gamma(\alpha)=(\alpha - 1)\Gamma(\alpha - 1)\) for any \(\alpha > 1\), and so \(\Gamma(n)=(n-1)!\) for any positive integer \(n\).
* \(\Gamma(1/2)=\sqrt{\pi}\).
```
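Base \(\mathsf{R}\) implements the Gamma function as `gamma()`; here is a quick check of the properties above:

```r
gamma(5)                  # equals factorial(4) = 24
gamma(1/2)                # equals sqrt(pi), approximately 1.772454
gamma(4.5) / gamma(3.5)   # equals 3.5, illustrating Gamma(a) = (a - 1)Gamma(a - 1)
```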
A matrix is an ordered array of numbers or expressions; typically we write \(\mathbf{A}=\begin{pmatrix}a_{ij}\end{pmatrix}\) or \(\mathbf{A}=\begin{bmatrix}a_{ij}\end{bmatrix}\). If \(\mathbf{A}\) has \(m\) rows and \(n\) columns then we write \begin{equation} \mathbf{A}_{\mathrm{m}\times\mathrm{n}}=\begin{bmatrix}a_{11} & a_{12} & \cdots & a_{1n}\\ a_{21} & a_{22} & \cdots & a_{2n}\\ \vdots & \vdots & \ddots & \vdots\\ a_{m1} & a_{m2} & \cdots & a_{mn}\end{bmatrix}. \end{equation} The identity matrix \(\mathbf{I}_{\mathrm{n}\times\mathrm{n}}\) is an \(\mathrm{n}\times\mathrm{n}\) matrix with zeros everywhere except for 1's along the main diagonal: \begin{equation} \mathbf{I}_{\mathrm{n}\times\mathrm{n}}=\begin{bmatrix}1 & 0 & \cdots & 0\\ 0 & 1 & \cdots & 0\\ \vdots & \vdots & \ddots & \vdots\\ 0 & 0 & \cdots & 1\end{bmatrix}, \end{equation} and the matrix with ones everywhere is denoted \(\mathbf{J}_{\mathrm{n}\times\mathrm{n}}\): \begin{equation} \mathbf{J}_{\mathrm{n}\times\mathrm{n}}=\begin{bmatrix}1 & 1 & \cdots & 1\\ 1 & 1 & \cdots & 1\\ \vdots & \vdots & \ddots & \vdots\\ 1 & 1 & \cdots & 1\end{bmatrix}. \end{equation}
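In \(\mathsf{R}\), such matrices are built with `matrix()` and `diag()`; a minimal sketch for \(n=3\) (the entries of `A` are hypothetical):

```r
A <- matrix(1:6, nrow = 2, ncol = 3)   # a 2 x 3 matrix, filled column-wise
I <- diag(3)                           # the 3 x 3 identity matrix
J <- matrix(1, nrow = 3, ncol = 3)     # the 3 x 3 matrix of all ones
```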
A vector is a matrix with one of the dimensions equal to one, such as \(\mathbf{A}_{\mathrm{m}\times 1}\) (a column vector) or \(\mathbf{A}_{1\times\mathrm{n}}\) (a row vector). The zero vector \(\mathbf{0}_{\mathrm{n}\times 1}\) is an \(\mathrm{n}\times 1\) matrix of zeros: \begin{equation} \mathbf{0}_{\mathrm{n}\times 1}=\begin{bmatrix}0 & 0 & \cdots & 0\end{bmatrix}^{\mathrm{T}}. \end{equation}
The transpose of a matrix \(\mathbf{A}=\begin{pmatrix}a_{ij}\end{pmatrix}\) is the matrix \(\mathbf{A}^{\mathrm{T}}=\begin{pmatrix}a_{ji}\end{pmatrix}\), which is just like \(\mathbf{A}\) except the rows are columns and the columns are rows. The matrix \(\mathbf{A}\) is said to be symmetric if \(\mathbf{A}^{\mathrm{T}}=\mathbf{A}\). Note that \(\left(\mathbf{A}\mathbf{B}\right)^{\mathrm{T}}=\mathbf{B}^{\mathrm{T}}\mathbf{A}^{\mathrm{T}}\).
The trace of a square matrix \(\mathbf{A}\) is the sum of its diagonal elements: \(\mathrm{tr}(\mathbf{A})=\sum_{i}a_{ii}\).
The inverse of a square matrix \(\mathbf{A}_{\mathrm{n}\times\mathrm{n}}\) (when it exists) is the unique matrix denoted \(\mathbf{A}^{-1}\) which satisfies \(\mathbf{A}\mathbf{A}^{-1}=\mathbf{A}^{-1}\mathbf{A}=\mathbf{I}_{\mathrm{n}\times\mathrm{n}}\). If \(\mathbf{A}^{-1}\) exists then we say \(\mathbf{A}\) is invertible, or nonsingular. Note that \(\left(\mathbf{A}^{\mathrm{T}}\right)^{-1}=\left(\mathbf{A}^{-1}\right)^{\mathrm{T}}\).
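Inverses are computed in \(\mathsf{R}\) with `solve()`; a sketch with a hypothetical \(2\times2\) matrix:

```r
A <- matrix(c(2, 1, 1, 3), nrow = 2)   # columns (2, 1) and (1, 3); det = 5
Ainv <- solve(A)                       # the inverse, here (1/5) * [3 -1; -1 2]
A %*% Ainv                             # recovers the 2 x 2 identity (up to rounding)
```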
\bigskip
```{block, type="fact"}
The inverse of the \(2\times2\) matrix \begin{equation} \mathbf{A}=\begin{bmatrix}a & b\\ c & d\end{bmatrix}\quad \mbox{is}\quad \mathbf{A}^{-1}=\frac{1}{ad-bc}\begin{bmatrix}d & -b\\ -c & a\end{bmatrix}, \end{equation} provided \(ad-bc\neq0\).
```
### Determinants

```{definition}
The *determinant* of a square matrix \(\mathbf{A}_{\mathrm{n}\times\mathrm{n}}\) is denoted \(\mathrm{det}(\mathbf{A})\) or \(|\mathbf{A}|\) and is defined recursively by \begin{equation} \mathrm{det}(\mathbf{A})=\sum_{i=1}^{n}(-1)^{i+j}a_{ij}\,\mathrm{det}(\mathbf{M}_{ij}), \end{equation} where \(\mathbf{M}_{ij}\) is the submatrix formed by deleting the \(i^{\mathrm{th}}\) row and \(j^{\mathrm{th}}\) column of \(\mathbf{A}\). We may choose any fixed \(1\leq j\leq n\) we wish to compute the determinant; the final result is independent of the \(j\) chosen.
```
\bigskip
```{block, type="fact"}
The determinant of the \(2\times2\) matrix \begin{equation} \mathbf{A}=\begin{bmatrix}a & b\\ c & d\end{bmatrix}\quad \mbox{is} \quad |\mathbf{A}|=ad-bc. \end{equation}
```
\bigskip

```{block, type="fact"}
A square matrix \(\mathbf{A}\) is nonsingular if and only if \(\mathrm{det}(\mathbf{A})\neq0\).
```
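Determinants are computed in \(\mathsf{R}\) with `det()`, which gives a quick nonsingularity check (the matrices below are hypothetical):

```r
A <- matrix(c(1, 2, 2, 4), nrow = 2)   # second column is twice the first
det(A)                                 # 0, so A is singular
B <- matrix(c(1, 2, 3, 4), nrow = 2)
det(B)                                 # -2, so B is nonsingular and solve(B) exists
```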
If the matrix \(\mathbf{A}\) satisfies \(\mathbf{x}^{\mathrm{T}}\mathbf{A}\mathbf{x}\geq0\) for all vectors \(\mathbf{x}\neq\mathbf{0}\), then we say that \(\mathbf{A}\) is positive semidefinite. If strict inequality holds for all \(\mathbf{x}\neq\mathbf{0}\), then \(\mathbf{A}\) is positive definite. The connection to statistics is that covariance matrices (see Chapter \@ref(cha-Multivariable-Distributions)) are always positive semidefinite, and many of them are even positive definite.
If \(f\) is a function of two variables, its first-order partial derivatives are defined by \begin{equation} \frac{\partial f}{\partial x}=\frac{\partial}{\partial x}f(x,y)=\lim_{h\to0}\frac{f(x+h,\, y)-f(x,y)}{h} \end{equation} and \begin{equation} \frac{\partial f}{\partial y}=\frac{\partial}{\partial y}f(x,y)=\lim_{h\to0}\frac{f(x,\, y+h)-f(x,y)}{h}, \end{equation} provided these limits exist. The second-order partial derivatives of \(f\) are defined by \begin{equation} \frac{\partial^{2}f}{\partial x^{2}}=\frac{\partial}{\partial x}\left(\frac{\partial f}{\partial x}\right),\quad \frac{\partial^{2}f}{\partial y^{2}}=\frac{\partial}{\partial y}\left(\frac{\partial f}{\partial y}\right),\quad \frac{\partial^{2}f}{\partial x\partial y}=\frac{\partial}{\partial x}\left(\frac{\partial f}{\partial y}\right),\quad \frac{\partial^{2}f}{\partial y\partial x}=\frac{\partial}{\partial y}\left(\frac{\partial f}{\partial x}\right). \end{equation} In many cases (and for all cases in this book) it is true that \begin{equation} \frac{\partial^{2}f}{\partial x\partial y}=\frac{\partial^{2}f}{\partial y\partial x}. \end{equation}
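First-order (and mixed) partials of simple expressions can be found symbolically with `D()`; here is a sketch for the hypothetical \(f(x,y)=x^{2}y+\sin y\):

```r
f <- expression(x^2 * y + sin(y))
D(f, "x")           # 2 * x * y
D(f, "y")           # x^2 + cos(y)
D(D(f, "x"), "y")   # 2 * x, the mixed second-order partial
```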
A function \(f\) of two variables has a local maximum at \((a,b)\) if \(f(x,y)\geq f(a,b)\) for all points \((x,y)\) near \((a,b)\), that is, for all points in an open disk centered at \((a,b)\). The number \(f(a,b)\) is then called a local maximum value of \(f\). The function \(f\) has a local minimum if the same thing happens with the inequality reversed.
Suppose the point \((a,b)\) is a critical point of \(f\), that is, suppose \((a,b)\) satisfies \begin{equation} \frac{\partial f}{\partial x}(a,b)=\frac{\partial f}{\partial y}(a,b)=0. \end{equation} Further suppose the second-order partial derivatives of \(f\) are continuous near \((a,b)\). Let the Hessian matrix \(H\) (not to be confused with the hat matrix \(\mathbf{H}\) of Chapter \@ref(cha-multiple-linear-regression)) be defined by \begin{equation} H = \begin{bmatrix} \frac{\partial^{2}f}{\partial x^{2}} & \frac{\partial^{2}f}{\partial x\partial y}\\ \frac{\partial^{2}f}{\partial y\partial x} & \frac{\partial^{2}f}{\partial y^{2}} \end{bmatrix}. \end{equation} We use the following rules to decide whether \((a,b)\) is an extremum (that is, a local minimum or local maximum) of \(f\).

* If \(\det(H)>0\) and \(\frac{\partial^{2}f}{\partial x^{2}}(a,b)>0\), then \((a,b)\) is a local minimum of \(f\).
* If \(\det(H)>0\) and \(\frac{\partial^{2}f}{\partial x^{2}}(a,b)<0\), then \((a,b)\) is a local maximum of \(f\).
* If \(\det(H)<0\), then \((a,b)\) is a saddle point of \(f\) and so is not an extremum.
* If \(\det(H)=0\), the test is inconclusive.
Let \(f\) be defined on a rectangle \(R=[a,b]\times[c,d]\), and for each \(m\) and \(n\) divide \([a,b]\) (respectively \([c,d]\)) into subintervals \([x_{j},x_{j+1}]\), \(j=0,1,\ldots,m-1\) (respectively \([y_{i},y_{i+1}]\), \(i=0,1,\ldots,n-1\)) of length \(\Delta x_{j}=(b-a)/m\) (respectively \(\Delta y_{i}=(d-c)/n\)) where \(x_{0}=a\) and \(x_{m}=b\) (and \(y_{0}=c\) and \(y_{n}=d\)), and let \(x_{j}^{\ast}\) (\(y_{i}^{\ast}\)) be any points chosen from the respective subintervals. Then the *double integral* of \(f\) over the rectangle \(R\) is \begin{equation} \iint_{R}f(x,y)\,\mathrm{d} A=\int_{c}^{d}\!\!\!\int_{a}^{b}f(x,y)\,\mathrm{d} x\,\mathrm{d} y=\lim_{m,n\to\infty}\sum_{i=0}^{n-1}\sum_{j=0}^{m-1}f(x_{j}^{\ast},y_{i}^{\ast})\,\Delta x_{j}\,\Delta y_{i}, \end{equation} provided this limit exists. Multiple integrals are defined in the same way, just with more letters and sums.
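Double integrals over rectangles can be approximated by nesting `integrate()`; here is a sketch for the hypothetical integrand \(f(x,y)=xy\) over \([0,1]\times[0,2]\), whose exact value is \(\frac{1}{2}\cdot 2=1\):

```r
# Inner integral over x for each fixed y; sapply() keeps it vectorized in y,
# which integrate() requires of its integrand
inner <- function(y) {
  sapply(y, function(yi) integrate(function(x) x * yi, 0, 1)$value)
}
integrate(inner, 0, 2)$value   # approximately 1
```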
Suppose we have a transformation[^math01] \(T\) that maps points \((u,v)\) in a set \(A\) to points \((x,y)\) in a set \(B\). We typically write \(x=x(u,v)\) and \(y=y(u,v)\), and we assume that \(x\) and \(y\) have continuous first-order partial derivatives. We say that \(T\) is one-to-one if no two distinct \((u,v)\) pairs get mapped to the same \((x,y)\) pair; in this book, all of our multivariate transformations \(T\) are one-to-one.
[^math01]: For our purposes \(T\) is in fact the inverse of a one-to-one transformation that we are initially given. We usually start with functions that map \((x,y) \longmapsto (u,v)\), and one of our first tasks is to solve for the inverse transformation that maps \((u,v)\longmapsto(x,y)\). It is this inverse transformation which we are calling \(T\).
The Jacobian (pronounced "yah-KOH-bee-uhn") of \(T\) is denoted by \(\partial(x,y)/\partial(u,v)\) and is defined by the determinant of the following matrix of partial derivatives: \begin{equation} \frac{\partial(x,y)}{\partial(u,v)}=\left| \begin{array}{cc} \frac{\partial x}{\partial u} & \frac{\partial x}{\partial v}\\ \frac{\partial y}{\partial u} & \frac{\partial y}{\partial v} \end{array} \right|=\frac{\partial x}{\partial u}\frac{\partial y}{\partial v}-\frac{\partial x}{\partial v}\frac{\partial y}{\partial u}. \end{equation}
If the function \(f\) is continuous on \(A\) and if the Jacobian of \(T\) is nonzero except perhaps on the boundary of \(A\), then \begin{equation} \iint_{B}f(x,y)\,\mathrm{d} x\,\mathrm{d} y=\iint_{A}f\left[x(u,v),\, y(u,v)\right]\ \left|\frac{\partial(x,y)}{\partial(u,v)}\right|\mathrm{d} u\,\mathrm{d} v. \end{equation}
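As a numerical check of this formula (a sketch; the integrand is chosen only for illustration), take polar coordinates \(x=r\cos\theta\), \(y=r\sin\theta\), whose Jacobian is \(r\), and integrate \(\mathrm{e}^{-(x^{2}+y^{2})}\) over the unit disk both ways; the exact value is \(\pi(1-\mathrm{e}^{-1})\):

```r
# Polar side: integrand times the Jacobian r, then times 2*pi for theta
polar <- integrate(function(r) exp(-r^2) * r, 0, 1)$value * 2 * pi

# Cartesian side: nested integration over the unit disk
inner <- function(x) {
  sapply(x, function(xi) {
    ymax <- sqrt(1 - xi^2)
    integrate(function(y) exp(-(xi^2 + y^2)), -ymax, ymax)$value
  })
}
cartesian <- integrate(inner, -1, 1)$value

c(polar, cartesian, pi * (1 - exp(-1)))   # all approximately 1.98587
```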
A multivariate change of variables is defined in an analogous way: the one-to-one transformation \(T\) maps points \((u_{1},u_{2},\ldots,u_{n})\) to points \((x_{1},x_{2},\ldots,x_{n})\), the Jacobian is the determinant of the \(\mathrm{n}\times\mathrm{n}\) matrix of first-order partial derivatives of \(T\) (lined up in the natural manner), and instead of a double integral we have a multiple integral over multidimensional sets \(A\) and \(B\).