4.3 A Primer on Matrices
"If you had done something twice, you are likely to do it again."
- Brian Kernighan and Rob Pike (The Unix Programming Environment, p. 97)
A matrix is a rectangular array of elements arranged in rows and columns.
An example of a matrix is:
$$
\begin{align*}
\left[\begin{array}{ccc}
8.3 & 70 & 10.3\\
8.6 & 65 & 10.3\\
8.8 & 63 & 10.2\\
10.5 & 72 & 16.4\\
\end{array}\right]
\end{align*}
$$
This matrix represents some of the data from the trees dataset. The values in the first column represent Girth, those in the second column Height, and those in the third column Volume.
Each row corresponds to a tree. The first row represents the values for the first tree. It has 8.3 for Girth, 70 for Height, and 10.3 for Volume.
So this matrix gives the values of three variables for four trees.
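Since trees is one of R's built-in datasets, this matrix can be constructed directly; a minimal R sketch:
```r
# First four rows of R's built-in trees dataset, stored as a 4 x 3 matrix
A <- as.matrix(head(trees, 4))
A  # columns: Girth, Height, Volume; rows: trees 1 through 4
```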
Each value of the matrix is called an element of that matrix. We denote the elements as $a_{ij}$ for the element in the $i$th row and the $j$th column. Note that the first subscript identifies the row number and the second the column number.
So for the matrix above, the elements can be denoted as $$ \begin{align*} \left[\begin{array}{ccc} a_{11} & a_{12} & a_{13}\\ a_{21} & a_{22} & a_{23}\\ a_{31} & a_{32} & a_{33}\\ a_{41} & a_{42} & a_{43} \end{array}\right] \end{align*} $$ A matrix may be denoted by a symbol such as $\bf{A}$, $\bf{X}$, or $\bf{Z}$. The symbol could also be a Greek letter such as $\bf{\Omega}$. The symbol is in boldface to identify that it refers to a matrix.
Thus, for the above matrix, we might define: $$ \begin{align*} \bf{A} =\left[\begin{array}{ccc} a_{11} & a_{12} & a_{13}\\ a_{21} & a_{22} & a_{23}\\ a_{31} & a_{32} & a_{33}\\ a_{41} & a_{42} & a_{43} \end{array}\right] \end{align*} $$ Another notation we could use is: $$ \textbf{A}=\left[a_{ij}\right]\qquad i=1,\ldots,4; j=1,2,3 $$ This notation avoids the need for writing out all elements of the matrix by stating only the general element.
Sometimes we will specify the matrix with the dimension below the matrix symbol. For example, an $r \times c$ matrix can be expressed as \begin{align*} \underset{r\times c}{{\bf A}}=\left[a_{ij}\right]\qquad i=1,\ldots,r; j=1,\ldots,c \end{align*}
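Continuing the R sketch above, elements are indexed with the row subscript first and the column subscript second:
```r
# a_ij: the first subscript picks the row, the second picks the column
A[1, 1]  # 8.3  (Girth of the first tree)
A[4, 3]  # 16.4 (Volume of the fourth tree)
```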
The dimension of the matrix above is 4 x 3, since there are four rows and three columns.
Recall that the trees dataset has 31 observations. So a matrix representing the full dataset would be 31 x 3.
Note that in giving the dimension of a matrix, we always specify the number of rows first and then the number of columns.
So an $r \times c$ matrix can be expressed as \begin{align*} \underset{r\times c}{{\bf A}} & =\left[\begin{array}{cccc} a_{11} & a_{12} & \cdots & a_{1c}\\ a_{21} & a_{22} & \cdots & a_{2c}\\ \vdots & \vdots & \ddots & \vdots\\ a_{r1} & a_{r2} & \cdots & a_{rc} \end{array}\right] \end{align*} or in the compact form \begin{align*} \underset{r\times c}{{\bf A}} & =\left[a_{ij}\right]\qquad i=1,\ldots,r;j=1,\ldots,c \end{align*} Again, the dimensions may or may not be given under the matrix symbol.
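In R, dim() reports the dimension in the same order, rows first; continuing the sketch above:
```r
dim(A)                 # 4 3: four rows, three columns
dim(as.matrix(trees))  # 31 3 for the full trees dataset
```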
A matrix is said to be square if the number of rows equals the number of columns. For example, the matrices
$$
\begin{align*}
\left[\begin{array}{cc}
a_{11} & a_{12}\\
a_{21} & a_{22}
\end{array}\right]
\end{align*}
$$
and
$$
\begin{align*}
\left[\begin{array}{ccc}
a_{11} & a_{12} & a_{13}\\
a_{21} & a_{22} & a_{23}\\
a_{31} & a_{32} & a_{33}\\
\end{array}\right]
\end{align*}
$$
are both square matrices.
A matrix containing only one column is called a column vector or simply a vector.
Two examples are:
$$
\begin{align*}
\textbf{A}=\left[\begin{array}{c}
1\\
20\\
7
\end{array}\right] & \qquad\textbf{B}=\left[\begin{array}{c}
b_{1}\\
b_{2}\\
b_{3}\\
b_{4}\\
b_{5}
\end{array}\right]
\end{align*}
$$
Note that the elements only have one subscript in $\bf{B}$ since there is only one column. The subscript indicates only the row.
A matrix containing only one row is called a row vector.
Two examples are: $$ \begin{align*} \textbf{B}^{\prime}=\left[\begin{array}{ccc} 15 & 25 & 50\end{array}\right] & \qquad\boldsymbol{\delta}^{\prime}=\left[\begin{array}{cc} \delta_{1} & \delta_{2}\end{array}\right] \end{align*} $$ We use the prime (${}^\prime$) symbol for row vectors for reasons to be seen next.
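In R, a column vector and a row vector can each be stored as a one-column or one-row matrix; a minimal sketch with the values above:
```r
a  <- matrix(c(1, 20, 7), ncol = 1)    # 3 x 1 column vector
bp <- matrix(c(15, 25, 50), nrow = 1)  # 1 x 3 row vector
dim(a)   # 3 1
dim(bp)  # 1 3
```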
The transpose of a matrix $\bf{A}$ is another matrix, denoted by $\textbf{A}^{\prime}$, that is obtained by interchanging
corresponding columns and rows of the matrix $\bf{A}$.
For example, if: $$ \begin{align*} \underset{3\times2}{\textbf{A}}=\left[\begin{array}{cc} 1 & 7\\ 12 & 4\\ 5 & 9 \end{array}\right] \end{align*} $$ then the transpose $\bf{A}^\prime$ is: $$ \begin{align*} \underset{2\times3}{\textbf{A}^{\prime}}=\left[\begin{array}{ccc} 1 & 12 & 5\\ 7 & 4 & 9 \end{array}\right] \end{align*} $$ Note that the first column of $\bf{A}$ is the first row of $\bf{A}^\prime$, and similarly the second column of $\bf{A}$ is the second row of $\bf{A}^\prime$.
Note that the dimension of $\bf{A}^\prime$ is the dimension of $\bf{A}$ reversed.
Note that the transpose of a column vector is a row vector, and vice versa.
This is the reason why we used the symbol $\bf{B}^\prime$ earlier to identify a row vector, since it may be thought of as the transpose of a column vector $\bf{B}$.
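In R, t() returns the transpose; a sketch using the matrix above:
```r
A <- matrix(c(1, 12, 5, 7, 4, 9), nrow = 3)  # the 3 x 2 matrix above, filled column by column
t(A)       # its 2 x 3 transpose
dim(t(A))  # 2 3: the dimension is reversed
```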
A matrix is said to be symmetric if $\bf{A}=\bf{A}^\prime$.
A symmetric matrix $\bf{A}$ has elements $a_{ij}=a_{ji}$. Clearly, a symmetric matrix must be a square matrix.
A square matrix is said to be diagonal if all of the off-diagonal elements are zero.
For example \begin{align*} {\bf A} & =\left[\begin{array}{cccc} a_{11} & 0 & 0 & 0\\ 0 & a_{22} & 0 & 0\\ 0 & 0 & a_{33} & 0\\ 0 & 0 & 0 & a_{44} \end{array}\right] \end{align*} is a diagonal matrix.
The identity matrix is a diagonal matrix with ones for all the diagonal elements. The identity matrix is denoted by $\bf{I}$.
For example, \begin{align*} {\bf I} & =\left[\begin{array}{cccc} 1 & 0 & 0 & 0\\ 0 & 1 & 0 & 0\\ 0 & 0 & 1 & 0\\ 0 & 0 & 0 & 1 \end{array}\right] \end{align*} is a 4 x 4 identity matrix.
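In R, diag() constructs both kinds of matrices; the diagonal values below are just illustrative:
```r
diag(c(2, 5, 9))  # 3 x 3 diagonal matrix with 2, 5, 9 on the diagonal
diag(4)           # the 4 x 4 identity matrix
```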
A matrix with ones for all its elements is denoted as
\begin{align*}
{\bf J} & =\left[\begin{array}{cccc}
1 & 1 & \cdots & 1\\
1 & 1 & \cdots & 1\\
\vdots & \vdots & \ddots & \vdots\\
1 & 1 & \cdots & 1
\end{array}\right]
\end{align*}
A vector with ones for all the elements is denoted as
\begin{align*}
{\bf 1} & =\left[\begin{array}{c}
1\\
1\\
\vdots\\
1
\end{array}\right]
\end{align*}
Likewise a vector of zeros is denoted as
\begin{align*}
{\bf 0} & =\left[\begin{array}{c}
0\\
0\\
\vdots\\
0
\end{array}\right]
\end{align*}
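In R, all three can be constructed with matrix(); the dimension 4 below is just illustrative:
```r
J    <- matrix(1, nrow = 4, ncol = 4)  # 4 x 4 matrix of ones
one  <- matrix(1, nrow = 4, ncol = 1)  # 4 x 1 vector of ones
zero <- matrix(0, nrow = 4, ncol = 1)  # 4 x 1 vector of zeros
```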
Adding or subtracting two matrices requires that they have the same dimension.
The sum, or difference, of two matrices is another matrix whose elements each consist of the sum, or difference, of the corresponding elements of the two matrices.
Suppose: $$ \begin{align*} \underset{3\times2}{\textbf{A}}=\left[\begin{array}{cc} 1 & 4\\ 2 & 5\\ 3 & 6 \end{array}\right] & \qquad\underset{3\times2}{\textbf{B}}=\left[\begin{array}{cc} 1 & 2\\ 2 & 3\\ 3 & 4 \end{array}\right] \end{align*} $$ then: $$ \begin{align*} \underset{3\times2}{\textbf{A}+\textbf{B}=} & \left[\begin{array}{cc} 1+1 & 4+2\\ 2+2 & 5+3\\ 3+3 & 6+4 \end{array}\right]=\left[\begin{array}{cc} 2 & 6\\ 4 & 8\\ 6 & 10 \end{array}\right] \end{align*} $$ Similarly: $$ \begin{align*} \underset{3\times2}{\textbf{A}-\textbf{B}=} & \left[\begin{array}{cc} 1-1 & 4-2\\ 2-2 & 5-3\\ 3-3 & 6-4 \end{array}\right]=\left[\begin{array}{cc} 0 & 2\\ 0 & 2\\ 0 & 2 \end{array}\right] \end{align*} $$
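A minimal R sketch of the example above:
```r
A <- matrix(c(1, 2, 3, 4, 5, 6), nrow = 3)  # the 3 x 2 matrix A above
B <- matrix(c(1, 2, 3, 2, 3, 4), nrow = 3)  # the 3 x 2 matrix B above
A + B  # elementwise sums
A - B  # elementwise differences
```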
The addition and subtraction rules discussed above are fairly straightforward and similar to addition and subtraction of (non-matrix) numbers.
Multiplication of matrices is not as straightforward as multiplication of (non-matrix) numbers.
A scalar is an ordinary number or a symbol representing a number.
In multiplication of a matrix by a scalar, every element of the matrix is multiplied by the scalar.
For example, suppose the matrix $\textbf{A}$ is given by: $$ \begin{align*} \textbf{A}=\left[\begin{array}{cc} 1 & 3\\ 5 & 7 \end{array}\right] \end{align*} $$ Then $2\textbf{A}$, where 2 is the scalar, equals: $$ \begin{align*} 2\textbf{A}=2\left[\begin{array}{cc} 1 & 3\\ 5 & 7 \end{array}\right] & =\left[\begin{array}{cc} 2 & 6\\ 10 & 14 \end{array}\right] \end{align*} $$
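A minimal R sketch of the example above:
```r
A <- matrix(c(1, 5, 3, 7), nrow = 2)  # rows: (1, 3) and (5, 7)
2 * A  # every element is multiplied by the scalar 2
```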
Consider the two matrices:
$$
\begin{align*}
\underset{2\times2}{\textbf{A}}=\left[\begin{array}{cc}
1 & 2\\
3 & 4
\end{array}\right] & \qquad\underset{2\times2}{\textbf{B}}=\left[\begin{array}{cc}
5 & 6\\
7 & 8
\end{array}\right]
\end{align*}
$$
The product of $\bf{A}$ and $\bf{B}$ is found by multiplying the elements of each row vector of $\bf{A}$ by the elements of each column vector of $\bf{B}$ and then summing the products.
For example, to find the element in the first row and the first column of the product $\textbf{AB}$, we work with the first row of $\textbf{A}$ and the first column of $\textbf{B}$: $$ \begin{align*} \begin{array}{cc} & \textbf{A}\\ & \left[\begin{array}{cc} {\color{red}1} & {\color{red}2}\\ 3 & 4 \end{array}\right]\\ \\ \end{array}\begin{array}{c} \textbf{B}\\ \left[\begin{array}{cc} {\color{red}5} & 6\\ {\color{red}7} & 8 \end{array}\right]\\ \begin{array}{cc} & \end{array} \end{array} & =\begin{array}{cc} & \textbf{AB}\\ & \left[\begin{array}{cc} \color{red}{\left(1\right)\left(5\right)+\left(2\right)\left(7\right)} &\\ \\ \end{array}\right]\\ \\ \end{array}\\ & = \begin{array}{cc} & \textbf{AB}\\ & \left[\begin{array}{cc} \color{red}{19} &\\ \\ \end{array}\right]\\ \\ \end{array} \end{align*} $$ To find the element in the first row and second column of $\textbf{AB}$: $$ \begin{align*} \begin{array}{cc} & \textbf{A}\\ & \left[\begin{array}{cc} {\color{red}1} & {\color{red}2}\\ 3 & 4 \end{array}\right]\\ \\ \end{array}\begin{array}{c} \textbf{B}\\ \left[\begin{array}{cc} 5 & \color{red}{6}\\ 7 & \color{red}{8} \end{array}\right]\\ \begin{array}{cc} & \end{array} \end{array} & =\begin{array}{cc} & \textbf{AB}\\ & \left[\begin{array}{cc} 19& \color{red}{\left(1\right)\left(6\right)+\left(2\right)\left(8\right)} \\ \\ \end{array}\right]\\ \\ \end{array}\\ & = \begin{array}{cc} & \textbf{AB}\\ & \left[\begin{array}{cc} 19 & \color{red}{22}\\ \\ \end{array}\right]\\ \\ \end{array} \end{align*} $$ Continuing this process we get $$ \begin{align*} \underset{2\times2}{\textbf{AB}} & =\left[\begin{array}{cc} \left(1\right)\left(5\right)+\left(2\right)\left(7\right) & \left(1\right)\left(6\right)+\left(2\right)\left(8\right)\\ \left(3\right)\left(5\right)+\left(4\right)\left(7\right) & \left(3\right)\left(6\right)+\left(4\right)\left(8\right) \end{array}\right]=\left[\begin{array}{cc} 19 & 22\\ 43 & 50 \end{array}\right] \end{align*} $$ Note that the order in matrix multiplication is important. In general, $\textbf{AB} \ne \textbf{BA}$. In fact, even though the product $\textbf{AB}$ may be defined, the product $\textbf{BA}$ may not be defined at all.
In general, the product $\textbf{AB}$ is defined only when the number of columns in $\textbf{A}$ equals the number of rows in $\textbf{B}$.
For example: $$ \begin{align*} \underset{{\color{red}2}\times{\color{blue}3}}{\textbf{A}} & \quad\underset{{\color{blue}3}\times{\color{red}1}}{\textbf{B}}=\underset{{\color{red}2}\times{\color{red}1}}{\textbf{AB}} \end{align*} $$ is defined since the number of columns of $\textbf{A}$ (3) is equal to the number of rows of $\textbf{B}$ (3).
However, note that $$ \begin{align*} \underset{{\color{blue}3}\times{\color{red}1}}{\textbf{B}}\quad\underset{{\color{red}2}\times{\color{blue}3}}{\textbf{A}} \end{align*} $$ is not defined since the number of columns of $\textbf{B}$ (1) is not equal to the number of rows of $\textbf{A}$ (2).
When obtaining the product $\textbf{AB}$, we say that $\textbf{A}$ is postmultiplied by $\textbf{B}$ or that $\textbf{B}$ is premultiplied by $\textbf{A}$.
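In R, matrix multiplication uses the %*% operator (a plain * would multiply elementwise instead); a sketch of the example above:
```r
A <- matrix(c(1, 3, 2, 4), nrow = 2)  # rows: (1, 2) and (3, 4)
B <- matrix(c(5, 7, 6, 8), nrow = 2)  # rows: (5, 6) and (7, 8)
A %*% B  # rows: (19, 22) and (43, 50)
B %*% A  # a different matrix: order matters
```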
For ordinary (non-matrix) numbers, the inverse of a number is its reciprocal. Thus, the inverse of 2 is $\frac{1}{2}$.
A number multiplied by its inverse always equals 1: $$ \begin{align*} &2\cdot\frac{1}{2}=\frac{1}{2}\cdot2=1 \end{align*} $$
In matrix algebra, the inverse of a matrix $\textbf{A}$ is another matrix, denoted by $\textbf{A}^{-1}$, such that: $$ \textbf{A}^{-1}\textbf{A}=\textbf{A}\textbf{A}^{-1}=\textbf{I} $$ where $\textbf{I}$ is the identity matrix.
Thus, the identity matrix $\textbf{I}$ plays the same role as the number 1 in ordinary algebra.
An inverse of a matrix is defined only for square matrices.
Even so, many square matrices do not have inverses.
If a square matrix does have an inverse, the inverse is unique.
If the inverse of a matrix does not exist, then we say the matrix is singular. If the inverse does exist, then we say the matrix is nonsingular.
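In R, solve() computes the inverse of a nonsingular matrix; a minimal sketch (the 2 x 2 matrix below is an arbitrary nonsingular example):
```r
A <- matrix(c(1, 3, 2, 4), nrow = 2)
Ainv <- solve(A)  # the inverse of A
A %*% Ainv        # the 2 x 2 identity, up to floating-point rounding
# solve(matrix(1, 2, 2))  # would fail: a matrix of all ones is singular
```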
Below are some basic results for matrices presented without proof. They will be useful as we use matrices in regression.
$$
\begin{align*}
\textbf{A}+\textbf{B} & =\textbf{B}+\textbf{A} & (4.7)\\
\left(\textbf{A}+\textbf{B}\right)+\textbf{C} & =\textbf{A}+\left(\textbf{B}+\textbf{C}\right) &(4.8)\\
\left(\textbf{A}\textbf{B}\right)\textbf{C} & =\textbf{A}\left(\textbf{B}\textbf{C}\right)&(4.9)\\
\textbf{C}\left(\textbf{A}+\textbf{B}\right) & =\textbf{C}\textbf{A}+\textbf{C}\textbf{B}&(4.10)\\
k\left(\textbf{A}+\textbf{B}\right) & =k\textbf{A}+k\textbf{B}&(4.11)\\
\left(\textbf{A}^{\prime}\right)^{\prime} & =\textbf{A}&(4.12)\\
\left(\textbf{A}+\textbf{B}\right)^{\prime} & =\textbf{A}^{\prime}+\textbf{B}^{\prime}&(4.13)\\
\left(\textbf{A}\textbf{B}\right)^{\prime} & =\textbf{B}^{\prime}\textbf{A}^{\prime}&(4.14)\\
\left(\textbf{A}\textbf{B}\textbf{C}\right)^{\prime} & =\textbf{C}^{\prime}\textbf{B}^{\prime}\textbf{A}^{\prime}&(4.15)\\
\left(\textbf{A}^{-1}\right)^{-1} & =\textbf{A}&(4.16)\\
\left(\textbf{A}^{\prime}\right)^{-1} & =\left(\textbf{A}^{-1}\right)^{\prime}&(4.17)
\end{align*}
$$
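Several of these results are easy to check numerically; a minimal R sketch using arbitrary 2 x 2 matrices:
```r
A <- matrix(c(1, 3, 2, 4), nrow = 2)
B <- matrix(c(5, 7, 6, 8), nrow = 2)
all.equal(t(A %*% B), t(B) %*% t(A))  # (AB)' = B'A', result (4.14)
all.equal(solve(t(A)), t(solve(A)))   # (A')^{-1} = (A^{-1})', result (4.17)
```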
There are many results in matrix calculus that are beyond the scope of this course. We will present a few results for matrix differentiation that will be useful in multiple regression.
It is important to note that matrix calculus can be confusing due to the notational conventions used in various fields. There are two main conventions (although the two are sometimes mixed by some authors), which differ in how a derivative is taken with respect to a vector. One convention is the numerator layout and the other is the denominator layout. Below, we present the results using the numerator layout.
In all the results that follow, let $d$ be a scalar, ${\bf A}$ be an $n\times1$ vector with elements $[a_{i}]$, ${\bf B}$ be an $m\times1$ vector with elements $[b_{i}]$, and ${\bf C}$ be a $p\times q$ matrix with elements $[c_{ij}]$.
\begin{align*}
\frac{\partial{\bf A}}{\partial d} & =\left[\begin{array}{c}
\frac{\partial a_{1}}{\partial d}\\
\frac{\partial a_{2}}{\partial d}\\
\vdots\\
\frac{\partial a_{n}}{\partial d}
\end{array}\right]
\end{align*}
\begin{align*}
\frac{\partial d}{\partial{\bf A}} =\left[\begin{array}{cccc}
\frac{\partial d}{\partial a_{1}} & \frac{\partial d}{\partial a_{2}} & \cdots & \frac{\partial d}{\partial a_{n}}\end{array}\right]
\end{align*}
\begin{align*}
\frac{\partial{\bf A}}{\partial{\bf B}} & =\left[\begin{array}{cccc}
\frac{\partial a_{1}}{\partial b_{1}} & \frac{\partial a_{1}}{\partial b_{2}} & \cdots & \frac{\partial a_{1}}{\partial b_{m}}\\
\frac{\partial a_{2}}{\partial b_{1}} & \frac{\partial a_{2}}{\partial b_{2}} & \cdots & \frac{\partial a_{2}}{\partial b_{m}}\\
\vdots & \vdots & \ddots & \vdots\\
\frac{\partial a_{n}}{\partial b_{1}} & \frac{\partial a_{n}}{\partial b_{2}} & \cdots & \frac{\partial a_{n}}{\partial b_{m}}
\end{array}\right]
\end{align*}
\begin{align*}
\frac{\partial{\bf C}}{\partial d} & =\left[\begin{array}{cccc}
\frac{\partial c_{11}}{\partial d} & \frac{\partial c_{12}}{\partial d} & \cdots & \frac{\partial c_{1q}}{\partial d}\\
\frac{\partial c_{21}}{\partial d} & \frac{\partial c_{22}}{\partial d} & \cdots & \frac{\partial c_{2q}}{\partial d}\\
\vdots & \vdots & \ddots & \vdots\\
\frac{\partial c_{p1}}{\partial d} & \frac{\partial c_{p2}}{\partial d} & \cdots & \frac{\partial c_{pq}}{\partial d}
\end{array}\right]
\end{align*}
\begin{align*}
\frac{\partial d}{\partial{\bf C}} & =\left[\begin{array}{cccc}
\frac{\partial d}{\partial c_{11}} & \frac{\partial d}{\partial c_{21}} & \cdots & \frac{\partial d}{\partial c_{p1}}\\
\frac{\partial d}{\partial c_{12}} & \frac{\partial d}{\partial c_{22}} & \cdots & \frac{\partial d}{\partial c_{p2}}\\
\vdots & \vdots & \ddots & \vdots\\
\frac{\partial d}{\partial c_{1q}} & \frac{\partial d}{\partial c_{2q}} & \cdots & \frac{\partial d}{\partial c_{pq}}
\end{array}\right]
\end{align*}
\begin{align*}
& \frac{\partial{\bf A}^{\prime}{\bf A}}{\partial{\bf A}}=2{\bf A}^{\prime}\\
& \frac{\partial{\bf A}^{\prime}{\bf B}}{\partial{\bf B}}=\frac{\partial{\bf B}^{\prime}{\bf A}}{\partial{\bf B}}={\bf A}^{\prime} & & \text{(provided }m=n)\\
& \frac{\partial{\bf \left({\bf A}^{\prime}{\bf B}\right)^{2}}}{\partial{\bf A}}=2{\bf A}^{\prime}{\bf B}{\bf B}^{\prime} & & \text{(provided }m=n)\\
& \frac{\partial{\bf C}{\bf A}}{\partial{\bf A}}={\bf C} & & \text{(provided }q=n)\\
& \frac{\partial{\bf A}^{\prime}{\bf C}}{\partial{\bf A}}={\bf C}^{\prime} & & \text{(provided }p=n)\\
& \frac{\partial{\bf A}^{\prime}{\bf C}{\bf A}}{\partial{\bf A}}={\bf A}^{\prime}\left({\bf C}+{\bf C}^{\prime}\right) & & \text{(provided }n=p=q)
\end{align*}
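As a sanity check on the first result, a central-difference approximation recovers $\frac{\partial{\bf A}^{\prime}{\bf A}}{\partial{\bf A}}=2{\bf A}^{\prime}$ at a particular point; the vector $(1, 2, 3)$ below is an arbitrary choice:
```r
# Numerical check of d(A'A)/dA = 2A' at a = (1, 2, 3)
a <- c(1, 2, 3)
f <- function(a) sum(a * a)  # the scalar A'A
eps <- 1e-6
grad <- sapply(seq_along(a), function(i) {
  e <- replace(numeric(length(a)), i, eps)
  (f(a + e) - f(a - e)) / (2 * eps)  # central difference in coordinate i
})
grad   # approximately 2 4 6
2 * a  # the elements of 2A'
```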
A random matrix contains elements that are random variables.
Thus, the response vector \begin{align*} {\bf Y} & =\left[\begin{array}{c} Y_{1}\\ Y_{2}\\ \vdots\\ Y_{n} \end{array}\right] \end{align*} is a random vector since the $Y_i$ elements are random variables.
The expected value of ${\bf Y}$ is a matrix (or vector) that has
elements that are the expected values of the elements of ${\bf Y}$. Thus,
\begin{align*}
{\bf E}\left[{\bf Y}\right] & =\left[\begin{array}{c}
E\left[Y_{1}\right]\\
E\left[Y_{2}\right]\\
\vdots\\
E\left[Y_{n}\right]
\end{array}\right]
\end{align*}
When working with random vectors, we will be interested in the variance
of the individual elements
\begin{align*}
Var\left[Y_{i}\right]
\end{align*}
along with the covariance between pairs of elements
\begin{align*}
Cov\left[Y_{i},Y_{j}\right] & \text{ }i\ne j.
\end{align*}
All of these variances and covariances are given in the variance-covariance matrix, or simply the covariance matrix:
\begin{align*}
{\bf Cov}\left[{\bf Y}\right] & =\left[\begin{array}{cccc}
Var\left[Y_{1}\right] & Cov\left[Y_{1},Y_{2}\right] & \cdots & Cov\left[Y_{1},Y_{n}\right]\\
Cov\left[Y_{2},Y_{1}\right] & Var\left[Y_{2}\right] & \cdots & Cov\left[Y_{2},Y_{n}\right]\\
\vdots & \vdots & \ddots & \vdots\\
Cov\left[Y_{n},Y_{1}\right] & Cov\left[Y_{n},Y_{2}\right] & \cdots & Var\left[Y_{n}\right]
\end{array}\right]
\end{align*}
Note that ${\bf Cov}\left[{\bf Y}\right]$ is a symmetric matrix since
$Cov\left[Y_{i},Y_{j}\right]=Cov\left[Y_{j},Y_{i}\right]$.
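A minimal R sketch: simulate many draws of a random vector (independent standard normals, an arbitrary choice) and estimate its expected value and covariance matrix:
```r
set.seed(1)
Y <- matrix(rnorm(1000 * 3), ncol = 3)  # each row is one draw of a 3 x 1 random vector
colMeans(Y)     # estimates E[Y], here approximately (0, 0, 0)
S <- cov(Y)     # estimated variance-covariance matrix
isSymmetric(S)  # TRUE: Cov[Y_i, Y_j] = Cov[Y_j, Y_i]
```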