
4.4 The Regression Model in Matrix Terms

"The essence of mathematics is not to make simple things complicated, but to make complicated things simple." - S. Gudder
Recall the normal error regression model (4.1) is \begin{align*} Y_{i}= & \beta_{0}+\beta_{1}X_{i1}+\beta_{2}X_{i2}+\cdots+\beta_{p-1}X_{i,p-1}+\varepsilon_{i}\\ & \varepsilon_{i}\overset{iid}{\sim}N\left(0,\sigma^{2}\right)\qquad\qquad\qquad\qquad\qquad\qquad(4.1) \end{align*} for $i=1,\ldots,n$.

This implies: $$ \begin{align*} Y_{1} & =\beta_{0}+\beta_{1}X_{11}+\beta_{2}X_{12}+\cdots+\beta_{p-1}X_{1,p-1}+\varepsilon_{1}\\ Y_{2} & =\beta_{0}+\beta_{1}X_{21}+\beta_{2}X_{22}+\cdots+\beta_{p-1}X_{2,p-1}+\varepsilon_{2}\\ & \vdots\\ Y_{n} & =\beta_{0}+\beta_{1}X_{n1}+\beta_{2}X_{n2}+\cdots+\beta_{p-1}X_{n,p-1}+\varepsilon_{n} \end{align*} $$
We define the response vector as $$ \begin{align*} \underset{n\times1}{\textbf{Y}}=\left[\begin{array}{c} Y_{1}\\ Y_{2}\\ \vdots\\ Y_{n} \end{array}\right] \end{align*} $$
We define the vector of random errors as $$ \begin{align*} \underset{n\times1}{\boldsymbol{\varepsilon}}=\left[\begin{array}{c} \varepsilon_{1}\\ \varepsilon_{2}\\ \vdots\\ \varepsilon_{n} \end{array}\right] \end{align*} $$
We define the vector of coefficients as $$ \begin{align*} \underset{p\times1}{\boldsymbol{\beta}}=\left[\begin{array}{c} \beta_{0}\\ \beta_{1}\\ \vdots\\ \beta_{p-1} \end{array}\right] \end{align*} $$
We define the matrix of the predictor variables as $$ \begin{align*} \underset{n\times p}{\textbf{X}}=\left[\begin{array}{ccccc} 1 & X_{11} & X_{12} & \cdots & X_{1,p-1}\\ 1 & X_{21} & X_{22} & \cdots & X_{2,p-1}\\ \vdots & \vdots & \vdots & \ddots & \vdots\\ 1 & X_{n1} & X_{n2} & \cdots & X_{n,p-1} \end{array}\right] \end{align*} $$ Note that the first column of $\bf{X}$ is a column of ones; it corresponds to the intercept term $\beta_{0}$ in the model.
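To make this concrete, here is a minimal NumPy sketch (all predictor values are made up) that builds a design matrix by prepending a column of ones to the predictors:

```python
import numpy as np

# A minimal sketch of building a design matrix: n = 4 observations and
# p - 1 = 2 predictors. All predictor values here are made up.
predictors = np.array([[2.0, 5.0],
                       [3.0, 7.0],
                       [4.0, 6.0],
                       [5.0, 9.0]])            # n x (p-1) predictor values

n = predictors.shape[0]
X = np.column_stack([np.ones(n), predictors])  # prepend the column of ones

print(X)          # first column is all ones
print(X.shape)    # (4, 3), i.e., n x p with p = 3
```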
We can now write the model as $$ \begin{align*} \underset{n\times1}{\textbf{Y}} & =\underset{n\times p}{\textbf{X}}\underset{p\times 1}{\boldsymbol{\beta}}+\underset{n\times1}{\boldsymbol{\varepsilon}} \end{align*} $$ since: $$ \begin{align*} \left[\begin{array}{c} Y_{1}\\ Y_{2}\\ \vdots\\ Y_{n} \end{array}\right] & =\left[\begin{array}{ccccc} 1 & X_{11} & X_{12} & \cdots & X_{1,p-1}\\ 1 & X_{21} & X_{22} & \cdots & X_{2,p-1}\\ \vdots & \vdots & \vdots & \ddots & \vdots\\ 1 & X_{n1} & X_{n2} & \cdots & X_{n,p-1} \end{array}\right]\left[\begin{array}{c} \beta_{0}\\ \beta_{1}\\ \vdots\\ \beta_{p-1} \end{array}\right]+\left[\begin{array}{c} \varepsilon_{1}\\ \varepsilon_{2}\\ \vdots\\ \varepsilon_{n} \end{array}\right]\\ & =\left[\begin{array}{c} \beta_{0}+\beta_{1}X_{11}+\beta_2X_{12}+\cdots+\beta_{p-1}X_{1,p-1}\\ \beta_{0}+\beta_{1}X_{21}+\beta_2X_{22}+\cdots+\beta_{p-1}X_{2,p-1}\\ \vdots\\ \beta_{0}+\beta_{1}X_{n1}+\beta_2X_{n2}+\cdots+\beta_{p-1}X_{n,p-1} \end{array}\right]+\left[\begin{array}{c} \varepsilon_{1}\\ \varepsilon_{2}\\ \vdots\\ \varepsilon_{n} \end{array}\right]\\ & =\left[\begin{array}{c} \beta_{0}+\beta_{1}X_{11}+\beta_2X_{12}+\cdots+\beta_{p-1}X_{1,p-1}+\varepsilon_{1}\\ \beta_{0}+\beta_{1}X_{21}+\beta_2X_{22}+\cdots+\beta_{p-1}X_{2,p-1}+\varepsilon_{2}\\ \vdots\\ \beta_{0}+\beta_{1}X_{n1}+\beta_2X_{n2}+\cdots+\beta_{p-1}X_{n,p-1}+\varepsilon_{n} \end{array}\right] \end{align*} $$
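The matrix product above can be verified numerically. The following NumPy sketch, with arbitrary values for $\textbf{X}$, $\boldsymbol{\beta}$, and $\boldsymbol{\varepsilon}$, confirms that the matrix form reproduces the scalar equations row by row:

```python
import numpy as np

rng = np.random.default_rng(0)

# A sketch with made-up numbers: n = 4 observations, p = 3 coefficients.
X = np.array([[1.0, 2.0, 5.0],
              [1.0, 3.0, 7.0],
              [1.0, 4.0, 6.0],
              [1.0, 5.0, 9.0]])         # n x p design matrix
beta = np.array([1.0, 0.5, -0.2])       # p x 1 coefficient vector
eps = rng.normal(0.0, 1.0, size=4)      # n x 1 error vector

Y = X @ beta + eps                      # the matrix form of the model

# The i-th entry matches the scalar equation
# Y_i = beta_0 + beta_1 * X_i1 + beta_2 * X_i2 + eps_i:
Y_rowwise = np.array([X[i] @ beta + eps[i] for i in range(4)])
print(np.allclose(Y, Y_rowwise))        # True
```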
The normal error model assumes that the random error terms are \begin{align*} \varepsilon_{i}\overset{iid}{\sim} & N\left(0,\sigma^{2}\right). \end{align*} In matrix notation, this assumption can be expressed with the multivariate normal distribution.

Note that the univariate normal distribution has a probability density function expressed as \begin{align*} f\left(x\right) & =\frac{1}{\sigma\sqrt{2\pi}}\exp\left[-\frac{1}{2\sigma^{2}}\left(x-\mu\right)^{2}\right] \end{align*} where $\mu$ is the mean of the distribution and $\sigma$ is the standard deviation.
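As a quick check, the density can be evaluated directly from this formula and compared against `scipy.stats.norm`; the values of $\mu$, $\sigma$, and $x$ below are arbitrary:

```python
import numpy as np
from scipy.stats import norm

# A quick check of the univariate normal density formula against scipy.
# The values of mu, sigma, and x are arbitrary.
mu, sigma, x = 1.0, 2.0, 0.5
f_manual = np.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / (sigma * np.sqrt(2 * np.pi))
print(np.isclose(f_manual, norm.pdf(x, loc=mu, scale=sigma)))  # True
```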

The multivariate normal distribution is expressed as \begin{align*} f\left({\bf Y}\right) & =\frac{1}{\left(2\pi\right)^{n/2}\left|\boldsymbol{\Sigma}\right|^{1/2}}\exp\left[-\frac{1}{2}\left({\bf Y}-\boldsymbol{\mu}\right)^{\prime}\boldsymbol{\Sigma}^{-1}\left({\bf Y}-\boldsymbol{\mu}\right)\right] \end{align*} where ${\bf Y}$ is an $n\times1$ random vector, $\boldsymbol{\mu}$ is an $n\times1$ vector of means, and $\boldsymbol{\Sigma}$ is the $n\times n$ covariance matrix.
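Again as a check, the following sketch evaluates this density directly from the formula and compares it with `scipy.stats.multivariate_normal`; the vector ${\bf y}$, mean $\boldsymbol{\mu}$, and covariance $\boldsymbol{\Sigma}$ are arbitrary choices:

```python
import numpy as np
from scipy.stats import multivariate_normal

# A quick check of the multivariate normal density formula against scipy.
# y, mu, and Sigma below are arbitrary choices with n = 3.
y = np.array([0.5, -1.0, 2.0])
mu = np.zeros(3)
Sigma = np.array([[2.0, 0.5, 0.0],
                  [0.5, 1.0, 0.3],
                  [0.0, 0.3, 1.5]])     # n x n covariance matrix

n = len(y)
quad = (y - mu) @ np.linalg.inv(Sigma) @ (y - mu)  # (Y-mu)' Sigma^{-1} (Y-mu)
f_manual = np.exp(-0.5 * quad) / ((2 * np.pi) ** (n / 2) * np.linalg.det(Sigma) ** 0.5)
print(np.isclose(f_manual, multivariate_normal.pdf(y, mean=mu, cov=Sigma)))  # True
```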

We denote the multivariate normal distribution of a random vector ${\bf Y}$ as \begin{align*} {\bf Y} & \sim N_{n}\left(\boldsymbol{\mu},\boldsymbol{\Sigma}\right). \end{align*} For the normal error model, the mean vector of the random vector $\boldsymbol{\varepsilon}$ is a vector of zeros (${\bf 0}$) and the covariance matrix is \begin{align*} {\bf Cov}\left(\boldsymbol{\varepsilon}\right) & =\left[\begin{array}{cccc} \sigma^{2} & 0 & \cdots & 0\\ 0 & \sigma^{2} & \cdots & 0\\ \vdots & \vdots & \ddots & \vdots\\ 0 & 0 & \cdots & \sigma^{2} \end{array}\right]\\ & =\sigma^{2}{\bf I}. \end{align*}
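A short simulation (with arbitrary $n$, $\sigma^{2}$, and number of replications) illustrates this: the sample covariance of many simulated error vectors is close to $\sigma^{2}{\bf I}$:

```python
import numpy as np

rng = np.random.default_rng(1)

# A simulation sketch: the sample covariance of many simulated iid
# N(0, sigma^2) error vectors is close to sigma^2 * I.
# n, sigma2, and reps are arbitrary choices.
n, sigma2, reps = 3, 4.0, 200_000
eps = rng.normal(0.0, np.sqrt(sigma2), size=(reps, n))  # each row is one error vector

print(np.cov(eps, rowvar=False).round(2))  # roughly 4 on the diagonal, 0 off it
print(sigma2 * np.eye(n))                  # the exact Cov(eps) = sigma^2 * I
```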
We now represent the normal error multiple regression model as $$ \begin{align*} {\textbf{Y}}= & {\textbf{X}}{\boldsymbol{\beta}}+{\boldsymbol{\varepsilon}}\\ & \boldsymbol{\varepsilon} \sim N_{n}\left({\bf 0},\sigma^{2}{\bf I}\right)\qquad (4.18) \end{align*} $$ Note that $$ \begin{align*} \textbf{E}\left(\textbf{Y}\right) & =\textbf{E}\left(\textbf{X}\boldsymbol{\beta}+\boldsymbol{\varepsilon}\right)=\textbf{X}\boldsymbol{\beta}+\textbf{E}\left(\boldsymbol{\varepsilon}\right)=\textbf{X}\boldsymbol{\beta} \end{align*} $$ since $\textbf{X}\boldsymbol{\beta}$ is a constant vector and $\textbf{E}\left(\boldsymbol{\varepsilon}\right)={\bf 0}$.
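A short simulation sketch (with made-up $\textbf{X}$, $\boldsymbol{\beta}$, and $\sigma$) illustrates this last fact: averaging many simulated $\textbf{Y}$ vectors recovers $\textbf{X}\boldsymbol{\beta}$:

```python
import numpy as np

rng = np.random.default_rng(2)

# A simulation sketch of model (4.18) with made-up X, beta, and sigma:
# averaging Y over many simulated error vectors recovers E(Y) = X beta.
X = np.array([[1.0, 2.0],
              [1.0, 3.0],
              [1.0, 4.0]])                    # n x p design matrix (n = 3, p = 2)
beta = np.array([1.0, 0.5])
sigma = 1.5
reps = 100_000

eps = rng.normal(0.0, sigma, size=(reps, 3))  # each row ~ N_3(0, sigma^2 I)
Y = X @ beta + eps                            # reps simulated Y vectors (broadcast)
print(Y.mean(axis=0).round(2))                # approximately X beta
print(X @ beta)                               # exact E(Y) = [2.0, 2.5, 3.0]
```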