1.4 Properties of the Least Squares Estimators
"The most important questions of life are, for the most part, really only problems of probability."
- Pierre Simon, Marquis de Laplace
We first note that the least squares estimators in (1.4)
\begin{align*}
b_{0} & =\bar{Y}-b_{1}\bar{X}\\
b_{1} & =\frac{\sum \left(X_{i}-\bar{X}\right)\left(Y_{i}-\bar{Y}\right)}{\sum \left(X_{i}-\bar{X}\right)^{2}}\qquad\qquad\qquad(1.4)
\end{align*}
are linear functions of the observations $Y_{1},\ldots,Y_{n}$. That is, both $b_{0}$ and $b_{1}$ can be written as a linear combination of the $Y$'s.
Since $Y$ is the variable that we want to model in (1.1)
$$
Y_i=\beta_0+\beta_1X_i+\varepsilon_i\qquad\qquad\qquad(1.1)
$$
, we call an estimator for some parameter that takes the form of a linear combination of $Y$ a linear estimator.
To show this, we first note that $\sum\left(X_{i}-\bar{X}\right)=0$. The proof can be found here:
\begin{align*}
\sum \left(X_{i}-\bar{X}\right) & =\sum X_{i}-\sum \bar{X}\\
& =\sum X_{i}-n\bar{X}\\
& =\sum X_{i}-n\frac{1}{n}\sum X_{i}\\
& =\sum X_{i}-\sum X_{i}\\
& =0.
\end{align*}
We now rewrite $b_{1}$ as \begin{align*} b_{1} & =\frac{\sum\left(X_{i}-\bar{X}\right)\left(Y_{i}-\bar{Y}\right)}{\sum\left(X_{i}-\bar{X}\right)^{2}}\\ & =\frac{\sum\left(X_{i}-\bar{X}\right)Y_{i}-\bar{Y}\sum\left(X_{i}-\bar{X}\right)}{\sum\left(X_{i}-\bar{X}\right)^{2}}\\ & =\frac{\sum\left(X_{i}-\bar{X}\right)Y_{i}}{\sum\left(X_{i}-\bar{X}\right)^{2}}\\ & =\left(\frac{1}{\sum\left(X_{i}-\bar{X}\right)^{2}}\right)\sum\left(X_{i}-\bar{X}\right)Y_{i} \end{align*}
Thus, we can write $b_{1}$ as \begin{align*} b_{1} & =\sum k_{i}Y_{i}\qquad\qquad\qquad(1.5) \end{align*} where \begin{align*} k_{i} & =\frac{X_{i}-\bar{X}}{\sum \left(X_{i}-\bar{X}\right)^{2}} \end{align*} From (1.5), we see that $b_{1}$ is a linear combination of the $Y$'s since the $k_{i}$ are known constants (recall that the $X_{i}$ are treated as known constants).
We can rewrite $b_{0}$ as
\begin{align*}
b_{0} & =\bar{Y}-b_{1}\bar{X}\\
& =\frac{1}{n}\sum Y_{i}-\bar{X}\sum k_{i}Y_{i}\\
& =\sum\left(\frac{1}{n}-\bar{X}k_{i}\right)Y_{i}
\end{align*}
Thus, we can write $b_{0}$ as \begin{align*} b_{0} & =\sum c_{i}Y_{i}\qquad\qquad\qquad(1.6) \end{align*} where \begin{align*} c_{i} & =\frac{1}{n}-\bar{X}k_{i} \end{align*} Therefore, $b_{0}$ is a linear combination of $Y_{i}$.
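To see (1.4), (1.5), and (1.6) side by side numerically, here is a minimal sketch in Python with NumPy (the data are made up for illustration) that computes $b_{0}$ and $b_{1}$ both from the formulas in (1.4) and as the linear combinations $\sum k_{i}Y_{i}$ and $\sum c_{i}Y_{i}$; the two computations agree.
```python
import numpy as np

rng = np.random.default_rng(0)

# Made-up data for illustration only
X = rng.uniform(0, 10, size=20)
Y = 2.0 + 0.5 * X + rng.normal(0, 1, size=20)

Xbar, Ybar = X.mean(), Y.mean()

# Least squares estimates from (1.4)
b1 = np.sum((X - Xbar) * (Y - Ybar)) / np.sum((X - Xbar) ** 2)
b0 = Ybar - b1 * Xbar

# The same estimates written as linear combinations of the Y's, as in (1.5) and (1.6)
k = (X - Xbar) / np.sum((X - Xbar) ** 2)
c = 1 / len(X) - Xbar * k

print(b1, np.sum(k * Y))  # the two values match
print(b0, np.sum(c * Y))  # the two values match
```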
The coefficients $k_{i}$ have the following properties:
\begin{align*}
\sum k_{i} & =0 & \qquad\qquad\qquad(1.7)\\
\sum k_{i}X_{i} & =1 & \qquad\qquad\qquad(1.8)\\
\sum k_{i}^{2} & =\frac{1}{\sum \left(X_{i}-\bar{X}\right)^{2}} & \qquad\qquad\qquad(1.9)
\end{align*}
The proof of (1.7) can be found here:
\begin{align*}
\sum k_{i} & =\sum \frac{\left(X_{i}-\bar{X}\right)}{\sum \left(X_{i}-\bar{X}\right)^{2}}\\
& =\frac{1}{\sum \left(X_{i}-\bar{X}\right)^{2}}\underbrace{\sum \left(X_{i}-\bar{X}\right)}_{=0}\\
& =0
\end{align*}
The proof of (1.8) can be found here:
\begin{align*}
\sum k_{i}X_{i} & =\sum \frac{\left(X_{i}-\bar{X}\right)}{\sum \left(X_{i}-\bar{X}\right)^{2}}X_{i}\\
& =\frac{1}{\sum \left(X_{i}-\bar{X}\right)^{2}}\sum \left(X_{i}-\bar{X}\right)X_{i}\\
& =\frac{1}{\sum \left(X_{i}-\bar{X}\right)^{2}}\left[\sum X_{i}^{2}-\bar{X}\sum X_{i}\right]\\
& =\frac{1}{\sum \left(X_{i}-\bar{X}\right)^{2}}\left[\sum X_{i}^{2}-\bar{X}\sum X_{i}\underbrace{-\bar{X}\sum X_{i}+\bar{X}\sum X_{i}}_{\text{completing the square}}\right]\\
& =\frac{1}{\sum \left(X_{i}-\bar{X}\right)^{2}}\left[\sum X_{i}^{2}-2\bar{X}\sum X_{i}+\bar{X}n\left(\frac{1}{n}\sum X_{i}\right)\right]\\
& =\frac{1}{\sum \left(X_{i}-\bar{X}\right)^{2}}\left[\sum X_{i}^{2}-2\bar{X}\sum X_{i}+n\left(\bar{X}\right)^{2}\right]\\
& =\frac{1}{\sum \left(X_{i}-\bar{X}\right)^{2}}\left[\sum \left(X_{i}-\bar{X}\right)^{2}\right]\\
& =1
\end{align*}
The proof of (1.9) can be found here:
\begin{align*}
\sum k_{i}^{2} & =\sum \left(\frac{\left(X_{i}-\bar{X}\right)}{\sum \left(X_{i}-\bar{X}\right)^{2}}\right)^{2}\\
& =\frac{1}{\left(\sum \left(X_{i}-\bar{X}\right)^{2}\right)^{2}}\sum \left(X_{i}-\bar{X}\right)^{2}\\
& =\frac{1}{\sum \left(X_{i}-\bar{X}\right)^{2}}
\end{align*}
Likewise, the coefficients $c_{i}$ have the following properties:
\begin{align*}
\sum c_{i} & =1 & \qquad\qquad\qquad(1.10)\\
\sum c_{i}X_{i} & =0 & \qquad\qquad\qquad(1.11)\\
\sum c_{i}^{2} & =\frac{1}{n}+\frac{\left(\bar{X}\right)^{2}}{\sum \left(X_{i}-\bar{X}\right)^{2}} & \qquad\qquad\qquad(1.12)
\end{align*}
The proof of (1.10) can be found here:
\begin{align*}
\sum c_{i} & =\sum \left(\frac{1}{n}-\bar{X}k_{i}\right)\\
& =\sum \frac{1}{n}-\bar{X}\underbrace{\sum k_{i}}_{(1.7)}\\
& =\frac{n}{n}\\
& =1
\end{align*}
The proof of (1.11) can be found here:
\begin{align*}
\sum c_{i}X_{i} & =\sum \left(\frac{1}{n}-\bar{X}k_{i}\right)X_{i}\\
& =\frac{1}{n}\sum X_{i}-\bar{X}\underbrace{\sum k_{i}X_{i}}_{(1.8)}\\
& =\bar{X}-\bar{X}\\
& =0
\end{align*}
The proof of (1.12) can be found here:
\begin{align*}
\sum c_{i}^{2} & =\sum \left(\frac{1}{n}-\bar{X}k_{i}\right)^{2}\\
& =\sum \frac{1}{n^{2}}-2\frac{1}{n}\bar{X}\underbrace{\sum k_{i}}_{(1.7)}+\left(\bar{X}\right)^{2}\underbrace{\sum k_{i}^{2}}_{(1.9)}\\
& =\frac{1}{n}+\frac{\left(\bar{X}\right)^{2}}{\sum \left(X_{i}-\bar{X}\right)^{2}}
\end{align*}
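As a quick numerical check of (1.7) through (1.12), the following sketch evaluates each sum directly for an arbitrary (made-up) set of $X$ values.
```python
import numpy as np

X = np.array([1.0, 3.0, 4.0, 7.0, 9.0])  # arbitrary made-up predictor values
n = len(X)
Xbar = X.mean()
Sxx = np.sum((X - Xbar) ** 2)

k = (X - Xbar) / Sxx   # coefficients from (1.5)
c = 1 / n - Xbar * k   # coefficients from (1.6)

print(np.sum(k))                                  # (1.7): 0
print(np.sum(k * X))                              # (1.8): 1
print(np.sum(k ** 2), 1 / Sxx)                    # (1.9): equal
print(np.sum(c))                                  # (1.10): 1
print(np.sum(c * X))                              # (1.11): 0 (up to rounding)
print(np.sum(c ** 2), 1 / n + Xbar ** 2 / Sxx)    # (1.12): equal
```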
These properties will be used to find the expectations and variances of $b_{1}$ and $b_{0}$.
Before finding the expectations, recall $E\left[Y_{i}\right]=\beta_{0}+\beta_{1}X_{i}$
from Section 1.2.3.
The expected value of $b_{1}$ is
\begin{align*}
E\left[b_{1}\right] & =E\left[\underbrace{\sum k_{i}Y_{i}}_{(1.5)}\right]\\
& =\sum k_{i}E\left[Y_{i}\right]\\
& =\sum k_{i}\left(\beta_{0}+\beta_{1}X_{i}\right)\\
& =\beta_{0}\underbrace{\sum k_{i}}_{(1.7)}+\beta_{1}\underbrace{\sum k_{i}X_{i}}_{(1.8)}\\
& =\beta_{1}
\end{align*}
The expected value of $b_{0}$ is
\begin{align*}
E\left[b_{0}\right] & =E\left[\underbrace{\sum c_{i}Y_{i}}_{(1.6)}\right]\\
& =\sum c_{i}E\left[Y_{i}\right]\\
& =\sum c_{i}\left(\beta_{0}+\beta_{1}X_{i}\right)\\
& =\beta_{0}\underbrace{\sum c_{i}}_{(1.10)}+\beta_{1}\underbrace{\sum c_{i}X_{i}}_{(1.11)}\\
& =\beta_{0}
\end{align*}
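The expectation calculations above can be mirrored numerically: plugging $E[Y_{i}]=\beta_{0}+\beta_{1}X_{i}$ into the linear combinations $\sum k_{i}E[Y_{i}]$ and $\sum c_{i}E[Y_{i}]$ returns $\beta_{1}$ and $\beta_{0}$ exactly, with no randomness involved. The parameter and predictor values in this sketch are made up for illustration.
```python
import numpy as np

beta0, beta1 = 2.0, 0.5                  # made-up parameter values
X = np.array([1.0, 3.0, 4.0, 7.0, 9.0])  # made-up predictor values
Xbar = X.mean()
Sxx = np.sum((X - Xbar) ** 2)

k = (X - Xbar) / Sxx
c = 1 / len(X) - Xbar * k

EY = beta0 + beta1 * X   # E[Y_i] under model (1.1)

print(np.sum(k * EY))    # equals beta1 (= 0.5), matching E[b1] = beta1
print(np.sum(c * EY))    # equals beta0 (= 2.0), matching E[b0] = beta0
```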
To find the variances, we will use a result from mathematical statistics:
Let $Y_{1},\ldots,Y_{n}$ be uncorrelated random variables and let $a_{1},\ldots,a_{n}$ be constants. Then
\begin{align*}
Var\left[\sum a_{i}Y_{i}\right] & =\sum a_{i}^{2}Var\left[Y_{i}\right]\qquad\qquad\qquad(1.13)
\end{align*}
Recall that in model (1.1), we assume the response variables $Y_{i}$ are uncorrelated. Also, recall that $Var[Y_i]=\sigma^2$ from Section 1.2.3.
The variance of $b_{1}$ is
\begin{align*}
Var\left[b_{1}\right] & =Var\left[\underbrace{\sum k_{i}Y_{i}}_{(1.5)}\right]\\
& =\underbrace{\sum k_{i}^{2}}_{(1.9)}Var\left[Y_{i}\right]\\
& =\frac{\sigma^{2}}{\sum \left(X_{i}-\bar{X}\right)^{2}}\qquad\qquad\qquad(1.14)
\end{align*}
The variance of $b_{0}$ is
\begin{align*}
Var\left[b_{0}\right] & =Var\left[\underbrace{\sum c_{i}Y_{i}}_{(1.6)}\right]\\
& =\underbrace{\sum c_{i}^{2}}_{(1.12)}Var\left[Y_{i}\right]\\
& =\sigma^{2}\left[\frac{1}{n}+\frac{\left(\bar{X}\right)^{2}}{\sum \left(X_{i}-\bar{X}\right)^{2}}\right]\qquad\qquad\qquad(1.15)
\end{align*}
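The unbiasedness results and the variance formulas (1.14) and (1.15) can also be checked by simulation. The sketch below uses made-up values of $\beta_{0}$, $\beta_{1}$, $\sigma$, and the $X$'s; the errors are drawn as normal only so that there is something to simulate from, since (1.14) and (1.15) do not depend on the shape of the error distribution.
```python
import numpy as np

rng = np.random.default_rng(1)
beta0, beta1, sigma = 2.0, 0.5, 1.0  # made-up parameter values
X = np.linspace(1, 10, 15)           # fixed predictor values
n, Xbar = len(X), X.mean()
Sxx = np.sum((X - Xbar) ** 2)

b0s, b1s = [], []
for _ in range(20000):
    # Generate a data set from model (1.1) and compute the least squares estimates
    Y = beta0 + beta1 * X + rng.normal(0, sigma, size=n)
    b1 = np.sum((X - Xbar) * (Y - Y.mean())) / Sxx
    b0 = Y.mean() - b1 * Xbar
    b1s.append(b1)
    b0s.append(b0)

print(np.mean(b1s), beta1)                                  # unbiasedness of b1
print(np.mean(b0s), beta0)                                  # unbiasedness of b0
print(np.var(b1s), sigma ** 2 / Sxx)                        # (1.14)
print(np.var(b0s), sigma ** 2 * (1 / n + Xbar ** 2 / Sxx))  # (1.15)
```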
We see from (1.5) and (1.6) that $b_{0}$ and $b_{1}$ are linear estimators.
Any estimator for $\beta_{1}$, which we will denote as $\tilde{\beta}_{1}$, that takes the form
\begin{align*}
\tilde{\beta}_{1} & =\sum a_{i}Y_{i}
\end{align*}
where the $a_{i}$ are constants, is called a linear estimator.
For all linear estimators that are unbiased, we must have
\begin{align*}
E\left[\tilde{\beta}_{1}\right] & =E\left[\sum a_{i}Y_{i}\right]\\
& =\sum a_{i}E\left[Y_{i}\right]\\
& =\beta_{1}
\end{align*}
Since $E\left[Y_{i}\right]=\beta_{0}+\beta_{1}X_{i}$ from Section 1.2.3, we must have
\begin{align*}
E\left[\tilde{\beta}_{1}\right] & =\sum a_{i}\left(\beta_{0}+\beta_{1}X_{i}\right)\\
& =\beta_{0}\sum a_{i}+\beta_{1}\sum a_{i}X_{i}\\
& =\beta_{1}
\end{align*}
Therefore,
\begin{align*}
\sum a_{i} & =0\\
\sum a_{i}X_{i} & =1
\end{align*}
We now examine the variance of $\tilde{\beta}_{1}$:
\begin{align*}
Var\left[\tilde{\beta}_{1}\right] & =\sum a_{i}^{2}Var\left[Y_{i}\right]\\
& =\sigma^{2}\sum a_{i}^{2}
\end{align*}
Let's now define $a_{i}=k_{i}+d_{i}$ where $k_{i}$ is defined in (1.5) and $d_{i}$ is some arbitrary constant.
We will show that adding constants $d_{i}$ (whether negative or positive) to the $k_{i}$ cannot make the variance smaller. Thus, the variance of the linear estimator $\tilde{\beta}_{1}$ is smallest when $a_{i}=k_{i}$.
The variance of $\tilde{\beta}_{1}$ can now be written as
\begin{align*}
Var\left[\tilde{\beta}_{1}\right] & =\sigma^{2}\sum a_{i}^{2}\\
& =\sigma^{2}\sum\left(k_{i}+d_{i}\right)^{2}\\
& =\sigma^{2}\sum\left(k_{i}^{2}+2k_{i}d_{i}+d_{i}^{2}\right)\\
& =Var\left[b_{1}\right]+2\sigma^{2}\sum k_{i}d_{i}+\sigma^{2}\sum d_{i}^{2}
\end{align*}
Examining the second term and using the expression for $k_{i}$ in (1.5), we see that
\begin{align*}
\sum k_{i}d_{i} & =\sum k_{i}\left(a_{i}-k_{i}\right)\\
& =\sum a_{i}k_{i}-\underbrace{\sum k_{i}^{2}}_{(1.9)}\\
& =\sum a_{i}\frac{X_{i}-\bar{X}}{\sum\left(X_{i}-\bar{X}\right)^{2}}-\frac{1}{\sum\left(X_{i}-\bar{X}\right)^{2}}\\
& =\frac{\sum a_{i}X_{i}-\bar{X}\sum a_{i}}{\sum\left(X_{i}-\bar{X}\right)^{2}}-\frac{1}{\sum\left(X_{i}-\bar{X}\right)^{2}}\\
& =\frac{1-\bar{X}\left(0\right)}{\sum\left(X_{i}-\bar{X}\right)^{2}}-\frac{1}{\sum\left(X_{i}-\bar{X}\right)^{2}}\\
& =0
\end{align*}
We now have the variance of $\tilde{\beta}_{1}$ as
\begin{align*}
Var\left[\tilde{\beta}_{1}\right] & =Var\left[b_{1}\right]+\sigma^{2}\sum d_{i}^{2}
\end{align*}
This variance is minimized when $\sum d_{i}^{2}=0$, which only happens when $d_{i}=0$ for all $i$.
Thus, the unbiased linear estimator with the smallest variance is the one with $a_{i}=k_{i}$. That is, the least squares estimator $b_{1}$ in (1.5) has the smallest variance of all unbiased linear estimators of $\beta_{1}$.
A similar argument can be used to show that $b_{0}$ in (1.6) has the smallest variance of all unbiased linear estimators of $\beta_{0}$.
These arguments lead us to the following Theorem:
Theorem 1.1 (Gauss-Markov theorem):
For the model in (1.1), the least squares estimators $b_0$ and $b_1$ in (1.4) are unbiased and have minimum variance among all unbiased linear estimators.
An estimator that is linear, unbiased, and has the smallest variance of all unbiased linear estimators is called the best linear unbiased estimator (BLUE).
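As a numerical illustration of the argument above, the sketch below constructs alternative weights $a_{i}=k_{i}+d_{i}$ that still satisfy $\sum a_{i}=0$ and $\sum a_{i}X_{i}=1$ (so the estimator remains linear and unbiased) and verifies that the resulting variance exceeds $Var[b_{1}]$ by exactly $\sigma^{2}\sum d_{i}^{2}$. The $X$ values, $\sigma$, and the particular $d_{i}$ are made up for illustration.
```python
import numpy as np

rng = np.random.default_rng(2)
sigma = 1.0                 # made-up error standard deviation
X = np.linspace(1, 10, 15)  # made-up predictor values
n, Xbar = len(X), X.mean()
Sxx = np.sum((X - Xbar) ** 2)
k = (X - Xbar) / Sxx        # least squares weights from (1.5)

# Build a perturbation d with sum(d) = 0 and sum(d * X) = 0 by projecting an
# arbitrary vector onto the space orthogonal to the columns (1, X); adding d
# to k keeps the estimator linear and unbiased.
z = rng.normal(size=n)
basis = np.column_stack([np.ones(n), X])
d = z - basis @ np.linalg.lstsq(basis, z, rcond=None)[0]

a = k + d
print(np.sum(a), np.sum(a * X))   # approximately 0 and 1: still unbiased

var_b1 = sigma ** 2 / Sxx               # Var[b1] from (1.14)
var_alt = sigma ** 2 * np.sum(a ** 2)   # variance of the alternative estimator
print(var_alt - var_b1, sigma ** 2 * np.sum(d ** 2))  # equal: the extra variance
```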
We now have the expectations and variances of the least squares estimators
$b_{0}$ and $b_{1}$. We next examine the sampling
distribution of these estimators.
In model (1.1), we did not make any assumptions about the distribution of $\varepsilon$ other than its mean and variance and that the errors are uncorrelated. We did not assume anything about the shape of its distribution. In Section 2.1, we will make an assumption about the shape which will allow us to determine the shape of the sampling distributions of $b_{0}$ and $b_{1}$.
For model (1.1), the shape of the distribution of $\varepsilon$ is not specified, so we cannot determine the shape of the sampling distributions of $b_{0}$ and $b_{1}$. We can approximate these sampling distributions either by repeated sampling from the population of interest or by using a technique such as the bootstrap.
If we were to repeatedly take a sample of size $n$ from the population of interest and then find the least squares estimates each time, we could plot these estimates to estimate their sampling distributions. This is known as repeated sampling.
For example, let's consider the handspan and height measurements from Section 1.1.1. These measurements were from 834 college students. Suppose these students are now the population of interest.
We will take a random sample of $n=30$ from this population of 834 college students. We could look at all possible samples of size $n=30$ and the resulting least squares estimates $b_0$ and $b_1$. This would give us the exact sampling distribution for each. In this example with a relatively small population size of 834, the number of possible samples of size 30 is \begin{align*} {834 \choose 30} & =9.596\times10^{54} \end{align*}
It would be infeasible to examine this many samples; however, we could look at enough samples (tens of thousands) to get an estimate of the sampling distributions.
In Figure 1.4.1 below, we see the scatterplot of all 834 population measurements on the left. They are colored gray. The least squares line for the entire population is shown in blue.
A random sample of size $n=30$ is shown as red points in the scatterplot. The red line represents the fitted line for the sample.
The plots on the right show the histograms of the least squares estimates $b_0$ and $b_1$ from the repeated sampling. Clicking on the buttons above the plot will conduct the random sampling and update the histograms. The blue and red vertical lines in the histograms represent the population parameters and the mean of the estimates, respectively.
(Interactive figure: the population line is $Y_i=-3.1322+0.3457X_i$; the display also shows the least squares line $\hat{Y}_i=b_0+b_1X_i$ of the last sample, the number of samples drawn, and the means of the $b_0$'s and $b_1$'s.)
Figure 1.4.1: Sampling Distributions of $b_0$ and $b_1$ by Repeated Sampling
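A sketch of the computation behind Figure 1.4.1: repeatedly draw samples of size $n=30$ from the population and keep the least squares estimates from each sample. Since the actual handspan and height measurements are not reproduced here, the population below is simulated around the population line as a stand-in.
```python
import numpy as np

rng = np.random.default_rng(3)

# Stand-in for the population of 834 students; heights (in inches) and
# handspans are simulated around the population line from Figure 1.4.1.
height = rng.normal(67, 4, size=834)
handspan = -3.1322 + 0.3457 * height + rng.normal(0, 1.5, size=834)

def least_squares(x, y):
    """Return (b0, b1) computed from the formulas in (1.4)."""
    b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
    return y.mean() - b1 * x.mean(), b1

# Repeated sampling: draw many samples of size n = 30 without replacement and
# keep the least squares estimates from each sample.
b0s, b1s = [], []
for _ in range(10000):
    idx = rng.choice(834, size=30, replace=False)
    b0, b1 = least_squares(height[idx], handspan[idx])
    b0s.append(b0)
    b1s.append(b1)

# Histograms of b0s and b1s approximate the sampling distributions shown in
# the right-hand panels of Figure 1.4.1.
print(np.mean(b0s), np.mean(b1s))
```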
We can get a good estimate of the sampling distributions by examining just a few tens of thousands of samples. The downside to using repeated sampling is that we usually only have one sample. Thus, we need a different approach that will allow us to estimate the sampling distributions with just the information in our one sample. We will explore one way to do this next.
This estimation is done by sampling, with replacement, from our single observed sample; this resampling approach is known as the bootstrap.
Suppose we only had a sample of size $n=30$ from the handspan and height data. This sample is plotted in Figure 1.4.2 below with the least squares line (both in red).
A sample of size $n=30$ is taken from the red dots, with replacement. This bootstrapped sample is shown as the black dots. Note that some of the observations can be selected more than once due to sampling with replacement, which means others are not selected at all. This is why some of the points are still red.
The $b_0$ and $b_1$ for the bootstrap sample are then plotted in the histograms to the right. Clicking on the buttons above the figure will generate more bootstrap samples and estimates.
(Interactive figure: the observed sample's least squares line is $\hat{Y}_i=-5.8796+0.3804X_i$; the display also shows the last bootstrap line, the number of bootstrap samples drawn, and the means of the $b_0$'s and $b_1$'s.)
Figure 1.4.2: Sampling Distributions of $b_0$ and $b_1$ by Bootstrap Sampling
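A sketch of the corresponding bootstrap computation behind Figure 1.4.2: resample the $n=30$ observations with replacement and refit each time. The observed sample below is made up (generated around the sample line) as a stand-in for the red points.
```python
import numpy as np

rng = np.random.default_rng(4)

# A made-up observed sample of size n = 30 standing in for the red points of
# Figure 1.4.2 (the actual measurements are not reproduced here).
x = rng.normal(67, 4, size=30)
y = -5.8796 + 0.3804 * x + rng.normal(0, 1.5, size=30)

def least_squares(xv, yv):
    """Return (b0, b1) computed from the formulas in (1.4)."""
    b1 = np.sum((xv - xv.mean()) * (yv - yv.mean())) / np.sum((xv - xv.mean()) ** 2)
    return yv.mean() - b1 * xv.mean(), b1

# Bootstrap: resample the n observations with replacement and refit each time.
boot_b0, boot_b1 = [], []
for _ in range(5000):
    idx = rng.integers(0, 30, size=30)   # indices drawn with replacement
    b0, b1 = least_squares(x[idx], y[idx])
    boot_b0.append(b0)
    boot_b1.append(b1)

# Histograms of boot_b0 and boot_b1 estimate the sampling distributions; they
# center near the observed sample's estimates rather than the population values.
print(np.mean(boot_b0), np.mean(boot_b1))
```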
After generating a few thousand bootstrap estimates, we can get a good estimate of the sampling distributions of $b_0$ and $b_1$. Note, however, how these estimated distributions are centered at the least squares estimates of the observed sample (the red vertical lines) and not at the true population values (the blue vertical lines).
In Section 2.1, we will add an assumption to model (1.1) that specifies the distribution of the errors. This assumption will allow us to know the sampling distributions of $b_0$ and $b_1$ without the need for repeated sampling or bootstrap sampling.
We next discuss how to estimate the last parameter in model (1.1): the variance, $\sigma^2$, of the error term.