Solution Manual For Introductory Econometrics: A Modern Approach, 6th Edition
CHAPTER 2
The Simple Regression Model
Table of Contents
Teaching notes
Solutions to Problems
Solutions to Computer Exercises
© 2016 Cengage Learning®. May not be scanned, copied or duplicated, or posted to a publicly accessible website,
in whole or in part, except for use as permitted in a license distributed with a certain product or service or otherwise
on a password-protected website or school-approved learning management system for classroom use.
TEACHING NOTES
This is the chapter where I expect students to follow most, if not all, of the algebraic derivations.
In class, I like to derive at least the unbiasedness of the OLS slope coefficient, and usually, I
derive the variance. At a minimum, I talk about the factors affecting the variance. To simplify
the notation, after I emphasize the assumptions in the population model, and assume random
sampling, I just condition on the values of the explanatory variables in the sample. Technically,
this is justified by random sampling because, for example, \(E(u_i \mid x_1, x_2, \ldots, x_n) = E(u_i \mid x_i)\) by
independent sampling. I find that students are able to focus on the key assumption SLR.4 and
subsequently take my word about how conditioning on the independent variables in the sample is
harmless. (If you prefer, the appendix to Chapter 3 does the conditioning argument carefully.)
Because statistical inference is no more difficult in multiple regression than in simple regression,
I postpone inference until Chapter 4. (This reduces redundancy and allows you to focus on the
interpretive differences between simple and multiple regression.)
You might notice how, compared with most other texts, I use relatively few assumptions to
derive the unbiasedness of the OLS slope estimator, followed by the formula for its variance.
This is because I do not introduce redundant or unnecessary assumptions. For example, once
SLR.4 is assumed, nothing further about the relationship between u and x is needed to obtain the
unbiasedness of OLS under random sampling.
Incidentally, one of the uncomfortable facts about finite-sample analysis is that there is a
difference between an estimator that is unbiased conditional on the outcome of the covariates and
one that is unconditionally unbiased. If the distribution of the \(x_i\) is such that they can all equal
the same value with positive probability (as is the case with discreteness in the distribution),
then the unconditional expectation does not really exist. Or, if it is made to exist, then the
estimator is not unbiased. I do not try to explain these subtleties in an introductory course, but I
have had instructors ask me about the difference.
SOLUTIONS TO PROBLEMS
2.1 (i) Income, age, and family background (such as number of siblings) are just a few
possibilities. It seems that each of these could be correlated with years of education. (Income
and education are probably positively correlated; age and education may be negatively correlated
because women in more recent cohorts have, on average, more education; and number of siblings
and education are probably negatively correlated.)
(ii) Not if the factors we listed in part (i) are correlated with educ. Because we would like to
hold these factors fixed, they are part of the error term. But if u is correlated with educ, then
\(E(u \mid educ) \neq 0\), and so SLR.4 fails.
2.2 In the equation \(y = \beta_0 + \beta_1 x + u\), where \(E(u) = \alpha_0\), add and subtract \(\alpha_0\) from the right-hand side to get
\(y = (\alpha_0 + \beta_0) + \beta_1 x + (u - \alpha_0)\). Call the new error \(e = u - \alpha_0\), so that \(E(e) = 0\). The new intercept is
\(\alpha_0 + \beta_0\), but the slope is still \(\beta_1\).
2.3 (i) Let \(y_i = GPA_i\), \(x_i = ACT_i\), and \(n = 8\). Then \(\bar{x} = 25.875\), \(\bar{y} = 3.2125\),
\(\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y}) = 5.8125\), and \(\sum_{i=1}^{n} (x_i - \bar{x})^2 = 56.875\). From equation (2.19), we obtain the slope as
\(\hat{\beta}_1 = 5.8125/56.875 \approx .1022\), rounded to four places after the decimal. From (2.17),
\(\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x} \approx 3.2125 - (.1022)25.875 \approx .5681\). So we can write
\[ \widehat{GPA} = .5681 + .1022\, ACT \]
\[ n = 8. \]
The intercept does not have a useful interpretation because ACT is not close to zero for the
population of interest. If ACT is 5 points higher, \(\widehat{GPA}\) increases by .1022(5) = .511.
(ii) The fitted values and residuals (rounded to four decimal places) are given along with
the observation number i and GPA in the following table:
i    GPA    fitted GPA    residual
1    2.8    2.7143         .0857
2    3.4    3.0209         .3791
3    3.0    3.2253        -.2253
4    3.5    3.3275         .1725
5    3.6    3.5319         .0681
6    3.0    3.1231        -.1231
7    2.7    3.1231        -.4231
8    3.7    3.6341         .0659
You can verify that the residuals, as reported in the table, sum to -.0002, which is pretty close to
zero given the inherent rounding error.
(iii) When ACT = 20, \(\widehat{GPA} = .5681 + .1022(20) \approx 2.61\).
(iv) The sum of squared residuals, \(\sum_{i=1}^{n} \hat{u}_i^2\), is about .4347 (rounded to four decimal places),
and the total sum of squares, \(\sum_{i=1}^{n} (y_i - \bar{y})^2\), is about 1.0288. So the R-squared from the regression
is
\[ R^2 = 1 - SSR/SST \approx 1 - (.4347/1.0288) \approx .577. \]
Therefore, about 57.7% of the variation in GPA is explained by ACT in this small sample of
students.
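The hand calculations in Problem 2.3 can be checked with a few lines of plain Python. This is only a sketch: the GPA values are taken from the table in part (ii), while the ACT scores are an assumption, inferred here from the reported fitted values because they are not listed in this section.

```python
# Check the Problem 2.3 arithmetic with the formulas from (2.17) and (2.19).
# GPA values come from the table in part (ii). The ACT scores are an
# ASSUMPTION: they are backed out from the reported fitted values.
gpa = [2.8, 3.4, 3.0, 3.5, 3.6, 3.0, 2.7, 3.7]
act = [21, 24, 26, 27, 29, 25, 25, 30]

n = len(gpa)
x_bar = sum(act) / n
y_bar = sum(gpa) / n

# Sample covariance and variance sums that enter the slope formula.
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(act, gpa))
s_xx = sum((x - x_bar) ** 2 for x in act)

slope = s_xy / s_xx
intercept = y_bar - slope * x_bar

# Fitted values, residuals, and the pieces of R-squared.
fitted = [intercept + slope * x for x in act]
resid = [y - f for y, f in zip(gpa, fitted)]
ssr = sum(r ** 2 for r in resid)
sst = sum((y - y_bar) ** 2 for y in gpa)
r_squared = 1 - ssr / sst

print(f"slope = {slope:.4f}, intercept = {intercept:.4f}")
print(f"SSR = {ssr:.4f}, SST = {sst:.4f}, R-squared = {r_squared:.3f}")
```

With unrounded coefficients the residuals sum to zero up to floating-point error; the -.0002 reported in part (ii) comes from rounding each table entry to four decimals.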
2.4 (i) When cigs = 0, predicted birth weight is 119.77 ounces. When cigs = 20, \(\widehat{bwght} = 109.49\).
This is about an 8.6% drop.
(ii) Not necessarily. There are many other factors that can affect birth weight, particularly
overall health of the mother and quality of prenatal care. These could be correlated with
cigarette smoking during pregnancy. Also, something such as caffeine consumption can affect birth
weight, and might also be correlated with cigarette smoking.
(iii) If we want a predicted bwght of 125, then cigs = (125 - 119.77)/(-.514) \(\approx\) -10.18, or
about -10 cigarettes. This is nonsense, of course, and it shows what happens when we are trying
to predict something as complicated as birth weight with only a single explanatory variable. The
largest predicted birth weight is necessarily 119.77. Yet, almost 700 of the births in the sample
had a birth weight higher than 119.77.
(iv) 1,176 out of 1,388 women did not smoke while pregnant, or about 84.7%. Because we
are using only cigs to explain birth weight, we have only one predicted birth weight at cigs = 0.
The predicted birth weight is necessarily roughly in the middle of the observed birth weights at
cigs = 0, and so we will underpredict high birth weights.
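The arithmetic in Problem 2.4 can be reproduced as follows. As an assumption, the slope is backed out from the two predicted values quoted in part (i), because the estimated equation itself is not reproduced in this section; it comes out to -.514 ounces per cigarette.

```python
# Predicted birth weights quoted in part (i), in ounces.
bwght_at_0 = 119.77    # cigs = 0, which is also the intercept
bwght_at_20 = 109.49   # cigs = 20

# ASSUMPTION: recover the slope from those two predictions.
slope = (bwght_at_20 - bwght_at_0) / 20

# Part (i): percentage drop in predicted birth weight.
pct_drop = 100 * (bwght_at_0 - bwght_at_20) / bwght_at_0

# Part (iii): cigs needed for a predicted birth weight of 125 ounces.
cigs_for_125 = (125 - bwght_at_0) / slope

# Part (iv): share of women in the sample who did not smoke.
share_nonsmokers = 100 * 1176 / 1388

print(f"slope = {slope:.3f}")
print(f"drop = {pct_drop:.1f}%, cigs for 125 oz = {cigs_for_125:.2f}")
print(f"nonsmokers = {share_nonsmokers:.1f}%")
```

Because the slope is negative, any target above the intercept of 119.77 requires a negative number of cigarettes, which is the point of part (iii).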
2.5 (i) The intercept implies that when inc = 0, cons is predicted to be negative $124.84. This, of
course, cannot be true, and reflects the fact that this consumption function might be a poor
predictor of consumption at very low income levels. On the other hand, on an annual basis,
$124.84 is not so far from zero.
(ii) Just plug 30,000 into the equation: \(\widehat{cons} = -124.84 + .853(30{,}000) = 25{,}465.16\) dollars.
(iii) The MPC and the APC are shown in the following graph. Even though the intercept is
negative, the smallest APC in the sample is positive. The graph starts at an annual income level
of $1,000 (in 1970 dollars).
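[Figure: MPC and APC plotted against inc, for inc from 1,000 to 30,000. The MPC is constant at .853; the APC equals .728 at inc = 1,000 and lies below the MPC throughout.]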
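The points on the MPC and APC curves in Problem 2.5 can be computed directly from the estimated consumption function. A minimal sketch, using only the intercept and slope quoted in parts (i) and (ii); the function names are illustrative.

```python
INTERCEPT = -124.84   # estimated intercept, in dollars
MPC = 0.853           # estimated slope: the marginal propensity to consume

def predicted_cons(inc):
    """Predicted annual consumption at income level inc."""
    return INTERCEPT + MPC * inc

def apc(inc):
    """Average propensity to consume: predicted consumption over income."""
    return predicted_cons(inc) / inc

for inc in (1_000, 10_000, 20_000, 30_000):
    print(f"inc = {inc:>6}: cons = {predicted_cons(inc):10.2f}, "
          f"APC = {apc(inc):.3f}, MPC = {MPC:.3f}")
```

Since APC = MPC + INTERCEPT/inc and the intercept is negative, the APC stays below the MPC and approaches it as income grows.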
2.6 (i) Yes. If living closer to an incinerator depresses housing prices, then being farther away
increases housing prices.
(ii) If the city chooses to locate the incinerator in an area away from more expensive
neighborhoods, then log(dist) is positively correlated with housing quality. This would violate
SLR.4, and OLS estimation is biased.
(iii) Size of the house, number of bathrooms, size of the lot, age of the home, and quality of
the neighborhood (including school quality), are just a handful of factors. As mentioned in part
(ii), these could certainly be correlated with dist [and log(dist)].
2.7 (i) When we condition on inc in computing an expectation, \(\sqrt{inc}\) becomes a constant. So
\(E(u \mid inc) = E(\sqrt{inc} \cdot e \mid inc) = \sqrt{inc} \cdot E(e \mid inc) = \sqrt{inc} \cdot 0\) because \(E(e \mid inc) = E(e) = 0\).
(ii) Again, when we condition on inc in computing a variance, \(\sqrt{inc}\) becomes a constant. So
\(Var(u \mid inc) = Var(\sqrt{inc} \cdot e \mid inc) = (\sqrt{inc})^2\, Var(e \mid inc) = \sigma_e^2\, inc\) because \(Var(e \mid inc) = \sigma_e^2\).
(iii) Families with low incomes do not have much discretion about spending; typically, a
low-income family must spend on food, clothing, housing, and other necessities. Higher-income
people have more discretion, and some might choose more consumption while others more
saving. This discretion suggests wider variability in saving among higher income families.
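A small simulation can illustrate the two results in Problem 2.7: with the error written as the square root of inc times e, the conditional mean stays at zero while the conditional variance grows in proportion to inc. This is a sketch under assumed values; the standard deviation of e, the income levels, and the number of draws are all hypothetical.

```python
import random
import statistics

rng = random.Random(0)   # fixed seed so the run is reproducible
SIGMA_E = 2.0            # hypothetical standard deviation of e
N_DRAWS = 50_000         # hypothetical number of draws per income level

def simulate_u(inc):
    """Draw u = sqrt(inc) * e with e ~ Normal(0, SIGMA_E^2), holding inc fixed."""
    root = inc ** 0.5
    return [root * rng.gauss(0.0, SIGMA_E) for _ in range(N_DRAWS)]

results = {}
for inc in (1.0, 4.0, 9.0):   # hypothetical income levels
    u = simulate_u(inc)
    results[inc] = (statistics.fmean(u), statistics.pvariance(u))
    mean_u, var_u = results[inc]
    print(f"inc = {inc}: mean(u) = {mean_u:.3f}, var(u) = {var_u:.2f}, "
          f"theory = {SIGMA_E ** 2 * inc:.2f}")
```

The simulated variances should be close to \(\sigma_e^2\, inc\), which is the heteroskedasticity pattern described in part (iii).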
2.8 (i) From equation (2.66),
\[ \tilde{\beta}_1 = \left( \sum_{i=1}^{n} x_i y_i \right) \Big/ \left( \sum_{i=1}^{n} x_i^2 \right). \]
Plugging in \(y_i = \beta_0 + \beta_1 x_i + u_i\) gives
\[ \tilde{\beta}_1 = \left( \sum_{i=1}^{n} x_i (\beta_0 + \beta_1 x_i + u_i) \right) \Big/ \left( \sum_{i=1}^{n} x_i^2 \right). \]
After standard algebra, the numerator can be written as
\[ \beta_0 \sum_{i=1}^{n} x_i + \beta_1 \sum_{i=1}^{n} x_i^2 + \sum_{i=1}^{n} x_i u_i. \]
Putting this over the denominator we can write \(\tilde{\beta}_1\) as
\[ \tilde{\beta}_1 = \beta_0 \left( \sum_{i=1}^{n} x_i \right) \Big/ \left( \sum_{i=1}^{n} x_i^2 \right) + \beta_1 + \left( \sum_{i=1}^{n} x_i u_i \right) \Big/ \left( \sum_{i=1}^{n} x_i^2 \right). \]
Conditional on the \(x_i\), we have
\[ E(\tilde{\beta}_1) = \beta_0 \left( \sum_{i=1}^{n} x_i \right) \Big/ \left( \sum_{i=1}^{n} x_i^2 \right) + \beta_1 \]
because \(E(u_i) = 0\) for all i. Therefore, the bias in \(\tilde{\beta}_1\) is given by the first term in this equation.
This bias is obviously zero when \(\beta_0 = 0\). It is also zero when \(\sum_{i=1}^{n} x_i = 0\), which is the same as \(\bar{x} = 0\).
In the latter case, regression through the origin is identical to regression with an intercept.
(ii) From the last expression for \(\tilde{\beta}_1\) in part (i) we have, conditional on the \(x_i\),
\[ Var(\tilde{\beta}_1) = \left( \sum_{i=1}^{n} x_i^2 \right)^{-2} Var\left( \sum_{i=1}^{n} x_i u_i \right) = \left( \sum_{i=1}^{n} x_i^2 \right)^{-2} \left( \sum_{i=1}^{n} x_i^2\, Var(u_i) \right) \]
\[ = \left( \sum_{i=1}^{n} x_i^2 \right)^{-2} \left( \sigma^2 \sum_{i=1}^{n} x_i^2 \right) = \sigma^2 \Big/ \left( \sum_{i=1}^{n} x_i^2 \right). \]
(iii) From (2.57), \(Var(\hat{\beta}_1) = \sigma^2 \big/ \left( \sum_{i=1}^{n} (x_i - \bar{x})^2 \right)\). From the hint, \(\sum_{i=1}^{n} x_i^2 \geq \sum_{i=1}^{n} (x_i - \bar{x})^2\), and so
\(Var(\tilde{\beta}_1) \leq Var(\hat{\beta}_1)\). A more direct way to see this is to write \(\sum_{i=1}^{n} (x_i - \bar{x})^2 = \sum_{i=1}^{n} x_i^2 - n(\bar{x})^2\), which
is less than \(\sum_{i=1}^{n} x_i^2\) unless \(\bar{x} = 0\).
(iv) For a given sample size, the bias in \(\tilde{\beta}_1\) increases as \(\bar{x}\) increases (holding the sum of the
\(x_i^2\) fixed). But as \(\bar{x}\) increases, the variance of \(\hat{\beta}_1\) increases relative to \(Var(\tilde{\beta}_1)\). The bias in \(\tilde{\beta}_1\)
is also small when \(\beta_0\) is small. Therefore, whether we prefer \(\tilde{\beta}_1\) or \(\hat{\beta}_1\) on a mean squared error
basis depends on the sizes of \(\beta_0\), \(\bar{x}\), and n (in addition to the size of \(\sum_{i=1}^{n} x_i^2\)).
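The bias and variance formulas from Problem 2.8 can be evaluated side by side for a concrete design. The sketch below uses hypothetical values for \(\beta_0\), \(\sigma^2\), and the \(x_i\); it only illustrates the formulas derived above and is not taken from the text.

```python
# All three inputs below are HYPOTHETICAL, chosen only for illustration.
beta0 = 1.0                      # population intercept
sigma2 = 4.0                     # error variance
x = [1.0, 2.0, 3.0, 4.0, 5.0]    # regressor values (conditioned on)

n = len(x)
x_bar = sum(x) / n
sum_x = sum(x)
sum_x2 = sum(v ** 2 for v in x)
sst_x = sum((v - x_bar) ** 2 for v in x)

# Regression through the origin (beta-tilde), parts (i) and (ii).
bias_tilde = beta0 * sum_x / sum_x2
var_tilde = sigma2 / sum_x2

# OLS with an intercept (beta-hat) is unbiased, with variance from (2.57).
var_hat = sigma2 / sst_x

# Mean squared error = variance + squared bias.
mse_tilde = var_tilde + bias_tilde ** 2
mse_hat = var_hat

print(f"bias(tilde) = {bias_tilde:.4f}")
print(f"Var(tilde) = {var_tilde:.4f}, Var(hat) = {var_hat:.4f}")
print(f"MSE(tilde) = {mse_tilde:.4f}, MSE(hat) = {mse_hat:.4f}")
```

For these particular values the biased estimator has the smaller mean squared error; raising beta0 enough reverses the ranking, which is the tradeoff described in part (iv).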
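2.9 (i) We follow the hint, noting that \(\overline{c_1 y} = c_1 \bar{y}\) (the sample average of \(c_1 y_i\) is \(c_1\) times the
sample average of \(y_i\)) and \(\overline{c_2 x} = c_2 \bar{x}\). When we regress \(c_1 y_i\) on \(c_2 x_i\) (including an intercept), we
use equation (2.19) to obtain the slope:
\[ \tilde{\beta}_1 = \frac{\sum_{i=1}^{n} (c_2 x_i - c_2 \bar{x})(c_1 y_i - c_1 \bar{y})}{\sum_{i=1}^{n} (c_2 x_i - c_2 \bar{x})^2} = \frac{c_1 c_2 \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{c_2^2 \sum_{i=1}^{n} (x_i - \bar{x})^2} = \frac{c_1}{c_2} \hat{\beta}_1. \]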
From (2.17), we obtain the intercept as
\[ \tilde{\beta}_0 = (c_1 \bar{y}) - \tilde{\beta}_1 (c_2 \bar{x}) = (c_1 \bar{y}) - [(c_1/c_2) \hat{\beta}_1](c_2 \bar{x}) = c_1 (\bar{y} - \hat{\beta}_1 \bar{x}) = c_1 \hat{\beta}_0 \]
because the intercept from regressing \(y_i\) on \(x_i\) is \((\bar{y} - \hat{\beta}_1 \bar{x})\).
(ii) We use the same approach from part (i) along with the fact that \(\overline{(c_1 + y)} = c_1 + \bar{y}\) and
\(\overline{(c_2 + x)} = c_2 + \bar{x}\). Therefore, \((c_1 + y_i) - \overline{(c_1 + y)} = (c_1 + y_i) - (c_1 + \bar{y}) = y_i - \bar{y}\) and
\((c_2 + x_i) - \overline{(c_2 + x)} = x_i - \bar{x}\). So \(c_1\) and \(c_2\) entirely drop out of the slope formula for the regression of
\((c_1 + y_i)\) on \((c_2 + x_i)\), and \(\tilde{\beta}_1 = \hat{\beta}_1\). The intercept is
\[ \tilde{\beta}_0 = \overline{(c_1 + y)} - \tilde{\beta}_1 \overline{(c_2 + x)} = (c_1 + \bar{y}) - \hat{\beta}_1 (c_2 + \bar{x}) = (\bar{y} - \hat{\beta}_1 \bar{x}) + c_1 - c_2 \hat{\beta}_1 = \hat{\beta}_0 + c_1 - c_2 \hat{\beta}_1, \]
which is what we wanted to show.
(iii) We can simply apply part (ii) because \(\log(c_1 y_i) = \log(c_1) + \log(y_i)\). In other words,
replace \(c_1\) with \(\log(c_1)\), replace \(y_i\) with \(\log(y_i)\), and set \(c_2 = 0\).
(iv) Again, we can apply part (ii) with \(c_1 = 0\) and replacing \(c_2\) with \(\log(c_2)\) and \(x_i\) with
\(\log(x_i)\). If \(\hat{\beta}_0\) and \(\hat{\beta}_1\) are the original intercept and slope, then \(\tilde{\beta}_1 = \hat{\beta}_1\) and
\(\tilde{\beta}_0 = \hat{\beta}_0 - \log(c_2) \hat{\beta}_1\).
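The scaling and shifting results in Problem 2.9 are easy to confirm numerically. The sketch below uses a small hypothetical data set and hypothetical constants c1 and c2; the helper `ols` is written here for illustration and simply applies (2.17) and (2.19).

```python
import math

def ols(x, y):
    """Return (intercept, slope) from a simple regression of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    s_xx = sum((a - x_bar) ** 2 for a in x)
    slope = s_xy / s_xx
    return y_bar - slope * x_bar, slope

# HYPOTHETICAL data and constants, chosen only for illustration.
x = [1.0, 2.0, 4.0, 7.0, 9.0]
y = [2.0, 3.5, 3.0, 6.0, 8.5]
c1, c2 = 3.0, 0.5

b0, b1 = ols(x, y)

# Part (i): regress c1*y on c2*x.
s0, s1 = ols([c2 * v for v in x], [c1 * v for v in y])

# Part (ii): regress (c1 + y) on (c2 + x).
t0, t1 = ols([c2 + v for v in x], [c1 + v for v in y])

# Part (iii): regress log(c1*y) on x and compare with log(y) on x.
g0, g1 = ols(x, [math.log(v) for v in y])
h0, h1 = ols(x, [math.log(c1 * v) for v in y])

print(f"(i)   slope {s1:.4f} vs (c1/c2)*b1 {c1 / c2 * b1:.4f}")
print(f"(ii)  intercept {t0:.4f} vs b0 + c1 - c2*b1 {b0 + c1 - c2 * b1:.4f}")
print(f"(iii) intercept shift {h0 - g0:.4f} vs log(c1) {math.log(c1):.4f}")
```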
2.10 (i) This derivation is essentially done in equation (2.52), once \((1/SST_x)\) is brought inside
the summation (which is valid because \(SST_x\) does not depend on i). Then, just define
\(w_i = d_i / SST_x\).
(ii) Because \(Cov(\hat{\beta}_1, \bar{u}) = E[(\hat{\beta}_1 - \beta_1)\bar{u}]\), we show that the latter is zero. But, from part (i),
\[ E[(\hat{\beta}_1 - \beta_1)\bar{u}] = E\left[ \left( \sum_{i=1}^{n} w_i u_i \right) \bar{u} \right] = \sum_{i=1}^{n} w_i E(u_i \bar{u}). \]
Because the \(u_i\) are pairwise uncorrelated (they are independent), \(E(u_i \bar{u}) = E(u_i^2 / n) = \sigma^2 / n\)
(because \(E(u_i u_h) = 0\), \(i \neq h\)). Therefore,
\[ \sum_{i=1}^{n} w_i E(u_i \bar{u}) = \sum_{i=1}^{n} w_i (\sigma^2 / n) = (\sigma^2 / n) \sum_{i=1}^{n} w_i = 0. \]
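(iii) The formula for the OLS intercept is \(\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}\), and plugging in \(\bar{y} = \beta_0 + \beta_1 \bar{x} + \bar{u}\)
gives \(\hat{\beta}_0 = (\beta_0 + \beta_1 \bar{x} + \bar{u}) - \hat{\beta}_1 \bar{x} = \beta_0 + \bar{u} - (\hat{\beta}_1 - \beta_1)\bar{x}\).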
(iv) Because \(\hat{\beta}_1\) and \(\bar{u}\) are uncorrelated,
\[ Var(\hat{\beta}_0) = Var(\bar{u}) + Var(\hat{\beta}_1)\bar{x}^2 = \sigma^2/n + (\sigma^2/SST_x)\bar{x}^2 = \sigma^2/n + \sigma^2 \bar{x}^2 / SST_x, \]
which is what we wanted to show.
(v) Using the hint and substitution gives
\[ Var(\hat{\beta}_0) = \sigma^2 [(SST_x / n) + \bar{x}^2] / SST_x \]
\[ = \sigma^2 \left[ n^{-1} \sum_{i=1}^{n} x_i^2 - \bar{x}^2 + \bar{x}^2 \right] \Big/ SST_x = \sigma^2 \left( n^{-1} \sum_{i=1}^{n} x_i^2 \right) \Big/ SST_x. \]
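Two facts used in Problem 2.10 can be verified numerically: the weights \(w_i = d_i / SST_x\) sum to zero, and the two expressions for \(Var(\hat{\beta}_0)\) in parts (iv) and (v) agree. The regressor values and error variance below are hypothetical.

```python
# HYPOTHETICAL inputs, chosen only for illustration.
sigma2 = 2.5
x = [3.0, 5.0, 6.0, 8.0, 13.0]

n = len(x)
x_bar = sum(x) / n
d = [v - x_bar for v in x]           # deviations d_i = x_i - x_bar
sst_x = sum(v ** 2 for v in d)
w = [v / sst_x for v in d]           # weights from part (i)
sum_w = sum(w)

var_b1 = sigma2 / sst_x

# Part (iv): Var(u_bar) + Var(beta1_hat) * x_bar^2.
var_b0_iv = sigma2 / n + var_b1 * x_bar ** 2

# Part (v): sigma^2 * (average of x_i^2) / SST_x.
var_b0_v = sigma2 * (sum(v ** 2 for v in x) / n) / sst_x

print(f"sum of weights = {sum_w:.2e}")
print(f"Var(b0), part (iv) form = {var_b0_iv:.6f}")
print(f"Var(b0), part (v) form  = {var_b0_v:.6f}")
```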
2.11 (i) We would want to randomly assign the number of hours in the preparation course so that
hours is independent of other factors that affect performance on the SAT. Then, we would
collect information on SAT score for each student in the experiment, yielding a data set
\(\{(sat_i, hours_i) : i = 1, \ldots, n\}\), where n is the number of students we can afford to have in the study.
From equation (2.7), we should try to get as much variation in \(hours_i\) as is feasible.
(ii) Here are three factors: innate ability, family income, and general health on the day of the
exam. If we think students with higher native intelligence think they do not need to prepare for
the SAT, then ability and hours will be negatively correlated. Family income would probably be
positively correlated with hours, because higher income families can more easily afford
preparation courses. Ruling out chronic health problems, health on the day of the exam should
be roughly uncorrelated with hours spent in a preparation course.
(iii) If preparation courses are effective, \(\beta_1\) should be positive; other factors equal, an
increase in hours should increase sat.
(iv) The intercept, \(\beta_0\), has a useful interpretation in this example: because \(E(u) = 0\), \(\beta_0\) is the
average SAT score for students in the population with hours = 0.
2.12 (i) I will show the result without using calculus. Let \(\bar{y}\) be the sample average of the \(y_i\) and
write
\[ \sum_{i=1}^{n} (y_i - b_0)^2 = \sum_{i=1}^{n} [(y_i - \bar{y}) + (\bar{y} - b_0)]^2 \]
\[ = \sum_{i=1}^{n} (y_i - \bar{y})^2 + 2 \sum_{i=1}^{n} (y_i - \bar{y})(\bar{y} - b_0) + \sum_{i=1}^{n} (\bar{y} - b_0)^2 \]
\[ = \sum_{i=1}^{n} (y_i - \bar{y})^2 + 2(\bar{y} - b_0) \sum_{i=1}^{n} (y_i - \bar{y}) + n(\bar{y} - b_0)^2 \]
\[ = \sum_{i=1}^{n} (y_i - \bar{y})^2 + n(\bar{y} - b_0)^2, \]
where we use the fact (see Appendix A) that \(\sum_{i=1}^{n} (y_i - \bar{y}) = 0\) always. The first term does not
depend on \(b_0\), and the second term, \(n(\bar{y} - b_0)^2\), which is nonnegative, is clearly minimized when
\(b_0 = \bar{y}\).
(ii) If we define \(\tilde{u}_i = y_i - \bar{y}\), then \(\sum_{i=1}^{n} \tilde{u}_i = \sum_{i=1}^{n} (y_i - \bar{y})\), and we already used the fact that this
sum is zero in the proof in part (i).
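The decomposition in Problem 2.12 can be checked on a small example: for any candidate \(b_0\), the sum of squared deviations equals the sum around \(\bar{y}\) plus \(n(\bar{y} - b_0)^2\), so a grid search lands on the sample average. The data and the grid below are hypothetical.

```python
import math

y = [2.0, 5.0, 3.0, 8.0, 7.0]   # HYPOTHETICAL sample
n = len(y)
y_bar = sum(y) / n

def ssr(b0):
    """Sum of squared deviations of the y_i from the constant b0."""
    return sum((v - b0) ** 2 for v in y)

def decomposed(b0):
    """Right-hand side of the identity derived in part (i)."""
    return ssr(y_bar) + n * (y_bar - b0) ** 2

# Grid of candidate constants centered on the sample average.
candidates = [y_bar + step / 10 for step in range(-20, 21)]
best = min(candidates, key=ssr)

# Part (ii): the deviations from the sample average sum to zero.
sum_dev = sum(v - y_bar for v in y)

print(f"y_bar = {y_bar}, minimizer on the grid = {best}")
print(f"sum of deviations = {sum_dev}")
```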
SOLUTIONS TO COMPUTER EXERCISES
C2.1 (i) The average prate is about 87.36, and the average mrate is about .732.
(ii) The estimated equation is
\[ \widehat{prate} = 83.08 + 5.86\, mrate \]
\[ n = 1{,}534, \quad R^2 = .075. \]
(iii) The intercept implies that, even if mrate = 0, the predicted participation rate is 83.08
percent. The coefficient on mrate implies that a one-dollar increase in the match rate (a fairly
large increase) is estimated to increase prate by 5.86 percentage points. This assumes, of
course, that this change in prate is possible (if, say, prate is already at 98, this interpretation makes
no sense).
(iv) If we plug mrate = 3.5 into the equation, we get \(\widehat{prate} = 83.08 + 5.86(3.5) = 103.59\).
This is impossible, as we can have at most a 100 percent participation rate. This illustrates that,
especially when dependent variables are bounded, a simple regression model can give strange
predictions for extreme values of the independent variable. (In the sample of 1,534 firms, only
34 have mrate \(\geq\) 3.5.)
(v) mrate explains about 7.5% of the variation in prate. This is not much and suggests that
many other factors influence 401(k) plan participation rates.
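The out-of-range prediction in part (iv) of C2.1 follows directly from the fitted line. The sketch below uses only the reported coefficients; the function name and the break-even calculation are additions for illustration.

```python
def predicted_prate(mrate):
    """Predicted participation rate (percent) from the estimated equation."""
    return 83.08 + 5.86 * mrate

pred_at_zero = predicted_prate(0.0)
pred_at_3_5 = predicted_prate(3.5)

# Match rate at which the fitted line crosses 100 percent.
mrate_at_100 = (100 - 83.08) / 5.86

print(f"prate at mrate = 0:   {pred_at_zero:.2f}")
print(f"prate at mrate = 3.5: {pred_at_3_5:.2f}")
print(f"fitted line reaches 100 at mrate = {mrate_at_100:.2f}")
```

Any match rate above roughly 2.89 produces a predicted participation rate above 100 percent under this linear fit.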
C2.2 (i) Average salary is about 865.864, which means $865,864 because salary is in thousands
of dollars. Average ceoten is about 7.95.
(ii) There are five CEOs with ceoten = 0. The longest tenure is 37 years.
(iii) The estimated equation is
\[ \widehat{\log(salary)} = 6.51 + .0097\, ceoten \]
\[ n = 177, \quad R^2 = .013. \]
We obtain the approximate percentage change in salary given \(\Delta ceoten = 1\) by multiplying the
coefficient on ceoten by 100, 100(.0097) = .97%. Therefore, one more year as CEO is predicted
to increase salary by almost 1%.
C2.3 (i) The estimated equation is
\[ \widehat{sleep} = 3{,}586.4 - .151\, totwrk \]
\[ n = 706, \quad R^2 = .103. \]
The intercept implies that the estimated amount of sleep per week for someone who does not
work is 3,586.4 minutes, or about 59.77 hours. This comes to about 8.5 hours per night.
(ii) If someone works two more hours per week, then \(\Delta totwrk = 120\) (because totwrk is
measured in minutes), and so \(\Delta \widehat{sleep} = -.151(120) = -18.12\) minutes. This is only a few minutes
a night. If someone were to work one more hour on each of five working days, \(\Delta \widehat{sleep} = -.151(300) = -45.3\) minutes
per week, or about six and a half minutes a night.
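The unit conversions in C2.3 are simple to reproduce. This sketch uses the reported intercept and slope; spreading the weekly change evenly over seven nights is an assumption made here for the per-night figures.

```python
INTERCEPT = 3586.4   # predicted weekly sleep (minutes) when totwrk = 0
SLOPE = -0.151       # change in weekly sleep per extra minute of work

hours_per_week = INTERCEPT / 60
hours_per_night = hours_per_week / 7

def delta_sleep(delta_totwrk):
    """Predicted change in weekly sleep (minutes) for a change in totwrk."""
    return SLOPE * delta_totwrk

two_more_hours = delta_sleep(120)    # two extra hours of work per week
five_more_hours = delta_sleep(300)   # one extra hour on five working days

# ASSUMPTION: the weekly change is spread evenly over seven nights.
per_night_two = two_more_hours / 7
per_night_five = five_more_hours / 7

print(f"{hours_per_week:.2f} hours per week, {hours_per_night:.1f} per night")
print(f"120 more minutes of work: {two_more_hours:.2f} min ({per_night_two:.1f} per night)")
print(f"300 more minutes of work: {five_more_hours:.2f} min ({per_night_five:.1f} per night)")
```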
C2.4 (i) Average salary is about $957.95, and average IQ is about 101.28. The sample standard
deviation of IQ is about 15.05, which is pretty close to the population value of 15.
(ii) This calls for a level-level model:
\[ \widehat{wage} = 116.99 + 8.30\, IQ \]
\[ n = 935, \quad R^2 = .096. \]
An increase in IQ of 15 increases predicted monthly salary by 8.30(15) = $124.50 (in 1980
dollars). IQ score does not even explain 10% of the variation in wage.
(iii) This calls for a log-level model:
\[ \widehat{\log(wage)} = 5.89 + .0088\, IQ \]
\[ n = 935, \quad R^2 = .099. \]
If \(\Delta IQ = 15\), then \(\Delta \widehat{\log(wage)} = .0088(15) = .132\), which is the (approximate) proportionate
change in predicted wage. The percentage increase is therefore approximately 13.2.
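The 13.2 percent figure in C2.4 relies on the usual log approximation. As a hedged illustration, the sketch below compares it with the exact percentage change implied by the same coefficient, 100[exp(.132) - 1]; the exact formula is a standard property of logarithms rather than something stated in this section.

```python
import math

SLOPE_LEVEL = 8.30    # coefficient on IQ in the level-level model
SLOPE_LOG = 0.0088    # coefficient on IQ in the log-level model
delta_iq = 15

# Level-level model: change in predicted monthly wage, in dollars.
level_change = SLOPE_LEVEL * delta_iq

# Log-level model: approximate and exact percentage changes.
delta_log_wage = SLOPE_LOG * delta_iq
approx_pct = 100 * delta_log_wage
exact_pct = 100 * (math.exp(delta_log_wage) - 1)

print(f"level-level: wage rises by ${level_change:.2f}")
print(f"log-level: approx {approx_pct:.1f}%, exact {exact_pct:.1f}%")
```

The gap between the two percentages grows with the size of the change in log(wage), so the approximation is best for small changes.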
C2.5 (i) The constant elasticity model is a log-log model:
\[ \log(rd) = \beta_0 + \beta_1 \log(sales) + u, \]
where \(\beta_1\) is the elasticity of rd with respect to sales.
(ii) The estimated equation is
\[ \widehat{\log(rd)} = -4.105 + 1.076 \log(sales) \]
\[ n = 32, \quad R^2 = .910. \]
The estimated elasticity of rd with respect to sales is 1.076, which is just above one. A one
percent increase in sales is estimated to increase rd by about 1.08%.
C2.6 (i) It seems plausible that another dollar of spending has a larger effect for low-spending
schools than for high-spending schools. At low-spending schools, more money can go toward
purchasing more books and computers and hiring better-qualified teachers. At high levels of
spending, we would expect little, if any, effect because the high-spending schools already have
high-quality teachers, nice facilities, plenty of books, and so on.
(ii) If we take changes as usual, we obtain
\[ \Delta math10 = \beta_1 \Delta \log(expend) \approx (\beta_1 / 100)(\%\Delta expend), \]
just as in the second row of Table 2.3. So, if \(\%\Delta expend = 10\), \(\Delta math10 = \beta_1 / 10\).
(iii) The regression results are
\[ \widehat{math10} = -69.34 + 11.16 \log(expend) \]
\[ n = 408, \quad R^2 = .0297. \]
(iv) If expend increases by 10 percent, \(\widehat{math10}\) increases by about 1.1 percentage points.
This is not a huge effect, but it is not trivial for low-spending schools, where a 10 percent
increase in spending might be a fairly small dollar amount.
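The 1.1 percentage point figure in C2.6 uses the approximation from part (ii). The sketch below compares it with the change obtained by plugging a 10 percent increase directly into the log; the direct calculation is an addition for illustration.

```python
import math

BETA1 = 11.16        # estimated coefficient on log(expend)
pct_increase = 10    # percentage increase in expend

# Approximation from part (ii): (beta1 / 100) * (% change in expend).
approx_change = (BETA1 / 100) * pct_increase

# Direct calculation: beta1 * [log(1.10 * expend) - log(expend)].
exact_change = BETA1 * math.log(1 + pct_increase / 100)

print(f"approximate change in math10: {approx_change:.3f} points")
print(f"direct change in math10:      {exact_change:.3f} points")
```

Both versions round to about 1.1 percentage points, so the approximation is adequate for a 10 percent change.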
(v) In this data set, the largest value of math10 is 66.7, which is not especially close to 100.
In fact, the largest fitted value is only about 30.2.
C2.7 (i) The average gift is about 7.44 Dutch guilders. Out of 4,268 respondents, 2,561 did not
give a gift, or about 60 percent.
(ii) The average mailings per year is about 2.05. The minimum value is .25 (which
presumably means that someone has been on the mailing list for at least four years), and the
maximum value is 3.5.
(iii) The estimated equation is
\[ \widehat{gift} = 2.01 + 2.65\, mailsyear \]
\[ n = 4{,}268, \quad R^2 = .0138. \]
(iv) The slope coefficient from part (iii) means that each mailing per year is associated with
(perhaps even "causes") an estimated 2.65 additional guilders, on average. Therefore, if each
mailing costs one guilder, the expected profit from each mailing is estimated to be 1.65 guilders.
This is only the average, however. Some mailings generate no contributions, or a contribution
less than the mailing cost; other mailings generate much more than the mailing cost.
(v) Because the smallest mailsyear in the sample is .25, the smallest predicted value of gift
is 2.01 + 2.65(.25) \(\approx\) 2.67. Even if we look at the overall population, where some people have
received no mailings, the smallest predicted value is about two. So, with this estimated equation,
we never predict zero charitable gifts.
C2.8 There is no "correct" answer to this question because all answers depend on how the
random outcomes are generated. I used Stata 11 and, before generating the outcomes on the \(x_i\), I
set the seed to the value 123. I reset the seed to 123 to generate the outcomes on the \(u_i\).
Specifically, to answer parts (i) through (v), I used the sequence of commands
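* Create an empty data set with 500 observations
set obs 500
* Draw x as 10 times a standard uniform draw
set seed 123
gen x = 10*runiform()
sum x
* Reset the seed, then draw u as 6 times a standard normal draw
set seed 123
gen u = 6*rnormal()
sum u
* Generate y from the population model with intercept 1 and slope 2
gen y = 1 + 2*x + u
reg y x
* Save the OLS residuals and summarize them along with x*residual
predict uh, resid
gen x_uh = x*uh
sum uh x_uh
* Repeat the summary using the actual errors instead of the residuals
gen x_u = x*u
sum u x_u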
(i) The sample mean of the \(x_i\) is about 4.912 with a sample standard deviation of about 2.874.
(ii) The sample average of the \(u_i\) is about .221, which is pretty far from zero. We do not get
zero because this is just a sample of 500 from a population with a zero mean. The current sample
is "unlucky" in the sense that the sample average is far from the population average. The sample
standard deviation is about 5.768, which is nontrivially below 6, the population value.
(iii) After generating the data on \(y_i\) and running the regression, I get, rounding to three
decimal places,
\(\hat{\beta}_0 = 1.862\) and \(\hat{\beta}_1 = 1.870\).
The population values are 1 and 2, respectively. Thus, the estimated intercept based on this
sample of data is well above the population value. The estimated slope is somewhat below the
population value, 2. When we sample from a population our estimates contain sampling error;
that is why the estimates differ from the population values.
(iv) When I use the command sum uh x_uh and multiply by 500, I get, using scientific
notation, sums equal to 4.181e-06 and .00003776, respectively. These are zero for practical
purposes and differ from zero only because of rounding error in machine arithmetic (which
is unimportant).
(v) We already computed the sample average of the \(u_i\) in part (ii). When we multiply the sample
average by 500, the resulting sum is about 110.74. The sum of \(x_i u_i\) is about 6.46. Neither is close to zero, and
nothing says they should be particularly close.
(vi) For this part I set the seed to 789. The sample average and standard deviation of the \(x_i\)
are about 5.030 and 2.913; those for the \(u_i\) are about -.077 and 5.979. When I generate the \(y_i\)
and run the regression, I get
\(\hat{\beta}_0 = .701\) and \(\hat{\beta}_1 = 2.044\).
These are different from those in part (iii) because they are obtained from a different random
sample. Here, for both the intercept and slope, we get estimates that are much closer to the
population values. Of course, in practice we would never know that.
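The same experiment can be sketched in Python. This is not a port of the Stata run: Python's random number generator differs from Stata's, and a single seeded stream is used here instead of resetting the seed before drawing the errors, so the numbers will not match those reported above. What does carry over is the algebra: the OLS residuals sum to zero and are orthogonal to x, while the true errors generally are not.

```python
import random

def simulate(seed, n=500):
    """Draw x ~ Uniform(0, 10), u ~ Normal(0, 36), and y = 1 + 2x + u."""
    rng = random.Random(seed)
    x = [10 * rng.random() for _ in range(n)]
    u = [rng.gauss(0.0, 6.0) for _ in range(n)]
    y = [1 + 2 * xi + ui for xi, ui in zip(x, u)]
    return x, u, y

def ols(x, y):
    """Return (intercept, slope) from a simple regression of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    slope = s_xy / s_xx
    return y_bar - slope * x_bar, slope

x, u, y = simulate(seed=123)
b0, b1 = ols(x, y)
uhat = [yi - b0 - b1 * xi for xi, yi in zip(x, y)]

# First order conditions hold for the residuals (up to rounding) ...
sum_uhat = sum(uhat)
sum_x_uhat = sum(xi * ri for xi, ri in zip(x, uhat))

# ... but nothing forces them to hold for the unobserved errors.
sum_u = sum(u)
sum_x_u = sum(xi * ui for xi, ui in zip(x, u))

print(f"b0 = {b0:.3f}, b1 = {b1:.3f}")
print(f"sum(uhat) = {sum_uhat:.2e}, sum(x*uhat) = {sum_x_uhat:.2e}")
print(f"sum(u) = {sum_u:.2f}, sum(x*u) = {sum_x_u:.2f}")
```

Calling simulate with a different seed, for example 789, plays the role of part (vi): a new sample gives different estimates of the same population values.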
C2.9 (i) In 1996, 1,051 counties had zero murders. Out of 2,197 counties, 31 counties had at
least one execution and the largest number of executions is 3.
(ii) The estimated equation is
\[ \widehat{murders} = 5.46 + 58.56\, execs \]
\[ n = 2197, \quad R^2 = 0.0439. \]
(iii) The slope coefficient on execs implies that if the number of executions increases by
one, the estimated number of murders increases by about 59, which is a large amount. No, the
estimated equation does not suggest a deterrent effect of capital punishment.
(iv) The smallest number of murders that can be predicted by the equation is 5.46, that is,
about 5 murders. The residual for a county with zero executions and zero murders is
-5.46.
(v) This simple linear regression equation predicts that if the number of executions
increases by one, the estimated number of murders increases by about 59,
which, taken at face value, means capital punishment does not have a deterrent effect on murders;
that is, capital punishment is not discouraging people from committing murders. The sign and
magnitude of the estimate +58.56 make us suspect that the error term u and the
independent variable execs are correlated. Therefore, the regression model is not
well suited for determining whether capital punishment deters murders.
C2.10 (i) The number of students in the sample is 7,430. The mean of math12 is about 52.13
and mean of read12 is about 51.77. The standard deviations of math12 and read12 are about
9.46 and 9.41, respectively.
(ii) The estimated equation is
\[ \widehat{math12} = 15.153 + 0.714\, read12 \]
\[ n = 7430, \quad R^2 = 0.5047. \]
(iii) No. The intercept does not have a meaningful interpretation because the reading test
score is not close to zero for the population of interest.
(iv) No. \(R^2 = 0.5047\) implies that about 50.47% of the variation in math score is explained by
reading score.
(v) If we run the regression of read12 on math12, we obtain results very similar to those from
the regression of math12 on read12. So it is better to hire more reading and math
tutors.
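The similarity noted in part (v) can be anticipated from the reported statistics alone. In simple regression the slope equals the sample correlation times the ratio of standard deviations, and the product of the two slopes (math12 on read12, and read12 on math12) equals R-squared. The sketch below applies these standard identities to the rounded numbers reported above, so the implied reverse slope is only approximate and is not a result taken from the text.

```python
import math

R_SQUARED = 0.5047
SLOPE_MATH_ON_READ = 0.714
SD_MATH, SD_READ = 9.46, 9.41

# The correlation is positive because the estimated slope is positive.
corr = math.sqrt(R_SQUARED)

# Slope implied by the correlation and the standard deviations.
implied_slope = corr * SD_MATH / SD_READ

# Reverse regression: slope of read12 on math12 implied by R-squared.
slope_read_on_math = R_SQUARED / SLOPE_MATH_ON_READ

print(f"correlation = {corr:.4f}")
print(f"implied slope, math12 on read12 = {implied_slope:.3f}")
print(f"implied slope, read12 on math12 = {slope_read_on_math:.3f}")
```

Because the two standard deviations are nearly equal, both slopes are close to the correlation itself, which is why the two regressions look so alike.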