Statistics for Business and Economics, 13th Edition Solution Manual
Chapter 2
Methods for Describing Sets of Data
2.1
First, we find the frequency of grade A. The sum of the frequencies for all five grades must be 200.
Therefore, subtract the sum of the frequencies of the other four grades from 200. The frequency for grade
A is:
200 - (36 + 90 + 30 + 28) = 200 - 184 = 16
To find the relative frequency for each grade, divide the frequency by the total sample size, 200. The
relative frequency for grade B is 36/200 = .18. The rest of the relative frequencies are found in a
similar manner and appear in the table:
Grade on Statistics Exam    Frequency    Relative Frequency
A: 90-100                       16              .08
B: 80-89                        36              .18
C: 65-79                        90              .45
D: 50-64                        30              .15
F: Below 50                     28              .14
Total                          200             1.00
2.2
a.
To find the frequency for each class, count the number of times each letter occurs. The frequencies
for the three classes are:
Class    Frequency
X            8
Y            9
Z            3
Total       20
b.
The relative frequency for each class is found by dividing the frequency by the total sample size. The
relative frequency for the class X is 8/20 = .40. The relative frequency for the class Y is 9/20 = .45.
The relative frequency for the class Z is 3/20 = .15.
Class    Frequency    Relative Frequency
X            8              .40
Y            9              .45
Z            3              .15
Total       20             1.00
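The counting and dividing steps in parts a and b can be sketched in Python. This is only an illustration: the helper name `relative_frequencies` is an assumption, and the raw data list is rebuilt from the frequencies above, since the order of the original 20 observations is not given in this solution.

```python
from collections import Counter


def relative_frequencies(observations):
    """Return {category: (frequency, relative frequency)} for qualitative data."""
    counts = Counter(observations)
    n = sum(counts.values())
    return {category: (freq, freq / n) for category, freq in counts.items()}


# Hypothetical raw data rebuilt from the frequencies above (8 X's, 9 Y's, 3 Z's);
# the actual ordering of the observations is not part of this solution.
data = ["X"] * 8 + ["Y"] * 9 + ["Z"] * 3

for category, (freq, rel) in sorted(relative_frequencies(data).items()):
    print(f"{category}: frequency = {freq}, relative frequency = {rel:.2f}")
```

The relative frequencies always sum to 1, which gives a quick check on the table.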
Copyright © 2018 Pearson Education, Inc.
c.
The frequency bar chart is:
[Bar chart: Frequency (vertical axis, 0 to 9) by Class (X, Y, Z)]
d.
The pie chart for the frequency distribution is:
[Pie chart of Class: X 40.0%, Y 45.0%, Z 15.0%]
2.3
a.
p_U = 107/174 = .615
b.
p_S = 57/174 = .328
c.
p_R = 10/174 = .057
d.
.615(360) = 221.4, .328(360) = 118.1, .057(360) = 20.5
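The slice angles in part d follow from multiplying each proportion by 360 degrees. A minimal Python sketch of that arithmetic is shown here; the function name `pie_angles` is an assumption, and the proportions are rounded to three decimals before multiplying, mirroring the solution.

```python
def pie_angles(counts):
    """Map each category to (proportion, slice angle in degrees)."""
    total = sum(counts.values())
    slices = {}
    for category, count in counts.items():
        proportion = round(count / total, 3)  # round first, as the solution does
        slices[category] = (proportion, proportion * 360)
    return slices


# Counts taken from parts a through c (107 + 57 + 10 = 174 participants)
location_counts = {"Urban": 107, "Suburban": 57, "Rural": 10}

for category, (p, angle) in pie_angles(location_counts).items():
    print(f"{category}: proportion = {p:.3f}, angle = {angle:.1f} degrees")
```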
e.
Using MINITAB, the pie chart is:
[Pie chart of Location: Urban 61.5%, Suburban 32.8%, Rural 5.7%]
f.
61.5% of the STEM participants are from urban areas, 32.8% are from suburban areas, and 5.7% are
from rural areas.
g.
Using MINITAB, the bar chart is:
[Bar chart: Percent (vertical axis, 0 to 70) by Location (Urban, Suburban, Rural); percent is calculated within all data]
Both charts give the same information.
2.4
a.
According to the pie chart, .760 of the sample currently have a cable/satellite TV subscription at
home. The total number of adults sampled is 1,521 + 180 + 300 = 2,001, of whom 1,521 have a
cable/satellite TV subscription at home. The proportion is 1,521/2,001 = .760.
b.
Using MINITAB, the pie chart is:
[Pie chart of Subscribe: Cable TV 83.5%, Cord cutter 16.5%]
2.5
a.
The type of graph is a bar graph.
b.
The variable measured for each of the robots is type of robotic limbs.
c.
From the graph, the design used the most is the “legs only” design.
d.
The relative frequencies are computed by dividing the frequencies by the total sample size. The total
sample size is n = 106. The relative frequencies for each of the categories are:
Type of Limbs    Frequency    Relative Frequency
None                 15       15/106 = .142
Both                  8        8/106 = .075
Legs ONLY            63       63/106 = .594
Wheels ONLY          20       20/106 = .189
Total               106                1.000
e.
Using MINITAB, the Pareto diagram is:
[Pareto diagram: Relative Frequency (vertical axis, 0 to .60) by Type (Legs, Wheels, None, Both); percent within all data]
2.6
a.
Region is qualitative because it is not measured using numbers.
b.
p_A-P = 48/150 = .32, p_C = 10/150 = .07, p_E = 34/150 = .23, p_LA = 29/150 = .19,
p_ME/A = 3/150 = .02, p_US = 26/150 = .17
c.
Using MINITAB, the plot is:
[Bar chart: Percent (vertical axis, 0 to 35) by Region (Asia-Pacific, Canada, Europe, Latin America, Middle East/Africa, United States); percent is calculated within all data]
d.
The regions that most of the top 150 credit card users serve are Asia-Pacific, Europe, Latin America,
and the United States.
2.7
a.
Using MINITAB, the pie chart is:
[Pie chart of Product: Windows 64.0%, Office 24.0%, Explorer 12.0%]
Explorer had the lowest proportion of security issues with the proportion 6/50 = .12.
b.
Using MINITAB, the Pareto chart is:
[Pareto chart: Percent (vertical axis, 0 to 50) by Bulletins; percent is calculated within all data]
The security bulletin with the highest frequency is Remote code execution. Microsoft should focus
on this repercussion.
2.8
a.
Using MINITAB, the Pareto chart is:
[Pareto chart: Percent (vertical axis, 0 to 40) by Network/Channel (WLAN/Single, WLAN/Multi, WSN/Single, WSN/Multi, AHN/Single, AHN/Multi); percent is calculated within all data]
The network type and number of channels that suffered the most jamming attacks is
WLAN/Single. The network/number of channels types that received the next most
jamming attacks are WSN/Single and WLAN/Multi. The network/number of channels type that
suffered the fewest jamming attacks is AHN/Multi.
b.
Using MINITAB, the pie chart is:
[Pie chart of Network: WLAN 56.3%, WSN 27.5%, AHN 16.3%]
The network type that suffered the most jamming attacks is WLAN with more than half. The
network type that suffered the least number of jamming attacks is AHN.
2.9
Using MINITAB, the pie chart is:
[Pie chart of Degree: First 52.7%, None 36.9%, Post 10.4%]
A little over half of the successful candidates had a First (Bachelor’s) degree, while a little more than a third
of the successful candidates had no degree. Only about 10% of the successful candidates had graduate
degrees.
2.10
Using MINITAB, the bar graphs of the 2 waves are:
[Chart of Job Status: Percent (vertical axis, 0 to 90) by Job Status (WorkMBA, WorkNoMBA, NoWorkBusSch, NoWorkGradSch), paneled by Wave (1, 2); percent within all data]
In wave 1, most of those taking the GMAT were working (2657/3244 = .819) and none had MBAs. About
20% were not working but were in either a 4-year institution or other graduate school
((36 + 551)/3244 = .181). In wave 2, almost all were now working ((1787 + 1372)/3244 = .974). Of those
working, more than half had MBAs (1787/(1787 + 1372) = .566). Of those not working, most were in
another graduate school.
2.11
Using MINITAB, the Pareto diagram for the data is:
[Chart of Tenants: Percent (vertical axis, 0 to 50) by Tenants (Small, SmallStandard, Large, Major, Anchor); percent within all data]
Most of the tenants in UK shopping malls are small or small standard. They account for approximately
84% of all tenants ((711 + 819)/1,821 = .84). Very few (less than 1%) of the tenants are anchors.
2.12
Using MINITAB, the side-by-side bar graphs are:
[Chart of Acquisitions: Percent (vertical axis, 0 to 100) by Acquisitions (No, Yes), paneled by Year (1980, 1990, 2000); percent within all data]
In 1980, very few firms had acquisitions (18/1,963 = .009). By 1990, the proportion of firms having
acquisitions increased to 350/2,197 = .159. By 2000, the proportion of firms having acquisitions increased
to 748/2,778 = .269.
2.13
a.
Using MINITAB, the pie chart of the data is:
[Pie chart of City: NY 33.0%, CH 25.9%, SF 25.0%, LA 16.1%]
b.
Using MINITAB, the pie chart for San Francisco is:
[Pie chart of Rating, City = SF: Good 68.1%, Bad 21.7%, Excellent 10.1%]
c.
Using MINITAB, the bar charts are:
[Chart of Tweets: Percent of Tweets (vertical axis, 0 to 60) by Rating (Excellent, Good, Bad), paneled by City (CH, LA, NY, SF); percent is calculated within all data]
d.
In all cities, most customers rated the iPhone 6 as “good”, while very few rated the iPhone 6 as
excellent.
2.14
Using MINITAB, a pie chart of the data is:
[Pie chart of Measure: Total visitors 26.7%, Funds Raised 23.3%, Big Shows 20.0%, Paying visitors 16.7%, Members 13.3%]
Since the sizes of the slices are close to each other, it appears that the researcher is correct. There is a large
amount of variation within the museum community with regard to performance measurement and
evaluation.
2.15
a.
The variable measured by Performark is the length of time it took for each advertiser to respond back.
b.
The pie chart is:
[Pie chart of Response Time: 60-120 days 37.8%, 13-59 days 25.6%, Never responded 23.3%, >120 days 13.3%]
c.
Twenty-one percent, or .21 × 17,000 = 3,570, of the advertisers never respond to the sales lead.
d.
The information from the pie chart does not indicate how effective the “bingo cards” are. It just
indicates how long it takes advertisers to respond, if at all.
2.16
Using MINITAB, the side-by-side bar graphs are:
[Chart of Dive: Percent (vertical axis, 0 to 80) by Dive (Left, Middle, Right), paneled by Situation (Ahead, Behind, Tied); percent within all data]
From the graphs, it appears that if the team is either tied or ahead, the goal-keepers tend to dive either right
or left with equal probability, with very few diving in the middle. However, if the team is behind, then the
majority of goal-keepers tend to dive right (71%).
2.17
a.
Using MINITAB, bar charts for the 3 variables are:
[Chart of Well Class: Count (vertical axis, 0 to 120) by Well Class (Private, Public)]
[Chart of Aquifer: Count (vertical axis, 0 to 200) by Aquifer (Bedrock, Unconsolidated)]
[Chart of Detection: Count (vertical axis, 0 to 160) by Detection (Below Limit, Detect)]
b.
Using MINITAB, the side-by-side bar chart is:
[Chart of Detection: Percent (vertical axis, 0 to 80) by Detection (Below Limit, Detect), paneled by Well Class (Private, Public); percent within all data]
c.
Using MINITAB, the side-by-side bar chart is:
[Chart of Detection: Percent (vertical axis, 0 to 70) by Detection (Below Limit, Detect), paneled by Aquifer (Bedrock, Unconsolidated); percent within all data]
d.
From the bar charts in parts a-c, one can infer that most aquifers are bedrock and most levels of
MTBE were below the limit (≈ 2/3). Also, the percentages of public wells versus private wells are
relatively close. Approximately 80% of private wells are not contaminated, while only about 60% of
public wells are not contaminated. The percentage of contaminated wells is about the same for both
types of aquifers (≈ 30%).
2.18
Using MINITAB, the relative frequency histogram is:
[Relative frequency histogram: Relative Frequency (vertical axis, 0 to .25) by Class (class boundaries .5, 2.5, 4.5, 6.5, 8.5, 10.5, 12.5, 14.5, 16.5)]
2.19
To find the number of measurements for each measurement class, multiply the relative frequency by the
total number of observations, n = 500. The frequency table is:
Measurement Class    Relative Frequency    Frequency
.5 - 2.5                   .10             500(.10) = 50
2.5 - 4.5                  .15             500(.15) = 75
4.5 - 6.5                  .25             500(.25) = 125
6.5 - 8.5                  .20             500(.20) = 100
8.5 - 10.5                 .05             500(.05) = 25
10.5 - 12.5                .10             500(.10) = 50
12.5 - 14.5                .10             500(.10) = 50
14.5 - 16.5                .05             500(.05) = 25
Total                     1.00             500
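The conversion from relative frequencies to frequencies can be sketched in Python as a quick check. The variable names are assumptions made for illustration; the class boundaries and relative frequencies are the ones given in the table.

```python
n = 500  # total number of observations

# (lower boundary, upper boundary, relative frequency) for each measurement class
classes = [
    (0.5, 2.5, 0.10), (2.5, 4.5, 0.15), (4.5, 6.5, 0.25), (6.5, 8.5, 0.20),
    (8.5, 10.5, 0.05), (10.5, 12.5, 0.10), (12.5, 14.5, 0.10), (14.5, 16.5, 0.05),
]

# round() guards against floating-point error in products such as 500 * 0.15
frequencies = [round(n * rel) for _, _, rel in classes]

for (low, high, rel), freq in zip(classes, frequencies):
    print(f"{low:>4} - {high:<4}  {rel:.2f}  {freq:>3}")
print("Total frequency:", sum(frequencies))
```

The frequencies must add back up to n, which is a useful check that no class was dropped.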
Using MINITAB, the frequency histogram is:
[Frequency histogram: Frequency (vertical axis, 0 to 140) by Class (class boundaries .5, 2.5, 4.5, 6.5, 8.5, 10.5, 12.5, 14.5, 16.5)]
2.20
a.
The original data set has 1 + 3 + 5 + 7 + 4 + 3 = 23 observations.
b.
For the bottom row of the stem-and-leaf display:
The stem is 0.
The leaves are 0, 1, 2.
Assuming that the data are up to two digits, rounded off to the nearest whole number, the
numbers in the original data set are 0, 1, and 2.
c.
Again, assuming that the data are up to two digits, rounded off to the nearest whole number, the dot
plot corresponding to all the data points is:
[Dot plot of the 23 observations]
2.21
a.
This is a frequency histogram because the number of observations is graphed for each interval rather
than the relative frequency.
b.
There are 14 measurement classes.
c.
There are 49 measurements in the data set.
2.22
a.
The graph is a frequency histogram.
b.
The quantitative variable summarized in the graph is the fup/fumic ratio.
c.
The proportion of ratios greater than 1 is (8 + 5 + 1)/416 = 14/416 = .034.
d.
The proportion of ratios less than .4 is (181 + 108)/416 = 289/416 = .695.
2.23
a.
Since the label on the vertical axis is Percent, this is a relative frequency histogram. We can divide
the percents by 100% to get the relative frequencies.
b.
Summing the percents represented by all of the bars above 100, we get approximately 12%.
2.24
a.
Using MINITAB, the stem-and-leaf display and histogram are:
Stem-and-Leaf Display: SCORE
Stem-and-leaf of SCORE  N = 195
Leaf Unit = 1.0
   1    6  9
   1    7
   2    7  3
   2    7
   3    7  6
   4    7  8
   4    8
   4    8
   6    8  44
  16    8  6666677777
  26    8  8888899999
  40    9  00001111111111
  57    9  22222222223333333
  88    9  4444444444444555555555555555555
 (47)   9  66666666666666666666777777777777777777777777777
  60    9  8888888888999999999999999999999999
  26   10  00000000000000000000000000
[Histogram of SCORE: Frequency (vertical axis, 0 to 50) by SCORE (72 to 100)]
b.
From the stem-and-leaf display, there are only 6 observations with sanitation scores less than 86. The
proportion of ships with accepted sanitation standards is (195 - 6)/195 = 189/195 = .97.
c.
The score of 69 is highlighted in the stem-and-leaf display.
2.25
a.
Using MINITAB, a dot plot of the data is:
[Dotplot of Acquisitions: Acquisitions (horizontal axis, 0 to 840)]
b.
By looking at the dot plot, one can conclude that the years 1996-2000 had the highest number of
firms with at least one acquisition. The lowest number of acquisitions in that time frame (748) is
almost 100 higher than the highest value from the remaining years.
2.26
a.
Using MINITAB, a histogram of the current values of the 32 NFL teams is:
[Histogram of VALUE ($mil): Frequency (vertical axis, 0 to 16) by VALUE ($mil) (1800 to 3600)]
b.
Using MINITAB, a histogram of the 1-year change in current value for the 32 NFL teams is:
[Histogram of CHANGE (%): Frequency (vertical axis, 0 to 7) by CHANGE (%) (20 to 70)]
c.
Using MINITAB, a histogram of the debt-to-value ratios for the 32 NFL teams is:
[Histogram of DEBT/VALUE (%): Frequency (vertical axis, 0 to 14) by DEBT/VALUE (%) (10 to 50)]
d.
Using MINITAB, a histogram of the annual revenues for the 32 NFL teams is:
[Histogram of REVENUE ($mil): Frequency (vertical axis, 0 to 18) by REVENUE ($mil) (300 to 600)]
e.
Using MINITAB, a histogram of the operating incomes for the 32 NFL teams is:
[Histogram of INCOME ($mil): Frequency (vertical axis, 0 to 12) by INCOME ($mil) (60 to 240)]
f.
For all of the histograms, there is 1 team that has a very high score. The Dallas Cowboys have the
largest values for current value, annual revenues, and operating income. However, the San Francisco
49ers have the highest 1-year change, while the Atlanta Falcons have the highest debt-to-value ratio.
All of the graphs except the one showing the 1-Yr Value Changes are skewed to the right.
2.27
a.
Using MINITAB, the frequency histograms for 2014 and 2010 SAT mathematics scores are:
[Histogram of MATH2014, MATH2010: Frequency (vertical axis, 0 to 14) by score (440 to 600), one panel per year]
It appears that the scores have not changed very much at all. The graphs are very similar.
b.
Using MINITAB, the frequency histogram of the differences is:
[Histogram of Diff Math: Frequency (vertical axis, 0 to 30) by Diff Math (-90 to 30)]
From this graph of the differences, we can see that there are more observations to the right of 0 than
to the left of 0. This indicates that, in general, the scores have improved since 2010.
c.
From the graph, the largest improvement score is between 22.5 and 37.5. The actual largest score is
34 and it is associated with Wyoming.
2.28
Using MINITAB, the two dot plots are:
[Dotplot of Arrive, Depart: two dot plots on a common Data axis (108 to 168)]
Yes. Most of the numbers of items arriving at the work center per hour are in the 135 to 165 area. Most of
the numbers of items departing the work center per hour are in the 110 to 140 area. Because the number of
items arriving is larger than the number of items departing, there will probably be some sort of bottleneck.
2.29
Using MINITAB, the stem-and-leaf display is:
Stem-and-Leaf Display: Dioxide
Stem-and-leaf of Dioxide  N = 16
Leaf Unit = 0.10
   5    0  12234
   7    0  55
  (2)   1  34
   7    1
   7    2  44
   5    2
   5    3  3
   4    3
   4    4  0000
The highlighted values are values that correspond to water specimens that contain oil. There is a tendency
for crude oil to be present in water with lower levels of dioxide, as 6 of the 8 specimens with the
lowest levels of dioxide contain oil.
2.30
a.
Using MINITAB, the histogram of the number of deaths is:
[Histogram of Deaths: Frequency (vertical axis, 0 to 12) by Deaths (0 to 1000)]
b.
The interval containing the largest proportion of estimates is 0-50. Almost half of the estimates fall
in this interval.
2.31
Yes, we would agree with the statement that honey may be the preferable treatment for the cough and sleep
difficulty associated with childhood upper respiratory tract infection. For those receiving the honey
dosage, 14 of the 35 children (or 40%) had improvement scores of 12 or higher. For those receiving the
DM dosage, only 8 of the 33 children (or 24%) had improvement scores of 12 or higher. For those
receiving no dosage, only 2 of the 37 children (or 5%) had improvement scores of 12 or higher. In
addition, the median improvement score for those receiving the honey dosage was 11, the median for those
receiving the DM dosage was 9 and the median for those receiving no dosage was 7.
2.32
a.
Using MINITAB, the stem-and-leaf display is as follows, where the stems are the units place and the
leaves are the decimal places:
Stem-and-Leaf Display: Time
Stem-and-leaf of Time  N = 49
Leaf Unit = 0.10
 (26)   1  00001122222344444445555679
  23    2  11446799
  15    3  002899
   9    4  11125
   4    5  24
   2    6
   2    7  8
   1    8
   1    9
   1   10  1
b.
A little more than half (26/49 = .53) of all companies spent less than 2 months in bankruptcy. Only
two of the 49 companies spent more than 6 months in bankruptcy. It appears that, in general, the
length of time in bankruptcy for firms using “prepacks” is less than that of firms not using prepacks.
c.
A dot diagram will be used to compare the time in bankruptcy for the three types of “prepack” firms:
[Dotplot of Time vs Votes: Time (horizontal axis, 1.2 to 9.6) by Votes (Joint, None, Prepack)]
d.
The highlighted times in part a correspond to companies that were reorganized through a leveraged
buyout. There does not appear to be any pattern to these points. They appear to be scattered about
evenly throughout the distribution of all times.
2.33
Using MINITAB, the histogram of the data is:
[Histogram of INTTIME: Frequency (vertical axis, 0 to 60) by INTTIME (0 to 525)]
This histogram looks very similar to the one shown in the problem. Thus, it appears that there was
minimal or no collaboration or collusion from within the company. We could conclude that the phishing
attack against the organization was not an inside job.
2.34
Using MINITAB, the stem-and-leaf display for the data is:
Stem-and-Leaf Display: Time
Stem-and-leaf of Time  N = 25
Leaf Unit = 1.0
   3    3  239
   7    4  3499
  (7)   5  0011469
  11    6  34458
   6    7  13
   4    8  26
   2    9  5
   1   10  2
The numbers in bold represent delivery times associated with customers who subsequently did not place
additional orders with the firm. Since there were only 2 customers with delivery times of 68 days or longer
that placed additional orders, I would say the maximum tolerable delivery time is about 65 to 67 days.
Everyone with delivery times less than 67 days placed additional orders.
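For readers without MINITAB, a stem-and-leaf display like the one for these delivery times can be approximated in a few lines of Python. This is a sketch under stated assumptions: the function `stem_and_leaf` is hypothetical, it omits MINITAB's depth column, and the 25 values are read back from the stems and leaves of the display rather than from the original data file.

```python
from collections import defaultdict


def stem_and_leaf(values, leaf_unit=1):
    """Return display lines: stem = leading digit(s), leaf = last digit in leaf units."""
    rows = defaultdict(list)
    for value in sorted(values):
        scaled = int(round(value / leaf_unit))
        rows[scaled // 10].append(scaled % 10)
    lines = []
    for stem in range(min(rows), max(rows) + 1):  # keep stems that have no leaves
        leaves = "".join(str(leaf) for leaf in rows.get(stem, []))
        lines.append(f"{stem:>3} | {leaves}")
    return lines


# Delivery times (days) read back from the stem-and-leaf display
times = [32, 33, 39, 43, 44, 49, 49, 50, 50, 51, 51, 54, 56, 59,
         63, 64, 64, 65, 68, 71, 73, 82, 86, 95, 102]

print("\n".join(stem_and_leaf(times)))
```

Sorting first keeps the leaves in increasing order within each stem, which is what makes the display readable as a sideways histogram.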
2.35
Assume the data are a sample. The sample mean is:
x̄ = Σx/n = (3.2 + 2.5 + 2.1 + 3.7 + 2.8 + 2.0)/6 = 16.3/6 = 2.717
The median is the average of the middle two numbers when the data are arranged in order (since n = 6 is
even). The data arranged in order are: 2.0, 2.1, 2.5, 2.8, 3.2, 3.7. The middle two numbers are 2.5 and 2.8.
The median is:
(2.5 + 2.8)/2 = 5.3/2 = 2.65
2.36
a.
x̄ = Σx/n = 85/10 = 8.5
b.
x̄ = 400/16 = 25
c.
x̄ = 35/45 = .778
d.
x̄ = 242/18 = 13.44
2.37
The mean and median of a symmetric data set are equal to each other. The mean is larger than the median
when the data set is skewed to the right. The mean is less than the median when the data set is skewed to
the left. Thus, by comparing the mean and median, one can determine whether the data set is symmetric,
skewed right, or skewed left.
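The comparison described here can be written as a small Python function. This is a sketch only: the name `skew_direction` and the `tolerance` parameter are assumptions introduced for illustration, the three data sets are made up, and the rule is a rule of thumb rather than a formal test of skewness.

```python
from statistics import mean, median


def skew_direction(data, tolerance=0.0):
    """Classify a data set by comparing its mean with its median.

    `tolerance` is a hypothetical cutoff (not from the text) below which the
    two measures are treated as equal.
    """
    difference = mean(data) - median(data)
    if abs(difference) <= tolerance:
        return "symmetric"
    return "skewed right" if difference > 0 else "skewed left"


# Small made-up data sets, for illustration only
print(skew_direction([1, 2, 3, 4, 5]))       # mean 3 equals median 3
print(skew_direction([1, 2, 3, 4, 20]))      # mean 6 exceeds median 3
print(skew_direction([1, 17, 18, 19, 20]))   # mean 15 is below median 18
```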
2.38
The median is the middle number once the data have been arranged in order. If n is even, there is not a
single middle number. Thus, to compute the median, we take the average of the middle two numbers. If n
is odd, there is a single middle number. The median is this middle number.
A data set with five measurements arranged in order is 1, 3, 5, 6, 8. The median is the middle number,
which is 5.
A data set with six measurements arranged in order is 1, 3, 5, 5, 6, 8. The median is the average of the
middle two numbers, which is (5 + 5)/2 = 10/2 = 5.
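The odd and even cases of the median rule can be spelled out in Python. The function below is an illustrative hand-written version (in practice `statistics.median` from the standard library does the same job), applied to the two data sets used in this answer.

```python
def median(values):
    """Middle value of the ordered data; average of the two middle values when n is even."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty data set is undefined")
    middle = n // 2
    if n % 2 == 1:
        return ordered[middle]  # single middle number
    return (ordered[middle - 1] + ordered[middle]) / 2  # average of the middle two


print(median([1, 3, 5, 6, 8]))     # n = 5, odd
print(median([1, 3, 5, 5, 6, 8]))  # n = 6, even
```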
2.39
Assume the data are a sample. The mode is the observation that occurs most frequently. For this sample,
the mode is 15, which occurs three times.
The sample mean is:
x̄ = Σx/n = (18 + 10 + 15 + 13 + 17 + 15 + 12 + 15 + 18 + 16 + 11)/11 = 160/11 = 14.545
The median is the middle number when the data are arranged in order. The data arranged in order are: 10,
11, 12, 13, 15, 15, 15, 16, 17, 18, 18. The middle number is the 6th number, which is 15.
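All three measures for this sample can be checked with Python's standard `statistics` module. This is a minimal sketch; `multimode` is used instead of `mode` because it returns every most-frequent value, which matters for data sets with more than one mode.

```python
from statistics import mean, median, multimode

# The 11 observations from this exercise, in the order given
sample = [18, 10, 15, 13, 17, 15, 12, 15, 18, 16, 11]

print("n      =", len(sample))
print("mean   =", round(mean(sample), 3))
print("median =", median(sample))
print("mode   =", multimode(sample))  # a list, since a data set can have several modes
```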
2.40
a.
x̄ = Σx/n = (7 + ... + 4)/6 = 15/6 = 2.5
Median = (3 + 3)/2 = 3 (mean of 3rd and 4th numbers, after ordering)
Mode = 3
b.
x̄ = Σx/n = (2 + ... + 4)/13 = 40/13 = 3.08
Median = 3 (7th number, after ordering)
Mode = 3
c.
x̄ = Σx/n = (51 + ... + 37)/10 = 496/10 = 49.6
Median = (48 + 50)/2 = 49 (mean of 5th and 6th numbers, after ordering)
Mode = 50
2.41
a.
For a distribution that is skewed to the left, the mean is less than the median.
b.
For a distribution that is skewed to the right, the mean is greater than the median.
c.
For a symmetric distribution, the mean and median are equal.
2.42
a.
The average score for Energy Star is 4.44. The average score is close to 5, meaning the average score
is close to “very familiar”.
b.
The median score for Energy Star is 5. At least half of the respondents indicated that they are very
familiar with the ecolabel Energy Star.
c.
The mode score for Energy Star is 5. More respondents answered “very familiar” to Energy Star than
any other option.
d.
The ecolabel that appears to be most familiar to travelers is Energy Star.
2.43
a.
This statistic represents a population mean because it is computed for every freshman who attended
the university in 2015. The average financial aid awarded to freshmen at Harvard University is
$41,555.
b.
This statistic represents a sample median because it is computed for a sample of alumni. The median
salary during early career for alumni of Harvard University is $61,400. Half of the alumni from
Harvard make more than $61,400 during their early career.
2.44
a.
The mean is
x̄ = Σx/n = (9 + (-.1) + (-1.6) + 14.6 + 16.0 + 7.7 + 19.9 + 9.8 + 3.2 + 24.8 + 17.6 + 10.7 + 9.1)/13 = 140.7/13 = 10.82
The average annualized percentage return on investment for 13 randomly selected stock screeners is
10.82.
b.
Since the number of observations is odd, the median is the middle number once the data have been
arranged in order. The data arranged in order are:
-1.6 -.1 3.2 7.7 9.0 9.1 9.8 10.7 14.6 16.0 17.6 19.9 24.8
The middle number is 9.8 which is the median. Half of the annualized percentage returns on
investment are below 9.8 and half are above 9.8.
2.45
a.
The mean years of experience is x̄ = Σx/n = (30 + 15 + 10 + ... + 25)/17 = 303/17 = 17.824. The average number
of years of experience is 17.824 years.
b.
To find the median, we first arrange the data in order from lowest to highest:
3 5 6 9 10 10 10 15 20 20 25 25 25 30 30 30 30
Since there are an odd number of observations, the median is the middle number which is 20. Half of
interviewees have less than 20 years of experience.
c.
The mode is 30. More interviewees had 30 years of experience than any other value.
2.46
a.
The sample mean is:
x̄ = Σx_i/n = (1.72 + 2.50 + 2.16 + ... + 1.95)/20 = 37.62/20 = 1.881
The sample average surface roughness of the 20 observations is 1.881.
b.
The median is found as the average of the 10th and 11th observations, once the data have been
ordered. The ordered data are:
1.06 1.09 1.19 1.26 1.27 1.40 1.51 1.72 1.95 2.03 2.05 2.13 2.13 2.16 2.24 2.31 2.41 2.50 2.57 2.64
The 10th and 11th observations are 2.03 and 2.05. The median is:
(2.03 + 2.05)/2 = 4.08/2 = 2.04
The middle surface roughness measurement is 2.04. Half of the sample measurements were less than
2.04 and half were greater than 2.04.
c.
The data are somewhat skewed to the left. Thus, the median might be a better measure of central
tendency than the mean. The few small values in the data tend to make the mean smaller than the
median.
2.47
a.
The mean permeability for group A sandstone slices is 73.62mD. The average permeability for group
A sandstone is 73.62mD. The median permeability for group A sandstone is 70.45mD. Half of the
sandstone slices in group A have permeability less than 70.45mD.
b.
The mean permeability for group B sandstone slices is 128.54mD. The average permeability for
group B sandstone is 128.54mD. The median permeability for group B sandstone is 139.30mD. Half
of the sandstone slices in group B have permeability less than 139.30mD.
c.
The mean permeability for group C sandstone slices is 83.07mD. The average permeability for group
C sandstone is 83.07mD. The median permeability for group C sandstone is 78.65mD. Half of the
sandstone slices in group C have permeability less than 78.65mD.
d.
The mode permeability score for group C sandstone is 70.9. More sandstone slices in group C had
permeability scores of 70.9 than any other value.
e.
Weathering type B appears to result in faster decay because the mean, median, and mode values for
group B are higher than those for group C.
2.48
a.
The mean is 67.755. The statement is accurate.
b.
The median is 68.000. The statement is accurate.
c.
The mode is 64. The statement is not accurate. A better statement would be: “The most common
reported level of support for corporate sustainability for the 992 senior managers was 64.”
d.
Since the mean and median are almost the same, the distribution of the 992 support levels should be
fairly symmetric. The histogram in Exercise 2.23 is almost symmetric.
2.49
a.
The median is the middle number (18th) once the data have been arranged in order because n = 35 is
odd. The honey dosage data arranged in order are:
4,5,6,8,8,8,8,9,9,9,9,10,10,10,10,10,10,11,11,11,11,12,12,12,12,12,12,13,13,14,15,15,15,15,16
The 18th number is the median = 11.
b.
The median is the middle number (17th) once the data have been arranged in order because n = 33 is
odd. The DM dosage data arranged in order are:
3,4,4,4,4,4,4,6,6,6,7,7,7,7,7,8,9,9,9,9,9,10,10,10,11,12,12,12,12,12,13,13,15
The 17th number is the median = 9.
c.
The median is the middle number (19th) once the data have been arranged in order because n = 37 is
odd. The No dosage data arranged in order are:
0,1,1,1,3,3,4,4,5,5,5,6,6,6,6,7,7,7,7,7,7,7,7,8,8,8,8,8,8,9,9,9,9,10,11,12,12
The 19th number is the median = 7.
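The three medians in parts a through c can be verified together in Python. This is a sketch for checking the arithmetic: the dictionary keys are labels chosen for illustration, and the scores are the ordered values listed in each part. Because each group has an odd number of observations, the median sits at position (n + 1)/2.

```python
from statistics import median

# Ordered improvement scores as listed in parts a through c
scores = {
    "Honey": [4, 5, 6, 8, 8, 8, 8, 9, 9, 9, 9, 10, 10, 10, 10, 10, 10, 11, 11, 11, 11,
              12, 12, 12, 12, 12, 12, 13, 13, 14, 15, 15, 15, 15, 16],
    "DM": [3, 4, 4, 4, 4, 4, 4, 6, 6, 6, 7, 7, 7, 7, 7, 8, 9, 9, 9, 9, 9,
           10, 10, 10, 11, 12, 12, 12, 12, 12, 13, 13, 15],
    "No dosage": [0, 1, 1, 1, 3, 3, 4, 4, 5, 5, 5, 6, 6, 6, 6, 7, 7, 7, 7, 7, 7, 7, 7,
                  8, 8, 8, 8, 8, 8, 9, 9, 9, 9, 10, 11, 12, 12],
}

for group, values in scores.items():
    n = len(values)
    position = (n + 1) // 2  # rank of the middle observation when n is odd
    print(f"{group}: n = {n}, middle position = {position}, median = {median(values)}")
```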
d.
Since the median for the Honey dosage is larger than the other two, it appears that the honey dosage
leads to more improvement than the other two treatments.
2.50
a.
The mean dioxide level is x̄ = (3.3 + 0.5 + 1.3 + ... + 4.0)/16 = 29/16 = 1.81. The average dioxide amount is
1.81.
b.
Since the number of observations is even, the median is the average of the middle 2 numbers once the
data are arranged in order. The data arranged in order are:
0.1 0.2 0.2 0.3 0.4 0.5 0.5 1.3 1.4 2.4 2.4 3.3 4.0 4.0 4.0 4.0
The median is (1.3 + 1.4)/2 = 2.7/2 = 1.35. Half of the dioxide levels are below 1.35 and half are above
1.35.
c.
The mode is the number that occurs the most. For this data set the mode is 4.0. The most frequent
level of dioxide is 4.0.
d.
Since the number of observations is even, the median is the average of the middle 2 numbers once the
data are arranged in order. The data arranged in order are:
0.1 0.3 1.4 2.4 2.4 3.3 4.0 4.0 4.0 4.0
The median is (2.4 + 3.3)/2 = 5.7/2 = 2.85.
e.
Since the number of observations is even, the median is the average of the middle 2 numbers once the
data are arranged in order. The data arranged in order are:
0.2 0.2 0.4 0.5 0.5 1.3
The median is (0.4 + 0.5)/2 = 0.9/2 = 0.45.
f.
The median level of dioxide when crude oil is present is 0.45. The median level of dioxide when
crude oil is not present is 2.85. It is apparent that the level of dioxide is much higher when crude oil
is not present.
2.51
a.
Skewed to the right. There will be a few people with very high salaries such as the president and
football coach.
b.
Skewed to the left. On an easy test, most students will have high scores with only a few low scores.
c.
Skewed to the right. On a difficult test, most students will have low scores with only a few high
scores.
d.
Skewed to the right. Most students will have a moderate amount of time studying while a few students
might study a long time.
e.
Skewed to the left. Most cars will be relatively new with a few much older.
f.
Skewed to the left. Most students will take the entire time to take the exam while a few might leave
early.
2.52
a.
The sample mean is:
x̄ = Σx/n = (3.58 + 3.48 + 3.27 + ... + 1.17)/40 = 77.07/40 = 1.927
The median is the average of the 20th and 21st observations, once the data have been ordered. The 20th and
21st observations are 1.75 and 1.76. The median is:
(1.75 + 1.76)/2 = 3.51/2 = 1.755
The mode is the number that occurs the most and is 1.4, which occurs 3 times.
b.
The sample average driving performance index is 1.927. The median driving performance index is
1.755. Half of all driving performance indexes are less than 1.755 and half are higher. The most
common driving performance index value is 1.4.
c.
Since the mean is larger than the median, the data are skewed to the right. Using MINITAB, a
histogram of the driving performance index values is:
[Histogram of INDEX: Frequency (vertical axis, 0 to 10) by INDEX (1.5 to 3.5)]
2.53
For the “Joint exchange offer with prepack” firms, the mean time is 2.6545 months, and the median is 1.5
months. Thus, the average time spent in bankruptcy for “Joint” firms is 2.6545 months, while half of the
firms spend 1.5 months or less in bankruptcy.
For the “No prefiling vote held” firms, the mean time is 4.2364 months, and the median is 3.2 months.
Thus, the average time spent in bankruptcy for “No prefiling vote held” firms is 4.2364 months, while half
of the firms spend 3.2 months or less in bankruptcy.
For the “Prepack solicitation only” firms, the mean time is 1.8185 months, and the median is 1.4 months.
Thus, the average time spent in bankruptcy for “Prepack solicitation only” firms is 1.8185 months, while
half of the firms spend 1.4 months or less in bankruptcy.
Since the means and medians for the three groups of firms differ quite a bit, it would be unreasonable to use
a single number to locate the center of the time in bankruptcy. Three different “centers” should be used.
2.54
a.
The mean is $\mu = \frac{\sum x}{n} = \frac{2 + 1 + 1 + \cdots + 1}{30} = \frac{62}{30} = 2.067$. The average number of nuclear power plants per state for states that have nuclear power plants is 2.067.
The median is found by first arranging the data in order from smallest to largest:
1 1 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 3 3 3 3 4 4 5 6
Since there are an even number of data points, the median is the average of the middle two numbers, which is $\frac{2 + 2}{2} = 2$. Half of the states with nuclear power plants have 2 or fewer plants.
The mode is 1. Most states that have nuclear power plants have just 1.
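The three measures of center in part a can be checked with a short sketch. It uses only Python's standard `statistics` module; the list `plants` is simply the ordered data shown above, and the variable names are illustrative.

```python
import statistics

# Number of nuclear power plants per state, copied from the ordered list above.
plants = [1] * 13 + [2] * 9 + [3] * 4 + [4] * 2 + [5, 6]

mean = statistics.mean(plants)      # sum / n = 62 / 30
median = statistics.median(plants)  # average of the 15th and 16th ordered values
mode = statistics.mode(plants)      # most frequent value

print(f"n = {len(plants)}, mean = {mean:.3f}, median = {median}, mode = {mode}")
```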
b.
For regulated states: The mean is $\mu = \frac{\sum x}{n} = \frac{2 + 1 + 1 + \cdots + 1}{17} = \frac{31}{17} = 1.824$. The average number of nuclear power plants per state for regulated states that have nuclear power plants is 1.824.
The median is found by first arranging the data in order from smallest to largest:
1 1 1 1 1 1 1 2 2 2 2 2 2 2 3 3 4
Since there are an odd number of data points, the median is the middle number which is 2. Half of
the states with nuclear power plants have 2 or fewer plants.
The modes are 1 and 2. Most regulated states that have nuclear power plants have 1 or 2.
c.
For deregulated states: The mean is $\mu = \frac{\sum x}{n} = \frac{1 + 1 + 1 + \cdots + 1}{13} = \frac{31}{13} = 2.385$. The average number of nuclear power plants per state for deregulated states that have nuclear power plants is 2.385.
The median is found by first arranging the data in order from smallest to largest:
1 1 1 1 1 1 2 2 3 3 4 5 6
Since there are an odd number of data points, the median is the middle number which is 2. Half of
the states with nuclear power plants have 2 or fewer plants.
The mode is 1. Most states that have nuclear power plants have 1.
d.
Because the average number of nuclear power plants in states that are deregulated is greater than the average number of nuclear power plants in states that are regulated, it appears that regulation limits the number of nuclear power plants.
e.
After deleting the largest observation, the mean is $\mu = \frac{\sum x}{n} = \frac{2 + 1 + 1 + \cdots + 1}{29} = \frac{56}{29} = 1.931$. The average number of nuclear power plants per state for states that have nuclear power plants is 1.931.
The median is found by first arranging the data in order from smallest to largest:
1 1 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 3 3 3 3 4 4 5
Since there are an odd number of data points, the median is the middle number which is 2. Half of
the states with nuclear power plants have 2 or fewer plants.
The mode is 1. Most states that have nuclear power plants have just 1.
By deleting the largest observation, the mean decreases, but the median and mode remain the same.
f.
The trimmed mean is $\mu = \frac{\sum x}{n} = \frac{2 + 1 + 1 + \cdots + 1}{26} = \frac{49}{26} = 1.885$. The trimmed mean is not affected by extreme values.
2.55
a.
Due to the “elite” superstars, the salary distribution is skewed to the right. Since this implies that the median is less than the mean, the players’ association would want to use the median.
b.
The owners, by the logic of part a, would want to use the mean.
2.56
a.
The primary disadvantage of using the range to compare variability of data sets is that the two data sets can have the same range and be vastly different with respect to data variation. Also, the range is greatly affected by extreme measures.
b.
The sample variance is the sum of the squared deviations of the observations from the sample mean divided by the sample size minus 1. The population variance is the sum of the squared deviations of the values from the population mean divided by the population size.
c.
The variance of a data set can never be negative. The variance of a sample is the sum of the squared deviations from the mean divided by n - 1. The square of any number, positive or negative, is never negative. Thus, the variance is never negative (it equals 0 only when all observations are the same).
The variance is usually greater than the standard deviation. However, it is possible for the variance to be smaller than the standard deviation. If the data are between 0 and 1, the variance will be smaller than the standard deviation. For example, suppose the data set is .8, .7, .9, .5, and .3. The sample mean is:
$\bar{x} = \frac{\sum x}{n} = \frac{.8 + .7 + .9 + .5 + .3}{5} = \frac{3.2}{5} = .64$
The sample variance is: $s^2 = \frac{\sum x^2 - \frac{(\sum x)^2}{n}}{n - 1} = \frac{2.28 - \frac{3.2^2}{5}}{5 - 1} = \frac{.232}{4} = .058$
The standard deviation is $s = \sqrt{.058} = .241$
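The example above can be reproduced with a minimal sketch using Python's standard `statistics` module (`variance` and `stdev` both use the n - 1 divisor). The variable names are illustrative.

```python
import statistics

data = [0.8, 0.7, 0.9, 0.5, 0.3]

s2 = statistics.variance(data)  # sample variance, divides by n - 1
s = statistics.stdev(data)      # square root of the sample variance

print(f"variance = {s2:.3f}, standard deviation = {s:.3f}")
print("variance smaller than standard deviation:", s2 < s)
```

The comparison holds here because the standard deviation is below 1, so squaring it makes it smaller.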
2.57
a.
Range = 4 - 0 = 4
$s^2 = \frac{\sum x^2 - \frac{(\sum x)^2}{n}}{n - 1} = \frac{22 - \frac{8^2}{5}}{5 - 1} = 2.3$
$s = \sqrt{2.3} = 1.52$
b.
Range = 6 - 0 = 6
$s^2 = \frac{\sum x^2 - \frac{(\sum x)^2}{n}}{n - 1} = \frac{63 - \frac{17^2}{7}}{7 - 1} = 3.619$
$s = \sqrt{3.619} = 1.9$
c.
Range = 8 - (-2) = 10
$s^2 = \frac{\sum x^2 - \frac{(\sum x)^2}{n}}{n - 1} = \frac{154 - \frac{30^2}{10}}{10 - 1} = 7.111$
$s = \sqrt{7.111} = 2.67$
d.
Range = 1 - (-3) = 4
$s^2 = \frac{\sum x^2 - \frac{(\sum x)^2}{n}}{n - 1} = \frac{25.04 - \frac{(-6.8)^2}{17}}{17 - 1} = 1.395$
$s = \sqrt{1.395} = 1.18$
2.58
a.
$s^2 = \frac{\sum x^2 - \frac{(\sum x)^2}{n}}{n - 1} = \frac{84 - \frac{20^2}{10}}{10 - 1} = 4.8889$
$s = \sqrt{4.8889} = 2.211$
b.
$s^2 = \frac{\sum x^2 - \frac{(\sum x)^2}{n}}{n - 1} = \frac{380 - \frac{100^2}{40}}{40 - 1} = 3.3333$
$s = \sqrt{3.3333} = 1.826$
c.
$s^2 = \frac{\sum x^2 - \frac{(\sum x)^2}{n}}{n - 1} = \frac{18 - \frac{17^2}{20}}{20 - 1} = .1868$
$s = \sqrt{.1868} = .432$
2.59
a.
$\sum x = 3 + 1 + 10 + 10 + 4 = 28$
$\sum x^2 = 3^2 + 1^2 + 10^2 + 10^2 + 4^2 = 226$
$\bar{x} = \frac{\sum x}{n} = \frac{28}{5} = 5.6$
$s^2 = \frac{\sum x^2 - \frac{(\sum x)^2}{n}}{n - 1} = \frac{226 - \frac{28^2}{5}}{5 - 1} = \frac{69.2}{4} = 17.3$
$s = \sqrt{17.3} = 4.1593$
b.
$\sum x = 8 + 10 + 32 + 5 = 55$
$\sum x^2 = 8^2 + 10^2 + 32^2 + 5^2 = 1213$
$\bar{x} = \frac{\sum x}{n} = \frac{55}{4} = 13.75$ feet
$s^2 = \frac{\sum x^2 - \frac{(\sum x)^2}{n}}{n - 1} = \frac{1213 - \frac{55^2}{4}}{4 - 1} = \frac{456.75}{3} = 152.25$ square feet
$s = \sqrt{152.25} = 12.339$ feet
c.
$\sum x = -1 + (-4) + (-3) + 1 + (-4) + (-4) = -15$
$\sum x^2 = (-1)^2 + (-4)^2 + (-3)^2 + 1^2 + (-4)^2 + (-4)^2 = 59$
$\bar{x} = \frac{\sum x}{n} = \frac{-15}{6} = -2.5$
$s^2 = \frac{\sum x^2 - \frac{(\sum x)^2}{n}}{n - 1} = \frac{59 - \frac{(-15)^2}{6}}{6 - 1} = \frac{21.5}{5} = 4.3$
$s = \sqrt{4.3} = 2.0736$
d.
$\sum x = \frac{1}{5} + \frac{1}{5} + \frac{1}{5} + \frac{2}{5} + \frac{1}{5} + \frac{4}{5} = \frac{10}{5} = 2$
$\sum x^2 = \left(\frac{1}{5}\right)^2 + \left(\frac{1}{5}\right)^2 + \left(\frac{1}{5}\right)^2 + \left(\frac{2}{5}\right)^2 + \left(\frac{1}{5}\right)^2 + \left(\frac{4}{5}\right)^2 = \frac{24}{25} = .96$
$\bar{x} = \frac{\sum x}{n} = \frac{2}{6} = \frac{1}{3} = .33$ ounce
$s^2 = \frac{\sum x^2 - \frac{(\sum x)^2}{n}}{n - 1} = \frac{\frac{24}{25} - \frac{2^2}{6}}{6 - 1} = \frac{.2933}{5} = .0587$ square ounce
$s = \sqrt{.0587} = .2422$ ounce
2.60
a.
Range = 42 - 37 = 5
$s^2 = \frac{\sum x^2 - \frac{(\sum x)^2}{n}}{n - 1} = \frac{7935 - \frac{199^2}{5}}{5 - 1} = 3.7$
$s = \sqrt{3.7} = 1.92$
b.
Range = 100 - 1 = 99
$s^2 = \frac{\sum x^2 - \frac{(\sum x)^2}{n}}{n - 1} = \frac{25,795 - \frac{303^2}{9}}{9 - 1} = 1,949.25$
$s = \sqrt{1,949.25} = 44.15$
c.
Range = 100 - 2 = 98
$s^2 = \frac{\sum x^2 - \frac{(\sum x)^2}{n}}{n - 1} = \frac{20,033 - \frac{295^2}{8}}{8 - 1} = 1,307.84$
$s = \sqrt{1,307.84} = 36.16$
2.61
This is one possibility for the two data sets.
Data Set 1: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9
Data Set 2: 0, 0, 1, 1, 2, 2, 3, 3, 9, 9
The two sets of data above have the same range = largest measurement - smallest measurement = 9 - 0 = 9.
The means for the two data sets are:
$\bar{x}_1 = \frac{\sum x}{n} = \frac{0 + 1 + 2 + 3 + 4 + 5 + 6 + 7 + 8 + 9}{10} = \frac{45}{10} = 4.5$
$\bar{x}_2 = \frac{\sum x}{n} = \frac{0 + 0 + 1 + 1 + 2 + 2 + 3 + 3 + 9 + 9}{10} = \frac{30}{10} = 3$
The dot diagrams for the two data sets are shown below.
[Figure: Dotplot of x1, x2]
2.62
This is one possibility for the two data sets.
Data Set 1: 1, 1, 2, 2, 3, 3, 4, 4, 5, 5
Data Set 2: 1, 1, 1, 1, 1, 5, 5, 5, 5, 5
$\bar{x}_1 = \frac{\sum x}{n} = \frac{1 + 1 + 2 + 2 + 3 + 3 + 4 + 4 + 5 + 5}{10} = \frac{30}{10} = 3$
$\bar{x}_2 = \frac{\sum x}{n} = \frac{1 + 1 + 1 + 1 + 1 + 5 + 5 + 5 + 5 + 5}{10} = \frac{30}{10} = 3$
Therefore, the two data sets have the same mean. The variances for the two data sets are:
$s_1^2 = \frac{\sum x^2 - \frac{(\sum x)^2}{n}}{n - 1} = \frac{110 - \frac{30^2}{10}}{9} = \frac{20}{9} = 2.2222$
$s_2^2 = \frac{\sum x^2 - \frac{(\sum x)^2}{n}}{n - 1} = \frac{130 - \frac{30^2}{10}}{9} = \frac{40}{9} = 4.4444$
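The shortcut formula used throughout these solutions can be sketched in a few lines. The function name `shortcut_variance` is an illustrative choice, not something from the text; the two lists are the data sets given above.

```python
def shortcut_variance(xs):
    """Sample variance via the shortcut formula (sum x^2 - (sum x)^2 / n) / (n - 1)."""
    n = len(xs)
    sum_x = sum(xs)
    sum_x2 = sum(x * x for x in xs)
    return (sum_x2 - sum_x ** 2 / n) / (n - 1)

set1 = [1, 1, 2, 2, 3, 3, 4, 4, 5, 5]
set2 = [1, 1, 1, 1, 1, 5, 5, 5, 5, 5]

for name, xs in (("Data Set 1", set1), ("Data Set 2", set2)):
    print(f"{name}: mean = {sum(xs) / len(xs)}, variance = {shortcut_variance(xs):.4f}")
```

Both sets have mean 3, but the second set, with all of its values at the extremes, has twice the variance of the first.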
The dot diagrams for the two data sets are shown below.
[Figure: Dotplot of x1, x2]
2.63
a.
Range = 3 - 0 = 3
$s^2 = \frac{\sum x^2 - \frac{(\sum x)^2}{n}}{n - 1} = \frac{15 - \frac{7^2}{5}}{5 - 1} = 1.3$
$s = \sqrt{1.3} = 1.14$
b.
After adding 3 to each of the data points,
Range = 6 - 3 = 3
$s^2 = \frac{\sum x^2 - \frac{(\sum x)^2}{n}}{n - 1} = \frac{102 - \frac{22^2}{5}}{5 - 1} = 1.3$
$s = \sqrt{1.3} = 1.14$
c.
After subtracting 4 from each of the data points,
Range = -1 - (-4) = 3
$s^2 = \frac{\sum x^2 - \frac{(\sum x)^2}{n}}{n - 1} = \frac{39 - \frac{(-13)^2}{5}}{5 - 1} = 1.3$
$s = \sqrt{1.3} = 1.14$
d.
The range, variance, and standard deviation remain the same when any number is added to or
subtracted from each measurement in the data set.
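The conclusion in part d can be illustrated with a small sketch. The exercise's data are not reproduced in this solution, so the list below is a hypothetical data set chosen only to match the summary values in part a (n = 5, sum 7, sum of squares 15, range 3).

```python
import statistics

# Hypothetical data chosen to match the summary values in part a
# (n = 5, sum = 7, sum of squares = 15, range = 3); not the exercise's actual data.
data = [0, 1, 3, 1, 2]

def spread(xs):
    """Return (range, sample variance, sample standard deviation)."""
    return max(xs) - min(xs), statistics.variance(xs), statistics.stdev(xs)

results = {}
for shift in (0, 3, -4):
    shifted = [x + shift for x in data]
    results[shift] = spread(shifted)
    rng, s2, s = results[shift]
    print(f"shift {shift:+d}: range = {rng}, s^2 = {s2:.1f}, s = {s:.2f}")
```

Adding a constant moves every observation and the mean by the same amount, so the deviations from the mean, and every measure built from them, are unchanged.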
2.64
The ecolabel that had the most variation in the numerical responses is Audubon International because it has
the largest standard deviation.
2.65
a.
The range of permeability scores for group A sandstone slices is Range = max - min = 122.4 - 55.2 = 67.2.
b.
The variance of group A sandstone slices is $s^2 = \frac{\sum x^2 - \frac{(\sum x)^2}{n}}{n - 1} = \frac{562,778 - \frac{7,362.3^2}{100}}{100 - 1} = 209.5292$.
The standard deviation is $s = \sqrt{209.5292} = 14.475$.
c.
Condition B has the largest range and the largest standard deviation. Thus, condition B has more variable permeability data.
2.66
a.
The range is the difference between the maximum and minimum values. The range = 24.8 - (-1.6) = 26.4. The units of measurement are percents.
b.
The variance is
$s^2 = \frac{\sum x^2 - \frac{(\sum x)^2}{n}}{n - 1} = \frac{2236.41 - \frac{140.7^2}{13}}{13 - 1} = \frac{2236.41 - 1522.8069}{12} = \frac{713.6031}{12} = 59.4669$
The units are square percents.
c.
The standard deviation is $s = \sqrt{59.4669} = 7.7115$. The units are percents.
2.67
a.
The range is 155. The statement is accurate.
b.
The variance is 722.036. The statement is not accurate. A more accurate statement would be: “The variance of the levels of support for corporate sustainability for the 992 senior managers is 722.036.”
c.
The standard deviation is 26.871. If the units of measure for the two distributions are the same, then the distribution of support levels for the 992 senior managers has less variation than a distribution with a standard deviation of 50. If the units of measure for the second distribution are not known, then we cannot compare the variation in the two distributions by looking at the standard deviations alone.
d.
The standard deviation best describes the variation in the distribution. The range can be greatly affected by extreme measures. The variance is measured in square units, which are hard to interpret. Thus, the standard deviation is the best measure to describe the variation.
2.68
a.
The sample variance of the honey dosage group is:
$s^2 = \frac{\sum x^2 - \frac{(\sum x)^2}{n}}{n - 1} = \frac{4295 - \frac{375^2}{35}}{35 - 1} = \frac{277.142857}{34} = 8.1512605$
The standard deviation is: $s = \sqrt{8.1512605} = 2.855$
b.
The sample variance of the DM dosage group is:
$s^2 = \frac{\sum x^2 - \frac{(\sum x)^2}{n}}{n - 1} = \frac{2631 - \frac{275^2}{33}}{33 - 1} = \frac{339.33333}{32} = 10.604167$
The standard deviation is: $s = \sqrt{10.604167} = 3.256$
c.
The sample variance of the control group is:
$s^2 = \frac{\sum x^2 - \frac{(\sum x)^2}{n}}{n - 1} = \frac{1881 - \frac{241^2}{37}}{37 - 1} = \frac{311.243243}{36} = 8.6456456$
The standard deviation is: $s = \sqrt{8.6456456} = 2.940$
d.
The group with the most variability is the group with the largest standard deviation, which is the DM group. The group with the least variability is the group with the smallest standard deviation, which is the honey group.
2.69
a.
The range is the largest observation minus the smallest observation or 6 - 1 = 5.
The variance is: $s^2 = \frac{\sum_i x_i^2 - \frac{\left(\sum_i x_i\right)^2}{n}}{n - 1} = \frac{178 - \frac{62^2}{30}}{30 - 1} = 1.7195$
The standard deviation is: $s = \sqrt{s^2} = \sqrt{1.7195} = 1.311$
b.
The largest observation is 6. It is deleted from the data set. The new range is: 5 - 1 = 4.
The variance is: $s^2 = \frac{\sum_i x_i^2 - \frac{\left(\sum_i x_i\right)^2}{n}}{n - 1} = \frac{142 - \frac{56^2}{29}}{29 - 1} = 1.2094$
The standard deviation is: $s = \sqrt{s^2} = \sqrt{1.2094} = 1.100$
When the largest observation is deleted, the range, variance and standard deviation decrease.
c.
The largest observation is 6 and the smallest is 1. When these two observations are deleted from the data set, the new range is: 5 - 1 = 4.
The variance is: $s^2 = \frac{\sum_i x_i^2 - \frac{\left(\sum_i x_i\right)^2}{n}}{n - 1} = \frac{141 - \frac{55^2}{28}}{28 - 1} = 1.2209$
The standard deviation is: $s = \sqrt{s^2} = \sqrt{1.2209} = 1.1049$
When the largest and smallest observations are deleted, the range, variance and standard deviation decrease.
2.70
a.
A worker’s overall time to complete the operation under study is determined by adding the subtask-time averages.
Worker A
The average for subtask 1 is: $\bar{x} = \frac{\sum x}{n} = \frac{211}{7} = 30.14$
The average for subtask 2 is: $\bar{x} = \frac{\sum x}{n} = \frac{21}{7} = 3$
Worker A’s overall time is 30.14 + 3 = 33.14.
Worker B
The average for subtask 1 is: $\bar{x} = \frac{\sum x}{n} = \frac{213}{7} = 30.43$
The average for subtask 2 is: $\bar{x} = \frac{\sum x}{n} = \frac{29}{7} = 4.14$
Worker B’s overall time is 30.43 + 4.14 = 34.57.
b.
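Worker A
$s = \sqrt{\frac{\sum x^2 - \frac{(\sum x)^2}{n}}{n - 1}} = \sqrt{\frac{6455 - \frac{211^2}{7}}{7 - 1}} = \sqrt{15.8095} = 3.98$
Worker B
$s = \sqrt{\frac{\sum x^2 - \frac{(\sum x)^2}{n}}{n - 1}} = \sqrt{\frac{6487 - \frac{213^2}{7}}{7 - 1}} = \sqrt{.9524} = .98$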
c.
The standard deviations represent the amount of variability in the time it takes the worker to complete
subtask 1.
d.
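Worker A
$s = \sqrt{\frac{\sum x^2 - \frac{(\sum x)^2}{n}}{n - 1}} = \sqrt{\frac{67 - \frac{21^2}{7}}{7 - 1}} = \sqrt{.6667} = .82$
Worker B
$s = \sqrt{\frac{\sum x^2 - \frac{(\sum x)^2}{n}}{n - 1}} = \sqrt{\frac{147 - \frac{29^2}{7}}{7 - 1}} = \sqrt{4.4762} = 2.12$
e.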
I would choose workers similar to Worker B to perform subtask 1. Worker B has a slightly higher average time on subtask 1 (A: $\bar{x} = 30.14$, B: $\bar{x} = 30.43$). However, Worker B has a smaller variability in the time it takes to complete subtask 1 (part b). He or she is more consistent in the time needed to complete the task.
I would choose workers similar to Worker A to perform subtask 2. Worker A has a smaller average time on subtask 2 (A: $\bar{x} = 3$, B: $\bar{x} = 4.14$). Worker A also has a smaller variability in the time needed to complete subtask 2 (part d).
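The comparison in parts a through e rests only on n, the sum of the times, and the sum of the squared times for each worker and subtask. A minimal sketch that recomputes the means and standard deviations from those summary values follows; the function name `mean_and_sd` and the dictionary layout are illustrative assumptions.

```python
import math

def mean_and_sd(n, sum_x, sum_x2):
    """Sample mean and standard deviation from n, sum of x, and sum of x squared."""
    mean = sum_x / n
    variance = (sum_x2 - sum_x ** 2 / n) / (n - 1)
    return mean, math.sqrt(variance)

# (n, sum x, sum x^2) taken from the calculations above
summaries = {
    ("A", 1): (7, 211, 6455),
    ("A", 2): (7, 21, 67),
    ("B", 1): (7, 213, 6487),
    ("B", 2): (7, 29, 147),
}

results = {key: mean_and_sd(*vals) for key, vals in summaries.items()}
for (worker, subtask), (mean, sd) in results.items():
    print(f"Worker {worker}, subtask {subtask}: mean = {mean:.2f}, s = {sd:.2f}")
```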
2.71
a.
The unit of measurement of the variable of interest is dollars (the same as the mean and standard
deviation). Based on this, the data are quantitative.
b.
Since no information is given about the shape of the data set, we can only use Chebyshev’s Rule.
$900 is 2 standard deviations below the mean, and $2100 is 2 standard deviations above the mean. Using Chebyshev’s Rule, at least 3/4 of the measurements (or 3/4 × 200 = 150 measurements) will fall between $900 and $2100.
$600 is 3 standard deviations below the mean and $2400 is 3 standard deviations above the mean. Using Chebyshev’s Rule, at least 8/9 of the measurements (or 8/9 × 200 ≈ 178 measurements) will fall between $600 and $2400.
$1200 is 1 standard deviation below the mean and $1800 is 1 standard deviation above the mean. Using Chebyshev’s Rule, nothing can be said about the number of measurements that will fall between $1200 and $1800.
$1500 is equal to the mean and $2100 is 2 standard deviations above the mean. Using Chebyshev’s Rule, at least 3/4 of the measurements (or 3/4 × 200 = 150 measurements) will fall between $900 and $2100. It is possible that all of the 150 measurements will be between $900 and $1500. Thus, nothing can be said about the number of measurements between $1500 and $2100.
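The Chebyshev bounds above can be tabulated with a short sketch. The mean of $1500 and standard deviation of $300 are the values implied by the intervals in this solution; the function name `chebyshev_lower_bound` is an illustrative choice.

```python
def chebyshev_lower_bound(k):
    """Minimum fraction of measurements within k standard deviations (Chebyshev's Rule).

    The rule is informative only for k > 1; for k <= 1 the bound is 0.
    """
    return max(0.0, 1 - 1 / k ** 2) if k > 0 else 0.0

n, mean, sd = 200, 1500, 300  # values used in the solution above

for k in (1, 2, 3):
    low, high = mean - k * sd, mean + k * sd
    bound = chebyshev_lower_bound(k)
    print(f"${low} to ${high}: at least {bound:.3f} of the data, about {bound * n:.0f} of {n}")
```

A bound of 0 for k = 1 is the numerical form of "nothing can be said" about the interval from $1200 to $1800.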
2.72
Since no information is given about the data set, we can only use Chebyshev’s Rule.
a.
Nothing can be said about the percentage of measurements which will fall between $\bar{x} - s$ and $\bar{x} + s$.
b.
At least 3/4 or 75% of the measurements will fall between $\bar{x} - 2s$ and $\bar{x} + 2s$.
c.
At least 8/9 or 89% of the measurements will fall between $\bar{x} - 3s$ and $\bar{x} + 3s$.
2.73
According to the Empirical Rule:
a.
Approximately 68% of the measurements will be contained in the interval $\bar{x} - s$ to $\bar{x} + s$.
b.
Approximately 95% of the measurements will be contained in the interval $\bar{x} - 2s$ to $\bar{x} + 2s$.
c.
Essentially all the measurements will be contained in the interval $\bar{x} - 3s$ to $\bar{x} + 3s$.
2.74
a.
$\bar{x} = \frac{\sum x}{n} = \frac{206}{25} = 8.24$
$s^2 = \frac{\sum x^2 - \frac{(\sum x)^2}{n}}{n - 1} = \frac{1778 - \frac{206^2}{25}}{25 - 1} = 3.357$
$s = \sqrt{3.357} = 1.83$
b.
Interval                                Number of Measurements in Interval    Percentage
$\bar{x} \pm s$, or (6.41, 10.07)       18                                    18/25 = .72 or 72%
$\bar{x} \pm 2s$, or (4.58, 11.90)      24                                    24/25 = .96 or 96%
$\bar{x} \pm 3s$, or (2.75, 13.73)      25                                    25/25 = 1.00 or 100%
c.
The percentages in part b are in agreement with Chebyshev’s Rule and agree fairly well with the
percentages given by the Empirical Rule.
d.
Range = 12 - 5 = 7 and $s \approx \frac{\text{Range}}{4} = \frac{7}{4} = 1.75$
The range approximation provides a satisfactory estimate of s = 1.83 from part a.
2.75
Using Chebyshev’s Rule, at least 8/9 of the measurements will fall within 3 standard deviations of the
mean. Thus, the range of the data would be around 6 standard deviations. Using the Empirical Rule,
approximately 95% of the observations are within 2 standard deviations of the mean. Thus, the range of
the data would be around 4 standard deviations. We would expect the standard deviation to be somewhere
between Range/6 and Range/4.
For our data, the range = 760 - 135 = 625.
Then $\frac{\text{Range}}{6} = \frac{625}{6} = 104.17$ and $\frac{\text{Range}}{4} = \frac{625}{4} = 156.25$.
Therefore, I would estimate that the standard deviation of the data set is between 104.17 and 156.25.
It would not be feasible to have a standard deviation of 25. If the standard deviation were 25, the data
would span 625/25 = 25 standard deviations. This would be extremely unlikely.
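The range-based bounds can be sketched as a small helper. The function name `sd_range_estimates` is an illustrative assumption; the minimum (135) and maximum (760) are the values used in the solution.

```python
def sd_range_estimates(minimum, maximum):
    """Rough bounds for s: Range/6 (Chebyshev-style) and Range/4 (Empirical Rule)."""
    data_range = maximum - minimum
    return data_range / 6, data_range / 4

low, high = sd_range_estimates(135, 760)
print(f"s is roughly between {low:.2f} and {high:.2f}")

# How many standard deviations the data would span if s were 25
span = (760 - 135) / 25
print("span in standard deviations if s = 25:", span)
```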
2.76
a.
$z = \frac{263 - 353}{30} = -3$. A score of 263 would be 3 standard deviations below the mean.
$z = \frac{443 - 353}{30} = 3$. A score of 443 would be 3 standard deviations above the mean.
Using Chebyshev’s Rule, at least 8/9 of the observations will be within 3 standard deviations of the mean.
b.
For a mound-shaped, symmetric distribution, approximately 99.7% of the observations will be within 3 standard deviations of the mean, using the Empirical Rule.
c.
$z = \frac{109 - 184}{25} = -3$. A score of 109 would be 3 standard deviations below the mean.
$z = \frac{259 - 184}{25} = 3$. A score of 259 would be 3 standard deviations above the mean.
Using Chebyshev’s Rule, at least 8/9 of the observations will be within 3 standard deviations of the mean.
d.
For a mound-shaped, symmetric distribution, approximately 99.7% of the observations will be within 3 standard deviations of the mean, using the Empirical Rule.
2.77
a.
Because the distribution is skewed, we will use Chebyshev’s Rule. At least 8/9 of the observations will be within 3 standard deviations of the mean:
$\bar{x}_A \pm 3s_A \Rightarrow 73.62 \pm 3(14.48) \Rightarrow 73.62 \pm 43.44 \Rightarrow (30.18, 117.06)$
b.
Because the distribution is skewed, we will use Chebyshev’s Rule. At least 8/9 of the observations will be within 3 standard deviations of the mean:
$\bar{x}_B \pm 3s_B \Rightarrow 128.54 \pm 3(21.97) \Rightarrow 128.54 \pm 65.91 \Rightarrow (62.63, 194.45)$
c.
Because the distribution is skewed, we will use Chebyshev’s Rule. At least 8/9 of the observations will be within 3 standard deviations of the mean:
$\bar{x}_C \pm 3s_C \Rightarrow 83.07 \pm 3(20.05) \Rightarrow 83.07 \pm 60.15 \Rightarrow (22.92, 143.22)$
d.
Although all the intervals overlap, it appears that weathering group B results in faster decay because the sample mean is higher and the upper limit of the interval is much higher than the upper limit for the other two weathering types.
2.78
a.
Using MINITAB, the histogram of the data is:
[Figure: Histogram of Wheels (x-axis: Wheels; y-axis: Frequency)]
Since the distribution is skewed to the right, it is not mound-shaped and it is not symmetric.
b.
Using MINITAB, the results are:
Descriptive Statistics: Wheels
Variable    N    Mean   StDev  Minimum     Q1  Median     Q3  Maximum
Wheels     28   3.214   1.371    1.000  2.000   3.000  4.000    8.000
The mean is 3.214 and the standard deviation is 1.371.
c.
The interval is: $\bar{x} \pm 2s \Rightarrow 3.214 \pm 2(1.371) \Rightarrow 3.214 \pm 2.742 \Rightarrow (0.472, 5.956)$.
d.
According to Chebyshev’s Rule, at least 75% of the observations will fall within 2 standard deviations of the mean.
e.
According to the Empirical Rule, approximately 95% of the observations will fall within 2 standard deviations of the mean.
f.
Actually, 26 of the 28 or 26/28 = .929 of the observations fall within the interval. This value is close to the 95% that we would expect with the Empirical Rule.
2.79
a.
The interval $\bar{x} \pm 2s$ will contain at least 75% of the observations. This interval is $\bar{x} \pm 2s \Rightarrow 3.11 \pm 2(.66) \Rightarrow 3.11 \pm 1.32 \Rightarrow (1.79, 4.43)$.
b.
No. The value 1.25 does not fall in the interval $\bar{x} \pm 2s$. We know that at least 75% of all observations will fall within 2 standard deviations of the mean. Since 1.25 falls more than 2 standard deviations from the mean, it would not be a likely value to observe.
2.80
a.
Since the data are mound-shaped and symmetric, we know from the Empirical Rule that approximately 95% of the observations will fall within 2 standard deviations of the mean. This interval will be: $\bar{x} \pm 2s \Rightarrow 39 \pm 2(6) \Rightarrow 39 \pm 12 \Rightarrow (27, 51)$.
b.
We know that approximately .05 of the observations will fall outside the range 27 to 51. Since the distribution of scores is symmetric, we know that half of the .05 or .025 will fall above 51.
c.
We know from the Empirical Rule that approximately 99.7% (essentially all) of the observations will fall within 3 standard deviations of the mean. This interval is:
$\bar{x} \pm 3s \Rightarrow 39 \pm 3(6) \Rightarrow 39 \pm 18 \Rightarrow (21, 57)$.
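The Empirical Rule intervals and the symmetric upper-tail argument from part b can be sketched as follows. The coverage values .68, .95, and .997 are the usual Empirical Rule approximations, and the variable names are illustrative.

```python
mean, sd = 39, 6  # values from the solution above

# Approximate Empirical Rule coverage for mound-shaped, symmetric data
empirical = {1: 0.68, 2: 0.95, 3: 0.997}

intervals = {k: (mean - k * sd, mean + k * sd) for k in empirical}

for k, coverage in empirical.items():
    low, high = intervals[k]
    upper_tail = (1 - coverage) / 2  # by symmetry, half of the remainder lies above
    print(f"within {k} sd: ({low}, {high}), about {coverage:.1%}; "
          f"above {high}: about {upper_tail:.3f}")
```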
2.81
a.
The sample mean is: $\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{18,482}{195} = 94.78$
The sample variance is: $s^2 = \frac{\sum_{i=1}^{n} x_i^2 - \frac{\left(\sum_{i=1}^{n} x_i\right)^2}{n}}{n - 1} = \frac{1,756,550 - \frac{18,482^2}{195}}{195 - 1} = 24.9254$
The standard deviation is: $s = \sqrt{s^2} = \sqrt{24.9254} = 4.9925$
b.
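$\bar{x} \pm s \Rightarrow 94.78 \pm 4.99 \Rightarrow (89.79, 99.77)$
$\bar{x} \pm 2s \Rightarrow 94.78 \pm 2(4.99) \Rightarrow 94.78 \pm 9.98 \Rightarrow (84.80, 104.76)$
$\bar{x} \pm 3s \Rightarrow 94.78 \pm 3(4.99) \Rightarrow 94.78 \pm 14.97 \Rightarrow (79.81, 109.75)$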
c.
There are 143 out of 195 observations in the first interval. This is (143 / 195) × 100% = 73.3%. There are 189 out of 195 observations in the second interval. This is (189 / 195) × 100% = 96.9%. There are 191 out of 195 observations in the third interval. This is (191 / 195) × 100% = 97.9%.
The percentages for the first 2 intervals are somewhat larger than we would expect using the Empirical Rule. The Empirical Rule indicates that approximately 68% of the observations will fall within 1 standard deviation of the mean. It also indicates that approximately 95% of the observations will fall within 2 standard deviations of the mean. Chebyshev’s Theorem says that at least 3/4 or 75% of the observations will fall within 2 standard deviations of the mean and at least 8/9 or 88.9% of the observations will fall within 3 standard deviations of the mean. It appears that our observed percentages agree with Chebyshev’s Theorem better than the Empirical Rule.
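The comparison in part c can be laid out side by side with a short sketch. The counts are those reported above; the Empirical Rule figures are the usual approximations, and the variable names are illustrative.

```python
n = 195
counts = {1: 143, 2: 189, 3: 191}         # observations within k standard deviations
empirical = {1: 0.68, 2: 0.95, 3: 0.997}  # approximate Empirical Rule values

observed = {k: c / n for k, c in counts.items()}

for k in counts:
    chebyshev = max(0.0, 1 - 1 / k ** 2)  # lower bound; uninformative for k = 1
    print(f"k = {k}: observed {observed[k]:.1%}, "
          f"Empirical Rule about {empirical[k]:.1%}, Chebyshev at least {chebyshev:.1%}")
```

Every observed percentage is at or above the Chebyshev lower bound, as it must be for any data set.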
2.82
Using MINITAB, the descriptive statistics are:
Descriptive Statistics: Deaths
Variable    N    Mean   StDev  Minimum    Q1  Median     Q3  Maximum
Deaths     27   163.4   227.4      4.0  29.0    68.0  184.0    955.0
Since the data are not mound-shaped, we will use Chebyshev’s Rule. At least 8/9 of the observations will fall within 3 standard deviations of the mean. This interval is:
$\bar{x} \pm 3s \Rightarrow 163.4 \pm 3(227.4) \Rightarrow 163.4 \pm 682.2 \Rightarrow (-518.8, 845.6)$. Since no observations can be negative, most observations will fall between 0 and 845.6.
2.83
Using MINITAB, the descriptive statistics are:
Descriptive Statistics: Q2
Variable  Q1           N    Mean   StDev  Minimum     Q1  Median     Q3  Maximum
Q2        No           1  2.0000       *   2.0000      *  2.0000      *   2.0000
          Undecided    5   4.800   0.447    4.000  4.500   5.000  5.000    5.000
          Yes         30   3.967   0.850    2.000  3.000   4.000  5.000    5.000
The data for those users who believe there should be national standards are close to being mound-shaped and symmetric. Therefore, we will use the Empirical Rule. Approximately 95% of the observations fall within 2 standard deviations of the mean. This interval is:
$\bar{x} \pm 2s \Rightarrow 3.967 \pm 2(.85) \Rightarrow 3.967 \pm 1.70 \Rightarrow (2.267, 5.667)$
2.84
a.
The average ranking for contestants with a first degree who competed for a job with Lord Sugar is
7.796.
b.
Approximately 95% of the observations will fall within 2 standard deviations of the mean. This
interval is:
$\bar{x} \pm 2s \Rightarrow 7.796 \pm 2(4.231) \Rightarrow 7.796 \pm 8.462 \Rightarrow (-.666, 16.258)$. Since no observations can be negative, the interval will be 0 to 16.258.
c.
No. It appears that just the opposite is true. When the prize was a job, the higher the education level of the contestant, the higher the mean rating. When the prize was a partnership, the higher the education level of the contestant, the lower the mean rating.
2.85
a.
The interval $\bar{x} \pm 3s$ for the flexed arm group is $\bar{x} \pm 3s \Rightarrow 59 \pm 3(4) \Rightarrow 59 \pm 12 \Rightarrow (47, 71)$. The interval for the extended arm group is $\bar{x} \pm 3s \Rightarrow 43 \pm 3(2) \Rightarrow 43 \pm 6 \Rightarrow (37, 49)$. We know that at least 8/9 or 88.9% of the observations will fall within 3 standard deviations of the mean using Chebyshev’s Rule. Since these 2 intervals barely overlap, the information supports the researchers’ theory. The shoppers from the flexed arm group are more likely to select vice options than the extended arm group.
b.
The interval $\bar{x} \pm 2s$ for the flexed arm group is $\bar{x} \pm 2s \Rightarrow 59 \pm 2(10) \Rightarrow 59 \pm 20 \Rightarrow (39, 79)$. The interval for the extended arm group is $\bar{x} \pm 2s \Rightarrow 43 \pm 2(15) \Rightarrow 43 \pm 30 \Rightarrow (13, 73)$. Since these two intervals overlap almost completely, the information does not support the researchers’ theory. There does not appear to be any difference between the two groups.
2.86
a.
Yes. The distribution of the buy-side analysts is fairly flat and skewed to the right. The distribution
of the sell-side analysts is more mound shaped and is not spread out as far as the buy-side
distribution. Since the buy-side distribution is more spread out, the variance of the buy-side
distribution will be larger than the variance of the sell-side distribution. Because the buy-side
distribution is skewed to the right, the mean will be pulled to the right. Thus, the mean of the buy-side distribution will be greater than the mean of the sell-side distribution.
b.
Since the sell-side distribution is fairly mound-shaped, we can use the Empirical Rule. The Empirical Rule says that approximately 95% of the observations will fall within 2 standard deviations of the mean. The interval for the sell-side distribution would be:
$\bar{x} \pm 2s \Rightarrow -.05 \pm 2(.85) \Rightarrow -.05 \pm 1.7 \Rightarrow (-1.75, 1.65)$
Since the buy-side distribution is skewed to the right, we cannot use the Empirical Rule. Thus, we will use Chebyshev’s Rule. We know that at least $(1 - 1/k^2)$ of the observations will fall within k standard deviations of the mean. If we choose k = 4, then $(1 - 1/4^2) = .9375$ or 93.75%. This is very close to the 95% requested in the problem. The interval for the buy-side distribution to contain at least 93.75% of the observations would be: $\bar{x} \pm 4s \Rightarrow .85 \pm 4(1.93) \Rightarrow .85 \pm 7.72 \Rightarrow (-6.87, 8.57)$
Note: This interval will contain at least 93.75% of the observations. It may contain more than 93.75% of the observations.
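The choice of k = 4 can be checked by inverting Chebyshev’s bound: solving 1 - 1/k^2 = .95 gives the k that would guarantee exactly 95%. The sketch below does this; the function name `chebyshev_k` is an illustrative assumption.

```python
import math

def chebyshev_k(coverage):
    """Value of k with 1 - 1/k^2 equal to the coverage (valid for 0 <= coverage < 1)."""
    return 1 / math.sqrt(1 - coverage)

mean, sd = 0.85, 1.93  # buy-side values used in the solution above

k_exact = chebyshev_k(0.95)       # k needed to guarantee 95%
k_used = 4                        # the solution uses k = 4
guaranteed = 1 - 1 / k_used ** 2  # coverage guaranteed by k = 4

print(f"k for 95%: {k_exact:.2f}; k = {k_used} guarantees {guaranteed:.4f}")
print(f"interval: ({mean - k_used * sd:.2f}, {mean + k_used * sd:.2f})")
```

Guaranteeing a full 95% would require k of about 4.47, which is why k = 4 gives the slightly lower 93.75%.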
2.87
Since we do not know if the distribution of the heights of the trees is mound-shaped, we need to apply Chebyshev’s Rule. We know $\mu = 30$ and $\sigma = 3$. Therefore, $\mu \pm 3\sigma \Rightarrow 30 \pm 3(3) \Rightarrow 30 \pm 9 \Rightarrow (21, 39)$.
According to Chebyshev’s Rule, at least 8/9 = .89 of the tree heights on this piece of land fall within this interval and at most 1/9 = .11 of the tree heights will fall above the interval. However, the buyer will only purchase the land if at least $\frac{1000}{5000} = .20$ of the tree heights are at least 40 feet tall. Therefore, the buyer should not buy the piece of land.
2.88
a.
Since we do not have any idea of the shape of the distribution of SAT-Math score changes, we must use Chebyshev’s Theorem. We know that at least 8/9 of the observations will fall within 3 standard deviations of the mean. This interval would be:
$\bar{x} \pm 3s \Rightarrow 19 \pm 3(65) \Rightarrow 19 \pm 195 \Rightarrow (-176, 214)$
Thus, for a randomly selected student, we could be pretty sure that this student’s score would be anywhere from 176 points below his/her previous SAT-Math score to 214 points above his/her previous SAT-Math score.
b.
Since we do not have any idea of the shape of the distribution of SAT-Verbal score changes, we must use Chebyshev’s Theorem. We know that at least 8/9 of the observations will fall within 3 standard deviations of the mean. This interval would be:
$\bar{x} \pm 3s \Rightarrow 7 \pm 3(49) \Rightarrow 7 \pm 147 \Rightarrow (-140, 154)$
Thus, for a randomly selected student, we could be pretty sure that this student’s score would be anywhere from 140 points below his/her previous SAT-Verbal score to 154 points above his/her previous SAT-Verbal score.
c.
A change of 140 points on the SAT-Math would be a little less than 2 standard deviations from the mean. A change of 140 points on the SAT-Verbal would be a little less than 3 standard deviations from the mean. Since the 140-point change for the SAT-Math is not as big a change as the 140-point change on the SAT-Verbal, it is more likely that the score was an SAT-Math score.
2.89
We know $\mu = 25$ and $\sigma = .1$. Therefore, $\mu \pm 2\sigma \Rightarrow 25 \pm 2(.1) \Rightarrow 25 \pm .2 \Rightarrow (24.8, 25.2)$
The machine is shut down for adjustment if the contents of two consecutive bags fall more than 2 standard deviations from the mean (i.e., outside the interval (24.8, 25.2)). Therefore, the machine was shut down yesterday at 11:30 (25.23 and 25.25 are outside the interval) and again at 4:00 (24.71 and 25.31 are outside the interval).
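The shutdown rule can be sketched as a scan for two consecutive readings outside the 2-standard-deviation interval. The full sequence of bag weights is not reproduced in this solution, so the list below is hypothetical; only the four values 25.23, 25.25, 24.71, and 25.31 come from the text.

```python
mu, sigma = 25, 0.1
low, high = mu - 2 * sigma, mu + 2 * sigma  # (24.8, 25.2)

# Hypothetical sequence of bag weights; only the four values named in the
# solution (25.23, 25.25, 24.71, 25.31) come from the text.
weights = [25.10, 25.23, 25.25, 24.95, 25.05, 24.71, 25.31]

def outside(x):
    """True if x is more than 2 standard deviations from the mean."""
    return x < low or x > high

# Indices where a bag and the one before it are both outside the interval
shutdowns = [i for i in range(1, len(weights))
             if outside(weights[i - 1]) and outside(weights[i])]

print("shutdown after bags at positions:", shutdowns)
```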
2.90
a.
$z = \frac{x - \bar{x}}{s} = \frac{40 - 30}{5} = 2$ (sample)
2 standard deviations above the mean.
b.
$z = \frac{x - \mu}{\sigma} = \frac{90 - 89}{2} = .5$ (population)
.5 standard deviations above the mean.
c.
$z = \frac{x - \mu}{\sigma} = \frac{50 - 50}{5} = 0$ (population)
0 standard deviations above the mean.
d.
$z = \frac{x - \bar{x}}{s} = \frac{20 - 30}{4} = -2.5$ (sample)
2.5 standard deviations below the mean.
2.91
Using the definition of a percentile:
      Percentile    Percentage Above    Percentage Below
a.    75th          25%                 75%
b.    50th          50%                 50%
c.    20th          80%                 20%
d.    84th          16%                 84%
2.92
QL corresponds to the 25th percentile. QM corresponds to the 50th percentile. QU corresponds to the 75th
percentile.
2.93
We first compute z-scores for each x value.
a.
$z = \frac{x - \mu}{\sigma} = \frac{100 - 50}{25} = 2$
b.
$z = \frac{x - \mu}{\sigma} = \frac{1 - 4}{1} = -3$
c.
$z = \frac{x - \mu}{\sigma} = \frac{0 - 200}{100} = -2$
d.
$z = \frac{x - \mu}{\sigma} = \frac{10 - 5}{3} = 1.67$
The above z-scores indicate that the x value in part a lies the greatest distance above the mean and the
x value of part b lies the greatest distance below the mean.
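The four z-scores and the comparison between them can be reproduced with a short sketch. The helper name `z_score` and the dictionary layout are illustrative; the numbers are those used in parts a through d.

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / sd

# (x, mean, standard deviation) for parts a-d
parts = {"a": (100, 50, 25), "b": (1, 4, 1), "c": (0, 200, 100), "d": (10, 5, 3)}
z = {label: z_score(*values) for label, values in parts.items()}

for label, value in z.items():
    print(f"{label}: z = {value:.2f}")

print("farthest above the mean:", max(z, key=z.get))
print("farthest below the mean:", min(z, key=z.get))
```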
2.94
Since the element 40 has a z-score of -2 and 90 has a z-score of 3,
$-2 = \frac{40 - \mu}{\sigma} \Rightarrow -2\sigma = 40 - \mu \Rightarrow \mu - 2\sigma = 40 \Rightarrow \mu = 40 + 2\sigma$
and
$3 = \frac{90 - \mu}{\sigma} \Rightarrow 3\sigma = 90 - \mu \Rightarrow \mu + 3\sigma = 90$
By substitution, $40 + 2\sigma + 3\sigma = 90 \Rightarrow 5\sigma = 50 \Rightarrow \sigma = 10$ and $\mu = 40 + 2(10) = 60$.
Therefore, the population mean is 60 and the standard deviation is 10.
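The substitution above amounts to solving two linear equations, x = μ + zσ, for μ and σ. A minimal sketch follows; the function name `mean_sd_from_z` is an illustrative assumption.

```python
def mean_sd_from_z(x1, z1, x2, z2):
    """Solve x1 = mu + z1*sigma and x2 = mu + z2*sigma for (mu, sigma)."""
    sigma = (x2 - x1) / (z2 - z1)  # subtracting the equations eliminates mu
    mu = x1 - z1 * sigma
    return mu, sigma

mu, sigma = mean_sd_from_z(40, -2, 90, 3)
print(f"mu = {mu}, sigma = {sigma}")
```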
2.95
The mean score of U.S. eighth-graders on a mathematics assessment test is 282. This is the average score.
The 25th percentile is 258. This means that 25% of the U.S. eighth-graders score below 258 on the test and
75% score higher. The 75th percentile is 308. This means that 75% of the U.S. eighth-graders score below
308 on the test and 25% score higher. The 90th percentile is 329. This means that 90% of the U.S.
eighth-graders score below 329 on the test and 10% score higher.
2.96
a.
$z = \frac{x - \bar{x}}{s} = \frac{400 - 353}{30} = 1.57$. A transformer with 400 sags in a week is 1.57 standard
deviations above the mean.
b.
$z = \frac{x - \bar{x}}{s} = \frac{100 - 184}{25} = -3.36$. A transformer with 100 swells in a week is 3.36 standard
deviations below the mean.
2.97
A mean current salary of $57,000 indicates that the average current salary of the University of South
Florida graduates is $57,000. At mid-career, half of the University of South Florida graduates had a salary
less than $48,000 and half had salaries greater than $48,000. At mid-career, 90% of the University of
South Florida graduates had salaries under $131,000 and 10% had salaries greater than $131,000.
2.98
a.
From Exercise 2.81, $\bar{x} = 94.78$ and $s = 4.99$. The z-score for an observation of 73 is:
$z = \frac{x - \bar{x}}{s} = \frac{73 - 94.78}{4.99} = -4.36$
This z-score indicates that an observation of 73 is 4.36 standard deviations below the mean. Very
few observations will be lower than this one.
b.
The z-score for an observation of 91 is:
$z = \frac{x - \bar{x}}{s} = \frac{91 - 94.78}{4.99} = -0.76$
This z-score indicates that an observation of 91 is .76 standard deviations below the mean. This score
is not an unusual observation in the data set.
2.99
Since the 90th percentile of the study sample in the subdivision was .00372 mg/L, which is less than the
USEPA level of .015 mg/L, the water customers in the subdivision are not at risk of drinking water with
unhealthy lead levels.
2.100
The z-score associated with a score of 155 is $z = \frac{x - \bar{x}}{s} = \frac{155 - 67.755}{26.871} = 3.25$. This score would not be
considered a typical level of support. It is 3.25 standard deviations above the mean. Very few observations
would be above this value.
2.101
The average ROE is 13.93. The median ROE is 14.86, meaning 50% of firms have ROE below 14.86. The
5th percentile is -19.64, meaning 5% of firms have ROE below -19.64. The 25th percentile is 7.59, meaning
25% of firms have ROE below 7.59. The 75th percentile is 21.32, meaning 75% of firms have ROE below
21.32. The 95th percentile is 38.42, meaning 95% of firms have ROE below 38.42. The standard deviation
is 21.65. Most observations will fall within 2s, or 43.30 units, of the mean. The distribution will be somewhat
skewed to the left, as the 5th percentile value is much farther from the median than the 95th percentile value.
2.102
a.
Since the data are approximately mound-shaped, we can use the Empirical Rule.
On the blue exam, the mean is 53% and the standard deviation is 15%. We know that approximately
68% of all students will score within 1 standard deviation of the mean. This interval is:
$\bar{x} \pm s \Rightarrow 53 \pm 15 \Rightarrow (38, 68)$
About 95% of all students will score within 2 standard deviations of the mean. This interval is:
$\bar{x} \pm 2s \Rightarrow 53 \pm 2(15) \Rightarrow 53 \pm 30 \Rightarrow (23, 83)$
About 99.7% of all students will score within 3 standard deviations of the mean. This interval is:
$\bar{x} \pm 3s \Rightarrow 53 \pm 3(15) \Rightarrow 53 \pm 45 \Rightarrow (8, 98)$
b.
Since the data are approximately mound-shaped, we can use the Empirical Rule.
On the red exam, the mean is 39% and the standard deviation is 12%. We know that approximately
68% of all students will score within 1 standard deviation of the mean. This interval is:
$\bar{x} \pm s \Rightarrow 39 \pm 12 \Rightarrow (27, 51)$
About 95% of all students will score within 2 standard deviations of the mean. This interval is:
$\bar{x} \pm 2s \Rightarrow 39 \pm 2(12) \Rightarrow 39 \pm 24 \Rightarrow (15, 63)$
About 99.7% of all students will score within 3 standard deviations of the mean. This interval is:
$\bar{x} \pm 3s \Rightarrow 39 \pm 3(12) \Rightarrow 39 \pm 36 \Rightarrow (3, 75)$
c.
The student would have been more likely to have taken the red exam. For the blue exam, we know
that approximately 95% of all scores will be from 23% to 83%. The observed 20% score does not
fall in this range. For the red exam, we know that approximately 95% of all scores will be from 15%
to 63%. The observed 20% score does fall in this range. Thus, it is more likely that the student
would have taken the red exam.
2.103
a.
The z-score for Harvard is z = 5.08. This means that Harvard's productivity score was 5.08 standard
deviations above the mean. This is extremely high and extremely unusual.
b.
The z-score for Howard University is z = -.85. This means that Howard University's productivity
score was .85 standard deviations below the mean. This is not an unusual z-score.
c.
Yes. Other indicators that the distribution is skewed to the right are the values of the highest and
lowest z-scores. The lowest z-score is less than 1 standard deviation below the mean while the
highest z-score is 5.08 standard deviations above the mean.
Using MINITAB, the histogram of the z-scores is:
[Histogram of Z-Score: Frequency (0 to 70) versus Z-Score (-1 to 5)]
This histogram does imply that the data are skewed to the right.
2.104
a.
From the problem, $\mu = 2.7$ and $\sigma = .5$
$z = \frac{x - \mu}{\sigma} \Rightarrow z\sigma = x - \mu \Rightarrow x = \mu + z\sigma$
For z = 2.0, $x = 2.7 + 2.0(.5) = 3.7$
For z = -1.0, $x = 2.7 - 1.0(.5) = 2.2$
For z = .5, $x = 2.7 + .5(.5) = 2.95$
For z = -2.5, $x = 2.7 - 2.5(.5) = 1.45$
b.
For z = -1.6, $x = 2.7 - 1.6(.5) = 1.9$
c.
If we assume the distribution of GPAs is approximately mound-shaped, we can use the Empirical
Rule.
From the Empirical Rule, we know that $\approx .025$ or $\approx 2.5\%$ of the students will have GPAs above 3.7
(with z = 2). Thus, the GPA corresponding to summa cum laude (top 2.5%) will be greater than 3.7
(z > 2).
We know that $\approx .16$ or 16% of the students will have GPAs above 3.2 (z = 1). Thus, the limit on
GPAs for cum laude (top 16%) will be greater than 3.2 (z > 1).
We must assume the distribution is mound-shaped.
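The conversions in Exercise 2.104 all use the rearranged formula $x = \mu + z\sigma$. The sketch below repeats them; `gpa_from_z` is a hypothetical helper name, and results are rounded to two decimals only to suppress floating-point noise.

```python
MU, SIGMA = 2.7, 0.5  # GPA mean and standard deviation stated in Exercise 2.104

def gpa_from_z(z, mu=MU, sigma=SIGMA):
    """Invert z = (x - mu) / sigma to recover the raw value x."""
    return mu + z * sigma

# z-scores from parts a and b, plus z = 1 used for the cum laude cutoff in part c
gpas = {z: round(gpa_from_z(z), 2) for z in (2.0, -1.0, 0.5, -2.5, -1.6, 1.0)}
for z, gpa in gpas.items():
    print(f"z = {z:+.1f} -> GPA = {gpa}")
```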
2.105
Not necessarily. Because the distribution is highly skewed to the right, the standard deviation is very large.
Remember that the z-score represents the number of standard deviations a score is from the mean. If the
standard deviation is very large, then the z-scores for observations somewhat near the mean will appear to
be fairly small. If we deleted the schools with the very high productivity scores and recomputed the mean
and standard deviation, the standard deviation would be much smaller. Thus, most of the z-scores would be
larger because we would be dividing by a much smaller standard deviation. This would imply a bigger
spread among the rest of the schools than the original distribution with the few outliers.
2.106
To determine if the measurements are outliers, compute the z-score.
a.
$z = \frac{x - \bar{x}}{s} = \frac{65 - 57}{11} = .727$. Since the z-score is less than 3, this would not be an outlier.
b.
$z = \frac{x - \bar{x}}{s} = \frac{21 - 57}{11} = -3.273$. Since the z-score is greater than 3 in absolute value, this would be an
outlier.
c.
$z = \frac{x - \bar{x}}{s} = \frac{72 - 57}{11} = 1.364$. Since the z-score is less than 3, this would not be an outlier.
d.
$z = \frac{x - \bar{x}}{s} = \frac{98 - 57}{11} = 3.727$. Since the z-score is greater than 3 in absolute value, this would be an
outlier.
2.107
The interquartile range is $\text{IQR} = Q_U - Q_L = 85 - 60 = 25$.
The lower inner fence $= Q_L - 1.5(\text{IQR}) = 60 - 1.5(25) = 22.5$.
The upper inner fence $= Q_U + 1.5(\text{IQR}) = 85 + 1.5(25) = 122.5$.
The lower outer fence $= Q_L - 3(\text{IQR}) = 60 - 3(25) = -15$.
The upper outer fence $= Q_U + 3(\text{IQR}) = 85 + 3(25) = 160$.
With only this information, the box plot would look something like the following:
[Box plot sketch on an axis from 10 to 110: a box with whiskers on both sides and one point marked * beyond the lower whisker]
The whiskers extend to the inner fences unless no data points are that small or that large. The upper inner
fence is 122.5. However, the largest data point is 100, so the whisker stops at 100. The lower inner fence
is 22.5. The smallest data point is 18, so the whisker extends to 22.5. Since 18 is between the inner and
outer fences, it is designated with a *. We do not know if there is any more than one data point below 22.5,
so we cannot be sure that the box plot is entirely correct.
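The fence arithmetic in Exercise 2.107 can be sketched as follows. The names `fences` and `classify` are hypothetical helpers; the labels "suspect" and "highly suspect" follow the wording this chapter uses for points beyond the inner and outer fences.

```python
def fences(q_lower, q_upper):
    """Return the IQR with the inner (1.5 IQR) and outer (3 IQR) fences."""
    iqr = q_upper - q_lower
    return {
        "iqr": iqr,
        "inner": (q_lower - 1.5 * iqr, q_upper + 1.5 * iqr),
        "outer": (q_lower - 3 * iqr, q_upper + 3 * iqr),
    }

def classify(x, f):
    """Label x relative to the fences computed by fences()."""
    lo_in, hi_in = f["inner"]
    lo_out, hi_out = f["outer"]
    if x < lo_out or x > hi_out:
        return "highly suspect outlier"
    if x < lo_in or x > hi_in:
        return "suspect outlier"
    return "not an outlier"

f = fences(60, 85)  # QL = 60 and QU = 85 from Exercise 2.107
print(f)
print(18, classify(18, f))    # smallest data point
print(100, classify(100, f))  # largest data point
```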
2.108
a.
Median is approximately 4.
b.
QL is approximately 3 (Lower Quartile)
QU is approximately 6 (Upper Quartile)
c.
$\text{IQR} = Q_U - Q_L \approx 6 - 3 = 3$
d.
The data set is skewed to the right since the right whisker is longer than the left, there is one outlier,
and there are two potential outliers.
e.
50% of the measurements are to the right of the median and 75% are to the left of the upper quartile.
f.
The upper inner fence is $Q_U + 1.5(\text{IQR}) = 6 + 1.5(3) = 10.5$. The upper outer fence is
$Q_U + 3(\text{IQR}) = 6 + 3(3) = 15$. Thus, there are two suspect outliers, 12 and 13. There is one highly
suspect outlier, 16.
2.109
a.
Using MINITAB, the box plots for samples A and B are:
[Boxplot of Sample A, Sample B: Data axis from 100 to 200]
b.
In sample A, the measurement 84 is an outlier. This measurement falls outside the lower outer fence.
Lower outer fence = Lower hinge $- 3(\text{IQR}) \approx 150 - 3(172 - 150) = 150 - 3(22) = 84$
Lower inner fence = Lower hinge $- 1.5(\text{IQR}) \approx 150 - 1.5(22) = 117$
Upper inner fence = Upper hinge $+ 1.5(\text{IQR}) \approx 172 + 1.5(22) = 205$
In addition, 100 may be an outlier. It lies outside the inner fence.
In sample B, 140 and 206 may be outliers. The point 140 lies outside the inner fence while the point
206 lies right at the inner fence.
Lower outer fence = Lower hinge $- 3(\text{IQR}) \approx 168 - 3(184 - 169) = 168 - 3(15) = 123$
Lower inner fence = Lower hinge $- 1.5(\text{IQR}) \approx 168 - 1.5(15) = 145.5$
Upper inner fence = Upper hinge $+ 1.5(\text{IQR}) \approx 184 + 1.5(15) = 206.5$
2.110
a.
Using MINITAB, the descriptive statistics are:
Descriptive Statistics: Acad Rep Score
Variable          N    Mean    Minimum   Q1      Median   Q3      Maximum   IQR
Acad Rep Score    50   76.42   47.00     64.75   76.00    89.00   100.00    24.25
The median is 76, the lower quartile is 64.75, and the upper quartile is 89.
b.
$\text{IQR} = Q_U - Q_L = 89 - 64.75 = 24.25$
c.
Using MINITAB, the boxplot is:
[Boxplot of Academic Rep Score: axis from 40 to 100]
d.
Suspect outliers lie between $Q_L - 1.5(\text{IQR})$ and $Q_L - 3(\text{IQR})$ or between $Q_U + 1.5(\text{IQR})$ and $Q_U + 3(\text{IQR})$.
$Q_L - 1.5(\text{IQR}) = 64.75 - 1.5(24.25) = 28.375$, $Q_L - 3(\text{IQR}) = 64.75 - 3(24.25) = -8$
$Q_U + 1.5(\text{IQR}) = 89 + 1.5(24.25) = 125.375$, $Q_U + 3(\text{IQR}) = 89 + 3(24.25) = 161.75$
No scores are less than 28.375 nor larger than 125.375. Therefore, there are no outliers or suspect
outliers.
2.111
a.
$z = \frac{x - \bar{x}}{s} = \frac{400 - 353}{30} = 1.57$. Since the z-score is less than 2, 400 sags per week would not be
considered unusual.
b.
$z = \frac{x - \bar{x}}{s} = \frac{100 - 184}{25} = -3.36$. Since the absolute value of the z-score is greater than 3, 100 swells per
week would be considered unusual.
2.112
a.
The approximate 25th percentile PASI score before treatment is 10. The approximate median before
treatment is 15. The approximate 75th percentile PASI score before treatment is 28.
b.
The approximate 25th percentile PASI score after treatment is 3. The approximate median after
treatment is 5. The approximate 75th percentile PASI score after treatment is 7.5.
c.
Since the 75th percentile after treatment is lower than the 25th percentile before treatment, it appears
that the ichthyotherapy is effective in treating psoriasis.
2.113
a.
The average expenditure per full-time employee is $6,563. The median expenditure per employee is
$6,232. Half of all expenditures per employee were less than $6,232 and half were greater than
$6,232. The lower quartile is $5,309. Twenty-five percent of all expenditures per employee were
below $5,309. The upper quartile is $7,216. Seventy-five percent of all expenditures per employee
were below $7,216.
b.
$\text{IQR} = Q_U - Q_L = \$7{,}216 - \$5{,}309 = \$1{,}907$.
c.
The interquartile range goes from the 25th percentile to the 75th percentile. Thus, $.75 - .25 = .5$ of the
1,751 army hospitals have expenses between $5,309 and $7,216.
2.114
a.
From the printout, $\bar{x} = 52.334$ and $s = 9.224$.
The highest salary is 75 (thousand).
The z-score is $z = \frac{x - \bar{x}}{s} = \frac{75 - 52.334}{9.224} = 2.46$
Therefore, the highest salary is 2.46 standard deviations above the mean.
The lowest salary is 35.0 (thousand).
The z-score is $z = \frac{x - \bar{x}}{s} = \frac{35.0 - 52.334}{9.224} = -1.88$
Therefore, the lowest salary is 1.88 standard deviations below the mean.
The mean salary offer is 52.33 (thousand).
The z-score is $z = \frac{x - \bar{x}}{s} = \frac{52.33 - 52.334}{9.224} = 0$
The mean salary offer is 0 standard deviations from the mean.
No, the highest salary offer is not unusually high. For any distribution, at least 8/9 of the salaries
should have z-scores between -3 and 3. A z-score of 2.46 would not be that unusual.
Since no salaries are outside the inner fences, none of them are suspect or highly suspect outliers.
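The 8/9 figure quoted in Exercise 2.114 comes from Chebyshev's Rule, which guarantees that at least $1 - 1/k^2$ of any data set lies within k standard deviations of the mean (for k > 1). A minimal sketch, using the hypothetical helper name `chebyshev_lower_bound` and exact fractions so that 8/9 is shown without rounding:

```python
from fractions import Fraction

def chebyshev_lower_bound(k):
    """Minimum fraction of observations within k standard deviations (k > 1)."""
    if k <= 1:
        raise ValueError("Chebyshev's Rule is informative only for k > 1")
    return 1 - Fraction(1) / (Fraction(k) ** 2)

print(chebyshev_lower_bound(2))  # bound for 2 standard deviations
print(chebyshev_lower_bound(3))  # bound for 3 standard deviations

# z-score of the highest salary offer, using the printout values in Exercise 2.114
z_highest = (75 - 52.334) / 9.224
print(round(z_highest, 2), abs(z_highest) < 3)
```

Because the bound holds for any distribution, it is weaker than the Empirical Rule, which applies only to mound-shaped data.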
2.115
a.
Using MINITAB, the boxplots for each type of firm are:
[Boxplot of Time: Votes categories Joint, None, and Prepack; Time axis from 0 to 10]
b.
The median bankruptcy time for No prefiling firms is about 3.2. The median bankruptcy time for
Joint firms is about 1.5. The median bankruptcy time for Prepack firms is about 1.4.
c.
The range of the “Prepack” firms is less than the other two, while the range of the “None” firms is the
largest. The interquartile range of the “Prepack” firms is less than the other two, while the
interquartile range of the “Joint” firms is larger than the other two.
d.
No. The interquartile range for the “Prepack” firms is the smallest which corresponds to the smallest
standard deviation. However, the second smallest interquartile range corresponds to the “None”
firms. The second smallest standard deviation corresponds to the “Joint” firms.
e.
Yes. There is evidence of two outliers in the “Prepack” firms. These are indicated by the two *’s.
There is also evidence of two outliers in the “None” firms. These are indicated by the two *’s.
2.116
From Exercise 2.100, $\bar{x} = 67.755$ and $s = 26.87$. Using MINITAB, a boxplot of the data is:
[Boxplot of Support: axis from 0 to 160]
From the boxplot, the support level of 155 would be an outlier. From Exercise 2.100, we found the z-score
associated with a score of 155 as $z = \frac{x - \bar{x}}{s} = \frac{155 - 67.755}{26.871} = 3.25$. Since this z-score is greater than 3, the
observation 155 is considered an outlier.
2.117
a.
Using MINITAB, the boxplot is:
[Boxplot of SCORE: axis from 70 to 100]
From the boxplot, there appear to be 4 outliers: 69, 73, 76, and 78.
b.
From Exercise 2.81, $\bar{x} = 94.78$ and $s = 4.99$. Since the data are skewed to the left, we will consider
observations more than 2 standard deviations from the mean to be outliers. An observation with a
z-score of 2 would have the value:
$z = \frac{x - \bar{x}}{s} \Rightarrow 2 = \frac{x - 94.78}{4.99} \Rightarrow 2(4.99) = x - 94.78 \Rightarrow 9.98 = x - 94.78 \Rightarrow x = 104.76$
An observation with a z-score of -2 would have the value:
$z = \frac{x - \bar{x}}{s} \Rightarrow -2 = \frac{x - 94.78}{4.99} \Rightarrow -2(4.99) = x - 94.78 \Rightarrow -9.98 = x - 94.78 \Rightarrow x = 84.80$
Observations greater than 104.76 or less than 84.80 would be considered outliers. Using this
criterion, the following observations would be outliers: 69, 73, 76, and 78.
c.
Yes, the two methods agree. Using the boxplot, 4 observations were identified as
outliers. Using the z-score method, the same 4 observations were identified as outliers.
2.118
a.
Using MINITAB, the box plot is:
[Boxplot of Downtime: axis from 0 to 70]
The median is about 18. The data appear to be skewed to the right since there are 3 suspect outliers
to the right and none to the left. The variability of the data is fairly small because the IQR is fairly
small, approximately 26 - 10 = 16.
b.
The customers associated with the suspected outliers are customers 268, 269, and 264.
c.
In order to find the z-scores, we must first find the mean and standard deviation.
$\bar{x} = \frac{\sum x}{n} = \frac{815}{40} = 20.375$
$s^2 = \frac{\sum x^2 - \frac{(\sum x)^2}{n}}{n - 1} = \frac{24129 - \frac{815^2}{40}}{40 - 1} = 192.90705$
$s = \sqrt{192.90705} = 13.89$
The z-scores associated with the suspected outliers are:
Customer 268: $z = \frac{49 - 20.375}{13.89} = 2.06$
Customer 269: $z = \frac{50 - 20.375}{13.89} = 2.13$
Customer 264: $z = \frac{64 - 20.375}{13.89} = 3.14$
All the z-scores are greater than 2. These are unusual values.
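The computation in Exercise 2.118 part c uses the shortcut formula for the sample variance, which needs only n, the sum of the observations, and the sum of their squares. The sketch below reproduces it from the three totals quoted in the solution; the function name `sample_variance_shortcut` and the dictionary of customers are illustrative assumptions.

```python
import math

def sample_variance_shortcut(sum_x, sum_x2, n):
    """Sample variance from summary totals: (sum x^2 - (sum x)^2 / n) / (n - 1)."""
    return (sum_x2 - sum_x ** 2 / n) / (n - 1)

n, sum_x, sum_x2 = 40, 815, 24129  # totals quoted in Exercise 2.118
mean = sum_x / n
variance = sample_variance_shortcut(sum_x, sum_x2, n)
sd = math.sqrt(variance)

# downtimes of the three suspected outliers, keyed by customer number
suspects = {268: 49, 269: 50, 264: 64}
z_scores = {cust: (x - mean) / sd for cust, x in suspects.items()}

print(f"mean = {mean}, s^2 = {variance:.5f}, s = {sd:.2f}")
for cust, z in z_scores.items():
    print(f"Customer {cust}: z = {z:.2f}")
```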
2.119
Using MINITAB, the boxplots of the data are:
[Boxplot of PermA, PermB, PermC: Data axis from 50 to 150]
The descriptive statistics are:
Descriptive Statistics: PermA, PermB, PermC
Variable   N     Mean     StDev   Minimum   Q1       Median   Q3       Maximum   IQR
PermA      100   73.62    14.48   55.20     62.00    70.45    81.42    122.40    19.42
PermB      100   128.54   21.97   50.40     108.65   139.30   147.02   150.00    38.37
PermC      100   83.07    20.05   52.20     67.72    78.65    95.35    129.00    27.63
a.
For group A, the suspect outliers are any observations greater than
$Q_U + 1.5(\text{IQR}) = 81.42 + 1.5(19.42) = 110.55$ or less than $Q_L - 1.5(\text{IQR}) = 62 - 1.5(19.42) = 32.87$. There
are 3 observations greater than 110.55: 117.3, 118.5, and 122.4.
b.
For group B, the suspect outliers are any observations greater than
$Q_U + 1.5(\text{IQR}) = 147.02 + 1.5(38.37) = 204.575$ or less than $Q_L - 1.5(\text{IQR}) = 108.65 - 1.5(38.37) = 51.095$.
There is 1 observation less than 51.095: 50.4.
c.
For group C, the suspect outliers are any observations greater than
$Q_U + 1.5(\text{IQR}) = 95.35 + 1.5(27.63) = 136.795$ or less than $Q_L - 1.5(\text{IQR}) = 67.72 - 1.5(27.63) = 26.275$.
No observations are greater than 136.795 or less than 26.275.
d.
For group A, if the outliers are removed, the mean will decrease, the median will slightly decrease,
and the standard deviation will decrease. For group B, if the outlier is removed, the mean will
increase, the median will slightly increase, and the standard deviation will decrease.
2.120
For Perturbed Intrinsics, but no Perturbed Projections:
$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{8.1}{5} = 1.62$
$s^2 = \frac{\sum_{i=1}^{n} x_i^2 - \frac{\left(\sum_{i=1}^{n} x_i\right)^2}{n}}{n - 1} = \frac{15.63 - \frac{8.1^2}{5}}{5 - 1} = \frac{2.508}{4} = .627$
$s = \sqrt{s^2} = \sqrt{.627} = .792$
The z-score corresponding to a value of 4.5 is $z = \frac{x - \bar{x}}{s} = \frac{4.5 - 1.62}{.792} = 3.64$
Since this z-score is greater than 3, we would consider this an outlier for perturbed intrinsics, but no
perturbed projections.
For Perturbed Projections, but no Perturbed Intrinsics:
$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{125.8}{5} = 25.16$
$s^2 = \frac{\sum_{i=1}^{n} x_i^2 - \frac{\left(\sum_{i=1}^{n} x_i\right)^2}{n}}{n - 1} = \frac{3350.1 - \frac{125.8^2}{5}}{5 - 1} = \frac{184.972}{4} = 46.243$
$s = \sqrt{s^2} = \sqrt{46.243} = 6.800$
The z-score corresponding to a value of 4.5 is $z = \frac{x - \bar{x}}{s} = \frac{4.5 - 25.16}{6.800} = -3.038$
Since this z-score is less than -3, we would consider this an outlier for perturbed projections, but no
perturbed intrinsics.
Since the z-score corresponding to 4.5 for the perturbed projections, but no perturbed intrinsics, is smaller in
absolute value than that for perturbed intrinsics, but no perturbed projections, it is more likely that
the type of camera perturbation is perturbed projections, but no perturbed intrinsics.
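The reasoning in Exercise 2.120 is to compute the z-score of the same observation under each group's mean and standard deviation and prefer the group where the observation is less extreme. A sketch under that reading follows; the short group labels and the helper `mean_and_sd` are hypothetical, and the totals are those quoted in the solution.

```python
import math

def mean_and_sd(sum_x, sum_x2, n):
    """Mean and sample standard deviation from summary totals."""
    mean = sum_x / n
    variance = (sum_x2 - sum_x ** 2 / n) / (n - 1)
    return mean, math.sqrt(variance)

# (sum of x, sum of x squared, n) for each perturbation type
groups = {
    "intrinsics only": (8.1, 15.63, 5),
    "projections only": (125.8, 3350.1, 5),
}
observed = 4.5

z_by_group = {}
for name, totals in groups.items():
    mean, sd = mean_and_sd(*totals)
    z_by_group[name] = (observed - mean) / sd

# the group with the smaller |z| makes the observation look less unusual
most_plausible = min(z_by_group, key=lambda name: abs(z_by_group[name]))
print(z_by_group, most_plausible)
```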
2.121
From the stem-and-leaf display in Exercise 2.34, the data are fairly mound-shaped, but skewed somewhat
to the right.
The sample mean is $\bar{x} = \frac{\sum x}{n} = \frac{1493}{25} = 59.72$.
The sample variance is $s^2 = \frac{\sum x^2 - \frac{(\sum x)^2}{n}}{n - 1} = \frac{96{,}885 - \frac{1493^2}{25}}{25 - 1} = 321.7933$.
The sample standard deviation is $s = \sqrt{321.7933} = 17.9386$.
The z-score associated with the largest value is $z = \frac{x - \bar{x}}{s} = \frac{102 - 59.72}{17.9386} = 2.36$.
This observation is a suspect outlier.
The observations associated with the one-time customers are 5 of the largest 7 observations. Thus, repeat
customers tend to have shorter delivery times than one-time customers.
2.122
Using MINITAB, a scatterplot of the data is:
[Scatterplot of Var 2 vs Var 1: Var 2 from 0 to 14, Var 1 from 0 to 8]
2.123
Using MINITAB, the scatterplot is:
[Scatterplot of Var 2 vs Var 1: Var 2 from 0 to 18, Var 1 from 1 to 5]
2.124
Using MINITAB, the scatterplot is:
[Scatterplot of RATIO vs DIAMETER: RATIO from 6.5 to 10.0, DIAMETER from 0 to 700]
It appears that as the pipe diameter increases, the ratio of repair to replacement cost increases.
2.125
From the scatterplot of the data, it appears that as the number of punishments increases, the average payoff
decreases. Thus, there appears to be a negative linear relationship between punishment use and average
payoff. This supports the researchers’ conclusion that “winners don’t punish”.
2.126
Using MINITAB, the scatterplot of the data is:
[Scatterplot of Catch vs Search: Catch from 3000 to 7000, Search from 15 to 35]
There is an apparent negative linear trend between the search frequency and the total catch. As the search
frequency increases, the total catch tends to decrease.
2.127
Using MINITAB, a scattergram of the data is:
[Scatterplot of SLUGPCT vs ELEVATION: SLUGPCT from 0.450 to 0.625, ELEVATION from 0 to 6000]
If we include the observation from Denver, then we would say there might be a linear relationship between
slugging percentage and elevation. If we eliminated the observation from Denver, it appears that there
might not be a relationship between slugging percentage and elevation.
2.128
Using MINITAB, the scatterplot of the data is:
[Scatterplot of MATH2014 vs MATH2010: both axes from 450 to 625]
There appears to be a positive linear trend between the Math SAT scores in 2010 and the Math SAT scores
in 2014. As the 2010 Math SAT scores increase, the 2014 Math SAT scores also tend to increase.
2.129
Using MINITAB, the scatterplot of the data is:
[Scatterplot of Number vs Hour: Number from 100 to 400, Hour from 0 to 12]
There appears to be a positive linear trend to the data. As the hours increase, the number of accidents tends
to increase.
2.130
Using MINITAB, the scatterplot of the data is:
[Scatterplot of Mass vs Time: Mass from 0 to 7, Time from 0 to 60]
There is evidence to indicate that the mass of the spill tends to diminish as time
increases. As time is getting larger, the mass is decreasing.
2.131
a.
Using MINITAB, a scatterplot of the data is:
[Scatterplot of Year2 vs Year1: Year2 from 30 to 55, Year1 from 20 to 60]
There is a moderate positive trend to the data. As the scores for Year1 increase, the scores for Year2
also tend to increase.
b.
From the graph, two agencies that had greater than expected PARS evaluation scores for Year2 were
USAID and State.
2.132
Using MINITAB, the scatterplot of the data is:
[Scatterplot of VALUE vs OPINCOME: VALUE from 1500 to 4000, OPINCOME from 0 to 300]
There is a positive trend to the data. As operating income increases, the 2015 value also tends to increase.
Since the trend is positive, we would recommend that an NFL executive use operating income to predict a
team's current value.
2.133
a.
Using MINITAB, the scatterplot of the data is:
[Scatterplot of Ratio vs Age: Ratio from 500 to 2000, Age from 45 to 80]
There appears to be a weak, negative relationship between a CEO's ratio of salary to worker pay and
the CEO's age.
b.
Using MINITAB, the descriptive statistics are:
Descriptive Statistics: Ratio
Variable   N    Mean    StDev   Minimum   Q1      Median   Q3      Maximum   IQR
Ratio      40   641.8   314.8   415.0     481.0   536.5    660.8   1951.0    179.8
Using the interquartile range, the highly suspect outliers are any observations greater than
$Q_U + 3(\text{IQR}) = 660.8 + 3(179.8) = 1{,}200.2$ or less than $Q_L - 3(\text{IQR}) = 481.0 - 3(179.8) = -58.4$. There are
2 highly suspect outliers: 1,522 and 1,951.
Using the z-score, any observation more than 3 standard deviations above or below the mean is a
highly suspect outlier. Three standard deviations above the mean is:
$z = \frac{x - \bar{x}}{s} \Rightarrow 3 = \frac{x - 641.8}{314.8} \Rightarrow 3(314.8) = x - 641.8 \Rightarrow 944.4 = x - 641.8 \Rightarrow x = 1{,}586.2$
Three standard deviations below the mean is:
$z = \frac{x - \bar{x}}{s} \Rightarrow -3 = \frac{x - 641.8}{314.8} \Rightarrow -3(314.8) = x - 641.8 \Rightarrow -944.4 = x - 641.8 \Rightarrow x = -302.6$
Using this method, there is one highly suspect outlier: 1,951.
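The two rules in part b can disagree because one is built from quartiles and the other from the mean and standard deviation, which the extreme values themselves inflate. The sketch below applies both cutoffs to the two observations named in the solution; the full data set is not reproduced here, so only those two values are checked, and the variable names are illustrative.

```python
# summary statistics from the MINITAB printout in Exercise 2.133
q1, q3, mean, sd = 481.0, 660.8, 641.8, 314.8
iqr = q3 - q1

outer_fences = (q1 - 3 * iqr, q3 + 3 * iqr)  # quartile-based cutoffs
z_limits = (mean - 3 * sd, mean + 3 * sd)    # 3-standard-deviation cutoffs

candidates = [1522, 1951]  # the two largest ratios named in the solution

def outside(x, limits):
    """True when x falls below the lower limit or above the upper limit."""
    low, high = limits
    return x < low or x > high

by_fences = [x for x in candidates if outside(x, outer_fences)]
by_z = [x for x in candidates if outside(x, z_limits)]

print("outer fences:", outer_fences, "->", by_fences)
print("z limits:", z_limits, "->", by_z)
```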
c.
Removing the observation 1,951, the scatterplot of the data is:
[Scatterplot of Ratio-Remove1 vs Age-Remove1: Ratio-Remove1 from 500 to 1500, Age-Remove1 from 45 to 80]
By removing the one highly suspect outlier, the relationship is still negative, but it is a stronger,
negative relationship.
2.134
Using MINITAB, a scatterplot of the data is:
[Scatterplot of ACCURACY vs DISTANCE: ACCURACY from 45 to 75, DISTANCE from 280 to 320]
Yes, his concern is a valid one. From the scatterplot, there appears to be a fairly strong negative
relationship between accuracy and driving distance. As driving distance increases, the driving accuracy
tends to decrease.
2.135
One way the bar graph can mislead the viewer is that the vertical axis has been cut off. Instead of starting
at 0, the vertical axis starts at 12. Another way the bar graph can mislead the viewer is that as the bars get
taller, the widths of the bars also increase.
2.136
a.
If you work for Volkswagen, you would choose to use the median number of deaths because this is
much lower than the mean. The data are skewed to the right, so the median would probably be a
better representation of the middle of the distribution.
b.
If you support an environmental watch group, you would choose to use the mean number of deaths
because this is much greater than the median. The average number of deaths is much higher than the
median number of deaths.
2.137
a.
The graph might be misleading because the scales on the vertical axes are different. The left vertical
axis ranges from 0 to $120 million. The right vertical axis ranges from 0 to $20 billion.
b.
Using MINITAB, the redrawn graph is:
[Time Series Plot of Craigslist, NewspaperAds: Data from 0 to 18000, Year from 2003 to 2009]
Although the amount of revenue produced by Craigslist has increased dramatically from 2003 to
2009, it is still much smaller than the revenue produced by newspaper ad sales.
2.138
a.
This graph is misleading because it looks like as the days are increasing, the number of barrels
collected per day are also increasing. However, the bars are the cumulative number of barrels
collected. The cumulative value can never decrease.
b.
Using MINITAB, the graph of the daily collection of oil is:
[Chart of Barrells: Barrells (0 to 2500) by Day, May-16 through May-23]
This graph shows that there has not been a steady improvement in the suctioning process.
There was an increase for 3 days, then a leveling off for 3 days, then a decrease.
2.139
The relative frequency histogram is:
[Histogram of Class: Relative frequency (0 to .20) versus Measurement Class, with class boundaries 1.125, 2.625, 4.125, 5.625, 7.125, and 8.625]
2.140
The mean is sensitive to extreme values in a data set. Therefore, the median is preferred to the mean when
a data set is skewed in one direction or the other.
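2.141
a.
$z = \frac{x - \mu}{\sigma} = \frac{50 - 60}{10} = -1$, $z = \frac{70 - 60}{10} = 1$, $z = \frac{80 - 60}{10} = 2$
b.
$z = \frac{x - \mu}{\sigma} = \frac{50 - 50}{5} = 0$, $z = \frac{70 - 50}{5} = 4$, $z = \frac{80 - 50}{5} = 6$
c.
$z = \frac{x - \mu}{\sigma} = \frac{50 - 40}{10} = 1$, $z = \frac{70 - 40}{10} = 3$, $z = \frac{80 - 40}{10} = 4$
d.
$z = \frac{x - \mu}{\sigma} = \frac{50 - 40}{100} = .1$, $z = \frac{70 - 40}{100} = .3$, $z = \frac{80 - 40}{100} = .4$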
2.142
a.
If we assume that the data are about mound-shaped, then any observation with a z-score greater than
3 in absolute value would be considered an outlier. From Exercise 2.141, the z-score corresponding
to 50 is -1, the z-score corresponding to 70 is 1, and the z-score corresponding to 80 is 2. Since none
of these z-scores is greater than 3 in absolute value, none would be considered outliers.
b.
From Exercise 2.141, the z-score corresponding to 50 is 0, the z-score corresponding to 70 is 4, and
the z-score corresponding to 80 is 6. Since the z-scores corresponding to 70 and 80 are greater than 3,
70 and 80 would be considered outliers.
c.
From Exercise 2.141, the z-score corresponding to 50 is 1, the z-score corresponding to 70 is 3, and
the z-score corresponding to 80 is 4. Since the z-scores corresponding to 70 and 80 are greater than
or equal to 3, 70 and 80 would be considered outliers.
d.
From Exercise 2.141, the z-score corresponding to 50 is .1, the z-score corresponding to 70 is .3, and
the z-score corresponding to 80 is .4. Since none of these z-scores is greater than 3 in absolute value,
none would be considered outliers.
2.143
a.
$\sum x = 13 + 1 + 10 + 3 + 3 = 30$
$\sum x^2 = 13^2 + 1^2 + 10^2 + 3^2 + 3^2 = 288$
$\bar{x} = \frac{\sum x}{n} = \frac{30}{5} = 6$
$s^2 = \frac{\sum x^2 - \frac{(\sum x)^2}{n}}{n - 1} = \frac{288 - \frac{30^2}{5}}{5 - 1} = \frac{108}{4} = 27$
$s = \sqrt{27} = 5.20$
b.
$\sum x = 13 + 6 + 6 + 0 = 25$
$\sum x^2 = 13^2 + 6^2 + 6^2 + 0^2 = 241$
$\bar{x} = \frac{\sum x}{n} = \frac{25}{4} = 6.25$
$s^2 = \frac{\sum x^2 - \frac{(\sum x)^2}{n}}{n - 1} = \frac{241 - \frac{25^2}{4}}{4 - 1} = \frac{84.75}{3} = 28.25$
$s = \sqrt{28.25} = 5.32$
c.
$\sum x = 1 + 0 + 1 + 10 + 11 + 11 + 15 = 49$
$\sum x^2 = 1^2 + 0^2 + 1^2 + 10^2 + 11^2 + 11^2 + 15^2 = 569$
$\bar{x} = \frac{\sum x}{n} = \frac{49}{7} = 7$
$s^2 = \frac{\sum x^2 - \frac{(\sum x)^2}{n}}{n - 1} = \frac{569 - \frac{49^2}{7}}{7 - 1} = \frac{226}{6} = 37.67$
$s = \sqrt{37.67} = 6.14$
d.
$\sum x = 3 + 3 + 3 + 3 = 12$
$\sum x^2 = 3^2 + 3^2 + 3^2 + 3^2 = 36$
$\bar{x} = \frac{\sum x}{n} = \frac{12}{4} = 3$
$s^2 = \frac{\sum x^2 - \frac{(\sum x)^2}{n}}{n - 1} = \frac{36 - \frac{12^2}{4}}{4 - 1} = \frac{0}{3} = 0$
$s = \sqrt{0} = 0$
2.144
a.
$\sum x = 4 + 6 + 6 + 5 + 6 + 7 = 34$
$\sum x^2 = 4^2 + 6^2 + 6^2 + 5^2 + 6^2 + 7^2 = 198$
$\bar{x} = \frac{\sum x}{n} = \frac{34}{6} = 5.67$
$s^2 = \frac{\sum x^2 - \frac{(\sum x)^2}{n}}{n - 1} = \frac{198 - \frac{34^2}{6}}{6 - 1} = \frac{5.3333}{5} = 1.0667$
$s = \sqrt{1.0667} = 1.03$
b.
$\sum x = -1 + 4 + (-3) + 0 + (-3) + (-6) = -9$
$\sum x^2 = (-1)^2 + 4^2 + (-3)^2 + 0^2 + (-3)^2 + (-6)^2 = 71$
$\bar{x} = \frac{\sum x}{n} = \frac{-9}{6} = -\$1.5$
$s^2 = \frac{\sum x^2 - \frac{(\sum x)^2}{n}}{n - 1} = \frac{71 - \frac{(-9)^2}{6}}{6 - 1} = \frac{57.5}{5} = 11.5$ dollars squared
$s = \sqrt{11.5} = \$3.39$
c.
$\sum x = \frac{3}{5} + \frac{4}{5} + \frac{2}{5} + \frac{1}{5} + \frac{1}{16} = 2.0625$
$\sum x^2 = \left(\frac{3}{5}\right)^2 + \left(\frac{4}{5}\right)^2 + \left(\frac{2}{5}\right)^2 + \left(\frac{1}{5}\right)^2 + \left(\frac{1}{16}\right)^2 = 1.2039$
$\bar{x} = \frac{\sum x}{n} = \frac{2.0625}{5} = .4125\%$
$s^2 = \frac{\sum x^2 - \frac{(\sum x)^2}{n}}{n - 1} = \frac{1.2039 - \frac{2.0625^2}{5}}{5 - 1} = \frac{.3531}{4} = .0883\%$ squared
$s = \sqrt{.0883} = .30\%$
d.
(a) Range $= 7 - 4 = 3$
(b) Range $= \$4 - (-\$6) = \$10$
(c) Range $= \frac{4}{5}\% - \frac{1}{16}\% = \frac{64}{80}\% - \frac{5}{80}\% = \frac{59}{80}\% = .7375\%$
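As a cross-check on Exercise 2.144 part a, the same mean, variance, standard deviation, and range can be obtained directly from the raw data with Python's standard `statistics` module, which uses the n - 1 divisor for `variance` and `stdev`. This is an illustrative check, not the method used in the solution; the variable names are assumptions.

```python
import statistics

data = [4, 6, 6, 5, 6, 7]  # measurements from Exercise 2.144 part a

mean = statistics.mean(data)
variance = statistics.variance(data)  # sample variance (divides by n - 1)
sd = statistics.stdev(data)           # sample standard deviation
data_range = max(data) - min(data)

print(f"mean = {mean:.2f}, s^2 = {variance:.4f}, s = {sd:.2f}, range = {data_range}")
```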
2.145
The range is found by taking the largest measurement in the data set and subtracting the smallest
measurement. Therefore, it only uses two measurements from the whole data set. The standard deviation
uses every measurement in the data set. Therefore, it takes every measurement into account, not just two.
The range is affected by extreme values more than the standard deviation.
2.146
$\sigma \approx \frac{\text{range}}{4} = \frac{20}{4} = 5$
2.147
Using MINITAB, the scatterplot is:
[Scatterplot of Var 2 vs Var 1: Var 2 from 10 to 30, Var 1 from 100 to 500]
2.148
a.
The z-score is $z = \frac{x - \bar{x}}{s} = \frac{30 - 39}{6} = -1.5$. A score of 30 is 1.5 standard deviations below the mean.
b.
Since the data are mound-shaped and symmetric and 39 is the mean, .5 of the sampled drug dealers
will have WR scores below 39.
c.
If 5% of the drug dealers have WR scores above 49, then 95% will have WR scores below 49. Thus,
49 will be the 95th percentile.
2.149
a.
Using MINITAB, the pie chart is:
[Pie Chart of Blog/Forum: Company 38.5%, Employees 34.6%, Third Party 11.5%, Not Identified 15.4%]
Companies and Employees represent (38.5 + 34.6 = 73.1) slightly more than 73% of the entities
creating blogs/forums. Third parties are the least common entity.
b.
Using Chebyshevโs Rule, at least 75% of the observations will fall within 2 standard deviations of the
mean.
x ๏ฑ 2 s ๏ 4.25 ๏ฑ 2(12.02) ๏ 4.25 ๏ฑ 24.04 ๏ ( ๏ญ19.79, 28.29) or (0, 28.29) since we cannot have a
negative number blogs.
2.150
c.
We would expect the distribution to be skewed to the right. We know that we cannot have a negative
number of blogs/forums. Even 1 standard deviation below the mean is a negative number. We would
assume that there are a few very large observations because the standard deviation is so big compared
to the mean.
a.
To find relative frequencies, we divide the frequencies of each category by the total number of
incidents. The relative frequencies of the number of incidents for each of the cause categories are:
Management System Cause Category    Number of Incidents    Relative Frequency
Engineering & Design                27                     27 / 83 = .325
Procedures & Practices              24                     24 / 83 = .289
Management & Oversight              22                     22 / 83 = .265
Training & Communication            10                     10 / 83 = .120
TOTAL                               83                     1
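The relative frequencies in the table can be reproduced in a few lines. This is an illustrative sketch; the helper name `relative_frequencies` is an assumption, and the counts are the ones listed in the table.

```python
def relative_frequencies(counts):
    """Divide each category count by the total number of observations."""
    total = sum(counts.values())
    return {category: count / total for category, count in counts.items()}


incidents = {
    "Engineering & Design": 27,
    "Procedures & Practices": 24,
    "Management & Oversight": 22,
    "Training & Communication": 10,
}

freqs = relative_frequencies(incidents)
for category, rel_freq in freqs.items():
    print(f"{category}: {rel_freq:.3f}")
```

Sorting the categories by relative frequency in decreasing order gives the bar order used in a Pareto diagram.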
b.
The Pareto diagram is:
[Pareto diagram of Management System Cause Category; vertical axis: Percent (0 to 35); bars in decreasing order: Eng&Des, Proc&Pract, Mgmt&Over, Trn&Comm]
c.
The category with the highest relative frequency of incidents is Engineering and Design. The
category with the lowest relative frequency of incidents is Training and Communication.
2.151
a.
The relative frequency for each response category is found by dividing the frequency by the total
sample size. The relative frequency for the category "Global Marketing" is 235/2863 = .082. The
rest of the relative frequencies are found in a similar manner and are reported in the table.
Area                     Number    Relative Frequency
Global Marketing         235       235/2863 = .082
Sales Management         494       494/2863 = .173
Buyer Behavior           478       478/2863 = .167
Relationships            498       498/2863 = .174
Innovation               398       398/2863 = .139
Marketing Strategy       280       280/2863 = .098
Channels/Distribution    213       213/2863 = .074
Marketing Research       131       131/2863 = .046
Services                 136       136/2863 = .048
TOTAL                    2,863     1.00
Relationships and sales management had the most articles published with 17.4% and 17.3%,
respectively. Not far behind was Buyer Behavior with 16.7%. Of the rest of the areas, only
innovation had more than 10%.
b.
Using MINITAB, the pie chart of the data is:
[Pie chart of Area: Relationships (17.4%), Sales Management (17.3%), Buyer Behavior (16.7%), Innovation (13.9%), Marketing Strategy (9.8%), Global Marketing (8.2%), Channels/Distribution (7.4%), Services (4.8%), Marketing Research (4.6%)]
The slice for Marketing Research is smaller than the slice for Sales Management because there were
fewer articles on Marketing Research than for Sales Management.
2.152
a.
The data are time series data because the numbers of bankruptcies were collected over a
period of 8 quarters.
b.
Using MINITAB, the time series plot is:
[Time series plot of Bankruptcies; vertical axis: Bankruptcies (0 to 14000); horizontal axis: Quarter (Q1 to Q4 for each of two years)]
c.
There is a generally decreasing trend in the number of bankruptcies as the quarters increase.
2.153
a.
Using MINITAB, the pie chart is:
[Pie chart of Drivstar: 2 (4.1%), 3 (17.3%), 4 (60.2%), 5 (18.4%)]
b.
The average driver's severity of head injury in head-on collisions is 603.7.
c.
Since the mean and median are close in value, the data should be fairly symmetric. Thus, we can use
the Empirical Rule. We know that about 95% of all observations will fall within 2 standard
deviations of the mean. This interval is $\bar{x} \pm 2s \Rightarrow 603.7 \pm 2(185.4) \Rightarrow 603.7 \pm 370.8 \Rightarrow (232.9, 974.5)$.
Most of the head-injury ratings will fall between 232.9 and 974.5.
d.
The z-score would be: $z = \frac{x - \bar{x}}{s} = \frac{408 - 603.7}{185.4} = -1.06$
Since the absolute value is not very big, this is not an unusual value to observe.
2.154
a.
The data collection method was a survey.
b.
Since the data were 4 different categories, the variable is qualitative.
c.
Using MINITAB, a pie chart of the data is:
[Pie chart of Made in USA: 100% (60.4%), 75-99% (18.9%), 50-74% (17.0%), <50% (3.8%)]
About 60% of those surveyed believe that "Made in USA" means 100% US labor and
materials.
2.155
a.
Using MINITAB, a Pareto diagram for the data is:
[Pareto diagram of Defects; vertical axis: Frequency (0 to 70); bars in decreasing order: Body, Accessories, Electrical, Transmission, Engine]
The most frequently observed defect is a body defect.
b.
Using MINITAB, a Pareto diagram for the Body Defect data is:
[Pareto diagram of Body Defects; vertical axis: Frequency (0 to 30); bars in decreasing order: Paint, Dents, Upholstery, Windshield, Chrome]
Most body defects are either paint or dents. These two categories account for
$(30 + 25)/70 = 55/70 = .786$ of all body defects. Since these two categories account for so much of
the body defects, it would seem appropriate to target these two types of body defects for special
attention.
2.156
The percentile ranking of the age of 25 years would be 100% - 80% = 20%. Thus, an age of 25 would
correspond to the 20th percentile.
2.157
a.
The mean amount exported on the printout is 653. This means that the average amount of money per
market from exporting sparkling wine was $653,000.
b.
The median amount exported on the printout is 231. Since the median is the middle value, this means
that half of the 30 sparkling wine export values were above $231,000 and half of the sparkling wine
export values were below $231,000.
c.
The mean 3-year percentage change on the printout is 481. This means that in the last three years, the
average change is 481%, which indicates a large increase.
d.
The median 3-year percentage change on the printout is 156. Since the median is the middle value,
this means that half, or 15 of the 30 countries' 3-year percentage change values, were above 156% and
half, or 15 of the 30 countries' 3-year percentage change values, were below 156%.
e.
The range is the difference between the largest observation and the smallest observation. From the
printout, the largest observation is $4,852 thousand and the smallest observation is $70 thousand. The
range is:
R = $4,852 - $70 = $4,782 thousand
f.
From the printout, the standard deviation is s = $1,113 thousand.
g.
The variance is the standard deviation squared. The variance is:
$s^2 = 1,113^2 = 1,238,769$ million dollars squared
h.
We would expect an export amount to fall within 2 standard deviations of the mean, or
$\bar{x} \pm 2s \Rightarrow 653 \pm 2(1,113) \Rightarrow 653 \pm 2,226 \Rightarrow (-1,573, 2,879)$. Since the exports cannot be negative, the
interval would be $(0, 2,879)$.
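Parts e through h use only four numbers from the printout, so they are easy to recompute. The sketch below assumes the printout values quoted in the solution (minimum 70, maximum 4,852, mean 653, standard deviation 1,113, all in thousands of dollars); the function name `summarize` is hypothetical.

```python
def summarize(minimum, maximum, mean, std):
    """Return the range, the variance, and the mean +/- 2 standard deviation
    interval, with the lower end truncated at 0 because exports cannot be
    negative."""
    value_range = maximum - minimum
    variance = std ** 2
    low = max(mean - 2 * std, 0)
    high = mean + 2 * std
    return value_range, variance, (low, high)


# Values in thousands of dollars, as quoted from the printout.
value_range, variance, interval = summarize(70, 4852, 653, 1113)
print(value_range, variance, interval)
```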
2.158
a.
Using MINITAB, the pie charts are:
[Pie charts of COLOR and CLARITY. COLOR: D 16 (5.2%), E 44 (14.3%), F 82 (26.6%), G 65 (21.1%), H 61 (19.8%), I 40 (13.0%). CLARITY: IF 44 (14.3%), VS1 81 (26.3%), VS2 53 (17.2%), VVS1 52 (16.9%), VVS2 78 (25.3%)]
The F color occurs the most often with 26.6%. The clarity that occurs the most is VS1 with 26.3%.
The D color occurs the least often with 5.2%. The clarity that occurs the least is IF with 14.3%.
b.
Using MINITAB, the relative frequency histogram is:
[Histogram of CARAT; vertical axis: Frequency (0 to 60); horizontal axis: CARAT (0.30 to 1.05)]
c.
Using MINITAB, the relative frequency histogram for the GIA group is:
[Histogram of CARAT, CERT = GIA; vertical axis: Percent (0 to 30); horizontal axis: CARAT (0.30 to 1.05)]
d.
Using MINITAB, the relative frequency histograms for the HRD and IGI
groups are:
[Histograms of CARAT for CERT = HRD and CERT = IGI; vertical axis: Percent (0 to 40); horizontal axis: CARAT (0.30 to 1.05)]
e.
The HRD group does not assess any diamonds less than .5 carats and almost 40% of the diamonds
they assess are 1.0 carat or higher. The IGI group does not assess very many diamonds over .5 carats
and more than half are .3 carats or less. More than half of the diamonds assessed by the GIA group
are more than .5 carats, but the sizes are less than those of the HRD group.
f.
The sample mean is: $\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{194.32}{308} = .631$
The average number of carats for the 308 diamonds is .631.
g.
The median is the average of the middle two observations once they have been ordered.
The 154th and 155th observations are .62 and .62. The average of these two observations is .62.
Half of the diamonds weigh less than .62 carats and half weigh more.
h.
The mode is 1.0. This observation occurred 32 times.
i.
Since the mean and median are close in value, either could be a good descriptor of central
tendency.
j.
From Chebyshev's Theorem, we know that at least 3/4 or 75% of all observations will fall within 2
standard deviations of the mean. From part f, $\bar{x} = .631$.
The variance is: $s^2 = \frac{\sum_i x_i^2 - \frac{\left(\sum_i x_i\right)^2}{n}}{n - 1} = \frac{146.19 - \frac{194.32^2}{308}}{308 - 1} = .0768$ square carats
The standard deviation is: $s = \sqrt{s^2} = \sqrt{.0768} = .277$ carats
This interval is: $\bar{x} \pm 2s \Rightarrow .631 \pm 2(.277) \Rightarrow .631 \pm .554 \Rightarrow (.077, 1.185)$
k.
Using MINITAB, the scatterplot is:
[Scatterplot of PRICE vs CARAT; vertical axis: PRICE (0 to 18000); horizontal axis: CARAT (0.2 to 1.1)]
As the number of carats increases, the price of the diamond tends to increase. There appears to be an
upward trend.
2.159
a.
Using MINITAB, a bar graph of the data is:
[Bar chart of Cause; vertical axis: Count (0 to 12); categories: Collision, Fire, Grounding, HullFail, Unknown]
Fire and grounding are the two most likely causes of puncture.
b.
Using MINITAB, the descriptive statistics are:
Descriptive Statistics: Spillage
Variable N Mean StDev Minimum Q1 Median Q3 Maximum
Spillage 42 66.19 56.05 25.00 32.00 43.00 77.50 257.00
The mean spillage amount is 66.19 thousand metric tons, while the median is 43.00. Since the
median is so much smaller than the mean, it indicates that the data are skewed to the right. The
standard deviation is 56.05. Again, since this value is so close to the value of the mean, it indicates
that the data are skewed to the right.
Since the data are skewed to the right, we cannot use the Empirical Rule to describe the data.
Chebyshev's Rule can be used. Using Chebyshev's Rule, we know that at least 8/9 of the
observations will fall within 3 standard deviations of the mean.
$\bar{x} \pm 3s \Rightarrow 66.19 \pm 3(56.05) \Rightarrow 66.19 \pm 168.15 \Rightarrow (-101.96, 234.34)$ or (0, 234.34) since we cannot
have negative spillage.
Thus, at least 8/9 of all oil spills will be between 0 and 234.34 thousand metric tons.
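Chebyshev's Rule guarantees that at least $1 - 1/k^2$ of the observations lie within $k$ standard deviations of the mean, for any $k > 1$. The sketch below computes that bound together with the interval and truncates the lower end at a physical minimum when one is given. The function name `chebyshev_interval` and the `lower_bound` parameter are illustrative, and `k` is assumed to be an integer so the bound can be shown as an exact fraction.

```python
from fractions import Fraction


def chebyshev_interval(mean, std, k, lower_bound=None):
    """Return (minimum proportion, (low, high)) for mean +/- k * std.

    k must be an integer greater than 1. If lower_bound is given, the
    lower end of the interval is truncated there."""
    if k <= 1:
        raise ValueError("Chebyshev's Rule is informative only for k > 1")
    proportion = 1 - Fraction(1, k * k)
    low, high = mean - k * std, mean + k * std
    if lower_bound is not None:
        low = max(low, lower_bound)
    return proportion, (low, high)


# Spillage summary from the MINITAB output: mean 66.19, StDev 56.05.
proportion, (low, high) = chebyshev_interval(66.19, 56.05, 3, lower_bound=0)
print(proportion, low, round(high, 2))
```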
2.160
Using MINITAB, a pie chart of the data is:
[Pie chart of Recoded defect: False 449 (90.2%), True 49 (9.8%)]
A response of "true" means the software contained defective code. Thus, only 9.8% of the modules
contained defective software code.
2.161
a.
Since no information is given about the distribution of the velocities of the Winchester bullets, we
can only use Chebyshev's Rule to describe the data. We know that at least 3/4 of the velocities will
fall within the interval:
$\bar{x} \pm 2s \Rightarrow 936 \pm 2(10) \Rightarrow 936 \pm 20 \Rightarrow (916, 956)$
Also, at least 8/9 of the velocities will fall within the interval:
$\bar{x} \pm 3s \Rightarrow 936 \pm 3(10) \Rightarrow 936 \pm 30 \Rightarrow (906, 966)$
b.
Since a velocity of 1,000 is much larger than the largest value in the second interval in part a, it is
very unlikely that the bullet was manufactured by Winchester.
2.162
a.
First, we must compute the total processing times by adding the processing times of the three
departments. The total processing times are as follows:
Request  Total Processing Time    Request  Total Processing Time    Request  Total Processing Time
1        13.3                     17       19.4*                    33       23.4*
2        5.7                      18       4.7                      34       14.2
3        7.6                      19       9.4                      35       14.3
4        20.0*                    20       30.2                     36       24.0*
5        6.1                      21       14.9                     37       6.1
6        1.8                      22       10.7                     38       7.4
7        13.5                     23       36.2*                    39       17.7*
8        13.0                     24       6.5                      40       15.4
9        15.6                     25       10.4                     41       16.4
10       10.9                     26       3.3                      42       9.5
11       8.7                      27       8.0                      43       8.1
12       14.9                     28       6.9                      44       18.2*
13       3.4                      29       17.2*                    45       15.3
14       13.6                     30       10.2                     46       13.9
15       14.6                     31       16.0                     47       19.9*
16       14.4                     32       11.5                     48       15.4
                                                                    49       14.3*
                                                                    50       19.0
The stem-and-leaf displays with the appropriate leaves highlighted are as follows:
Stem-and-leaf of Mkt
Leaf Unit = 0.10
   6    0  0112446
   7    1  3
  14    2  0024699
  16    3  25
  22    4  001577
 (10)   5  0344556889
  18    6  0002224799
   8    7  0038
   4    8  07
   2    9
   2   10  0
   1   11  0

Stem-and-leaf of Engr
Leaf Unit = 0.10
   7    0  4466699
  14    1  3333788
  19    2  12246
  23    3  1568
  (5)   4  24688
  22    5  233
  19    6  01239
  14    7  22379
   9    8
   9    9  66
   7   10  0
   6   11  3
   5   12  023
   2   13  0
   1   14  4

Stem-and-leaf of Accnt
Leaf Unit = 0.10
  19    0  1111111111122333444
  (8)   0  55556888
  23    1  00
  21    1  79
  19    2  0023
  15    2
  15    3  23
  13    3  78
  11    4
  11    4
  11    5
  11    5  8
  10    6  2
   9    6
   9    7  0
   8    7
   8    8  4
        HI 99, 105, 135, 144, 182, 220, 300

Stem-and-leaf of Total
Leaf Unit = 1.00
   1    0  1
   3    0  33
   5    0  45
  11    0  666677
  17    0  888999
  21    1  0000
  (5)   1  33333
  24    1  4444445555
  14    1  6677
  10    1  8999
   6    2  0
   5    2  3
   4    2  44
        HI 30, 36
Of the 50 requests, 10 were lost. For each of the three departments, the processing times for the lost
requests are scattered throughout the distributions. The processing times for the departments do not
appear to be related to whether the request was lost or not. However, the total processing times for
the lost requests appear to be clustered towards the high side of the distribution. It appears that if the
total processing time could be kept under 17 days, 76% of the data could be maintained, while
reducing the number of lost requests to 1.
b.
For the Marketing department, if the maximum processing time was set at 6.5 days, 78% of the
requests would be processed, while reducing the number of lost requests by 4. For the Engineering
department, if the maximum processing time was set at 7.0 days, 72% of the requests would be
processed, while reducing the number of lost requests by 5. For the Accounting department, if the
maximum processing time was set at 8.5 days, 86% of the requests would be processed, while
reducing the number of lost requests by 5.
c.
Using MINITAB, the summary statistics are:
Descriptive Statistics: REQUEST, MARKET, ENGINEER, ACCOUNT
Variable   N    Mean     StDev   Minimum   Q1      Median   Q3       Maximum
MARKET     50   4.766    2.584   0.100     2.825   5.400    6.250    11.000
ENGINEER   50   5.044    3.835   0.400     1.775   4.500    7.225    14.400
ACCOUNT    50   3.652    6.256   0.100     0.200   0.800    3.725    30.000
TOTAL      50   13.462   6.820   1.800     8.075   13.750   16.600   36.200
d.
The z-scores corresponding to the maximum time guidelines developed for each department and the
total are as follows:
Marketing: $z = \frac{x - \bar{x}}{s} = \frac{6.5 - 4.77}{2.58} = .67$
Engineering: $z = \frac{x - \bar{x}}{s} = \frac{7.0 - 5.04}{3.84} = .51$
Accounting: $z = \frac{x - \bar{x}}{s} = \frac{8.5 - 3.65}{6.26} = .77$
Total: $z = \frac{x - \bar{x}}{s} = \frac{17 - 13.46}{6.82} = .52$
e.
To find the maximum processing time corresponding to a z-score of 3, we substitute the values of
z, $\bar{x}$, and s into the z formula and solve for x.
$z = \frac{x - \bar{x}}{s} \Rightarrow x - \bar{x} = zs \Rightarrow x = \bar{x} + zs$
Marketing:
$x = 4.77 + 3(2.58) = 4.77 + 7.74 = 12.51$
None of the orders exceed this time.
Engineering:
$x = 5.04 + 3(3.84) = 5.04 + 11.52 = 16.56$
None of the orders exceed this time.
These both agree with both the Empirical Rule and Chebyshev's Rule.
Accounting:
$x = 3.65 + 3(6.26) = 3.65 + 18.78 = 22.43$
One of the orders exceeds this time or 1/50 = .02.
Total:
$x = 13.46 + 3(6.82) = 13.46 + 20.46 = 33.92$
One of the orders exceeds this time or 1/50 = .02.
These both agree with Chebyshev's Rule but not the Empirical Rule. Both of these last two
distributions are skewed to the right.
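Solving the z formula for x gives a cutoff, and counting how many requests exceed it is a one-line comparison. The sketch below does this for the total processing times from part a, using the mean and standard deviation quoted from the MINITAB summary (13.46 and 6.82) rather than values recomputed from the list. The names `cutoff` and `count_exceeding` are illustrative.

```python
def cutoff(mean, std, z):
    """Value of x that lies z standard deviations above the mean."""
    return mean + z * std


def count_exceeding(values, limit):
    """Number of observations strictly greater than the limit."""
    return sum(1 for v in values if v > limit)


# Total processing times (days) for requests 1 through 50, from part a.
totals = [
    13.3, 5.7, 7.6, 20.0, 6.1, 1.8, 13.5, 13.0, 15.6, 10.9,
    8.7, 14.9, 3.4, 13.6, 14.6, 14.4, 19.4, 4.7, 9.4, 30.2,
    14.9, 10.7, 36.2, 6.5, 10.4, 3.3, 8.0, 6.9, 17.2, 10.2,
    16.0, 11.5, 23.4, 14.2, 14.3, 24.0, 6.1, 7.4, 17.7, 15.4,
    16.4, 9.5, 8.1, 18.2, 15.3, 13.9, 19.9, 15.4, 14.3, 19.0,
]

for z in (2, 3):
    limit = cutoff(13.46, 6.82, z)
    n_over = count_exceeding(totals, limit)
    print(z, round(limit, 2), n_over, n_over / len(totals))
```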
f.
Marketing:
$x = 4.77 + 2(2.58) = 4.77 + 5.16 = 9.93$
Two of the orders exceed this time or 2/50 = .04.
Engineering:
$x = 5.04 + 2(3.84) = 5.04 + 7.68 = 12.72$
Two of the orders exceed this time or 2/50 = .04.
Accounting:
$x = 3.65 + 2(6.26) = 3.65 + 12.52 = 16.17$
Three of the orders exceed this time or 3/50 = .06.
Total:
$x = 13.46 + 2(6.82) = 13.46 + 13.64 = 27.10$
Two of the orders exceed this time or 2/50 = .04.
All of these agree with Chebyshev's Rule but not the Empirical Rule.
g.
No observations exceed the guideline of 3 standard deviations for both Marketing and Engineering.
One observation exceeds the guideline of 3 standard deviations for both Accounting (#23, time =
30.0 days) and Total (#23, time = 36.2 days). Therefore, only (1/10) × 100% = 10% of the "lost" quotes
have times exceeding at least one of the 3 standard deviation guidelines.
Two observations exceed the guideline of 2 standard deviations for both Marketing (#31, time = 11.0
days and #48, time = 10.0 days) and Engineering (#4, time = 13.0 days and #49, time = 14.4 days).
Three observations exceed the guideline of 2 standard deviations for Accounting (#20, time = 22.0
days; #23, time = 30.0 days; and #36, time = 18.2 days). Two observations exceed the guideline of 2
standard deviations for Total (#20, time = 30.2 days and #23, time = 36.2 days). Therefore, (7/10) ×
100% = 70% of the "lost" quotes have times exceeding at least one of the 2 standard deviation
guidelines.
We would recommend the 2 standard deviation guideline since it covers 70% of the lost quotes, while
having very few other quotes exceed the guidelines.
2.163
a.
Using MINITAB, the time series plot is:
[Time series plot of Deaths; vertical axis: Deaths (0 to 900); horizontal axis: Year (2003 to 2006)]
b.
The time series plot is misleading because the information for 2006 is incomplete: it is based on
only 2 months, while all of the other years are based on 12 months.
c.
In order to construct a plot that accurately reflects the trend in American casualties from the Iraq War,
we would want complete data for 2006 and information for the years 2007 through 2011.
2.164
a.
Using MINITAB, the time series plot of the data is:
[Time series plot of Acquisitions; vertical axis: Acquisitions (0 to 900); horizontal axis: Year (1980 to 2000)]
b.
To find the percentage of the sampled firms with at least one acquisition, we divide the number with
acquisitions by the total sampled and then multiply by 100%. For 1980, the percentage of firms with at
least one acquisition is (18/1963) × 100% = .92%. The rest of the percentages are found in the same
manner and are listed in the following table:
Year     Number of firms    Number with Acquisitions    Percentage with Acquisitions
1980     1,963              18                          .92%
1981     2,044              115                         5.63%
1982     2,029              211                         10.40%
1983     2,187              273                         12.48%
1984     2,248              317                         14.10%
1985     2,238              182                         8.13%
1986     2,277              232                         10.19%
1987     2,344              258                         11.01%
1988     2,279              296                         12.99%
1989     2,231              350                         15.69%
1990     2,197              350                         15.93%
1991     2,261              370                         16.36%
1992     2,363              427                         18.07%
1993     2,582              532                         20.60%
1994     2,775              626                         22.56%
1995     2,890              652                         22.56%
1996     3,070              751                         24.46%
1997     3,099              799                         25.78%
1998     2,913              866                         29.73%
1999     2,799              750                         26.80%
2000     2,778              748                         26.93%
TOTAL    51,567             9,123
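The percentage column is each year's count of acquiring firms divided by that year's sample size. A minimal sketch for a few of the years in the table follows; the function name `percent_with_acquisitions` and the choice of years are illustrative.

```python
def percent_with_acquisitions(firms, acquirers):
    """Percentage of sampled firms with at least one acquisition."""
    return 100 * acquirers / firms


# (number of firms, number with acquisitions) for selected years in the table.
selected = {
    1980: (1963, 18),
    1985: (2238, 182),
    1998: (2913, 866),
    2000: (2778, 748),
}

percents = {
    year: percent_with_acquisitions(firms, acquirers)
    for year, (firms, acquirers) in selected.items()
}
for year, pct in percents.items():
    print(year, f"{pct:.2f}%")
```

Because the number of sampled firms changes from year to year, the percentage puts the years on a common basis in a way the raw counts do not.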
Using MINITAB, the time series plot is:
[Time series plot of Percent; vertical axis: Percent (0 to 30); horizontal axis: Year (1980 to 2000)]
c.
In this case, the two plots are almost the same. In general, the time series plot of the percents would be
more informative. By changing the observations to percents, one can compare time periods with
different sample sizes on the same basis.
2.165
a.
Since the mean is greater than the median, the distribution of the radiation levels is skewed to the
right.
b.
$\bar{x} \pm s \Rightarrow 10 \pm 3 \Rightarrow (7, 13)$; $\bar{x} \pm 2s \Rightarrow 10 \pm 2(3) \Rightarrow (4, 16)$; $\bar{x} \pm 3s \Rightarrow 10 \pm 3(3) \Rightarrow (1, 19)$
Interval    Chebyshev's       Empirical
(7, 13)     At least 0%       ≈68%
(4, 16)     At least 75%      ≈95%
(1, 19)     At least 88.9%    ≈100%
Since the data are skewed to the right, Chebyshev's Rule is probably more appropriate in this case.
c.
The background level is 4. Using Chebyshev's Rule, at least 75% or .75(50) ≈ 38 homes are above
the background level. Using the Empirical Rule, ≈ 97.5% or .975(50) ≈ 49 homes are above the
background level.
d.
$z = \frac{x - \bar{x}}{s} = \frac{20 - 10}{3} = 3.333$
It is unlikely that this new measurement came from the same distribution as the other 50. Using
either Chebyshev's Rule or the Empirical Rule, it is very unlikely to see any observations more than 3
standard deviations from the mean.
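The z-score check in part d can be written as a small helper. This is a sketch only: the names `z_score` and `is_unusual` are illustrative, and the default threshold of 3 standard deviations is taken from the reasoning above rather than from any fixed rule.

```python
def z_score(x, mean, std):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / std


def is_unusual(x, mean, std, threshold=3.0):
    """Flag an observation whose z-score exceeds the threshold in absolute value."""
    return abs(z_score(x, mean, std)) > threshold


z = z_score(20, 10, 3)
print(round(z, 3), is_unusual(20, 10, 3))
```

The same helper applied to the head-injury rating in Exercise 2.153 d, `z_score(408, 603.7, 185.4)`, gives a value near -1.06, which is not flagged.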
2.166
a.
Using MINITAB, a pie chart of the data is:
[Pie chart of PREVUSE: NEVER (71.2%), USED (28.8%)]
From the chart, 71.2% or .712 of the sampled physicians have never used ethics consultation.
b.
Using MINITAB, a pie chart of the data is:
[Pie chart of FUTUREUSE: YES (80.5%), NO (19.5%)]
From the chart, 19.5% or .195 of the sampled physicians state that they will not use the services in the
future.
c.
Using MINITAB, the side-by-side pie charts are:
[Pie charts of PREVUSE, panel variable SPEC. MED: NEVER (70.7%), USED (29.3%). SURG: NEVER (72.1%), USED (27.9%)]
The proportion of medical practitioners who have never used ethics consultation is .707. The
proportion of surgical practitioners who have never used ethics consultation is .721. These two
proportions are almost the same.
d.
Using MINITAB, the side-by-side pie charts are:
[Pie charts of FUTUREUSE, panel variable SPEC. MED: YES (82.7%), NO (17.3%). SURG: YES (76.7%), NO (23.3%)]
The proportion of medical practitioners who will not use ethics consultation in the future is .173. The
proportion of surgical practitioners who will not use ethics consultation in the future is .233. The
proportion of surgical practitioners who will not use ethics consultation in the future is greater than
that of the medical practitioners.
e.
Using MINITAB, the relative frequency histograms of the years in practice for the two groups of
doctors are:
[Histograms of YRSPRAC, panel variable FUTUREUSE (NO, YES); vertical axis: Percent (0 to 25); horizontal axis: YRSPRAC (0.0 to 37.5)]
The researchers hypothesized that older, more experienced physicians will be less likely to use ethics
consultation in the future. From the histograms, approximately 38% of the doctors that said "no"
have more than 20 years of experience. Only about 19% of the doctors that said "yes" had more than
20 years of experience. This supports the researchers' assertion.
f.
Using MINITAB, the output is:
Descriptive Statistics: YRSPRAC
Variable   N     N*   Mean     Minimum   Median   Maximum   Mode         N for Mode
YRSPRAC    112   6    14.598   1.000     14.000   40.000    14, 20, 25   9
The mean is 14.598. The average length of time in practice for this sample is 14.598 years. The
median is 14. Half of the physicians have been in practice less than 14 years and half have been in
practice longer than 14 years. There are 3 modes: 14, 20, and 25. The most frequent years in practice
are 14, 20, and 25 years.
g.
Using MINITAB, the results are:
Descriptive Statistics: YRSPRAC
Variable   FUTUREUSE   N    N*   Mean     Minimum   Median   Maximum   Mode     N for Mode
YRSPRAC    NO          21   2    16.43    1.00      18.00    35.00     25       5
           YES         91   4    14.176   1.000     14.000   40.000    14, 20   8
The mean for the physicians who would refuse to use ethics consultation in the future is 16.43. The
average time in practice for these physicians is 16.43 years. The median is 18. Half of the physicians
who would refuse ethics consultation in the future have been in practice less than 18 years and half
have been in practice more than 18 years. The mode is 25. The most frequent years in practice for
these physicians is 25 years.
h.
From the results in part g, the mean for the physicians who would use ethics consultation in the future
is 14.176. The average time in practice for these physicians is 14.176 years. The median is 14. Half
of the physicians who would use ethics consultation in the future have been in practice less than 14
years and half have been in practice more than 14 years. There are 2 modes: 14 and 20. The most
frequent years in practice for these physicians are 14 and 20 years.
i.
The results in parts g and h confirm the researchers' theory. The mean, median, and mode of years in
practice are larger for the physicians who would refuse to use ethics consultation in the future than
for those who would use ethics consultation in the future.
j.
Using MINITAB, the results are:
Descriptive Statistics: YRSPRAC
Variable   N     N*   Mean     StDev   Variance   Range
YRSPRAC    112   6    14.598   9.161   83.918     39.000
The range is 39. The difference between the largest years in practice and the smallest years in
practice is 39 years. The variance is 83.918 square years. The standard deviation is 9.161 years.
k.
Using MINITAB, the results are:
Descriptive Statistics: YRSPRAC
Variable   FUTUREUSE   N    N*   Mean     StDev   Variance   Range
YRSPRAC    NO          21   2    16.43    10.05   100.96     34.00
           YES         91   4    14.176   8.950   80.102     39.000
For the physicians who would refuse to use ethics consultation in the future, the standard deviation is
10.05 years.
l.
For the physicians who would use ethics consultation in the future, the standard deviation is 8.95
years.
m.
The variation in the length of time in practice for the physicians who would refuse to use ethics
consultation in the future is greater than that for the physicians who would use ethics consultation in
the future.
n.
Using MINITAB, the scatterplot of the data is:
[Scatterplot of YRSPRAC vs EDHRS; vertical axis: YRSPRAC (0 to 40); horizontal axis: EDHRS (0 to 1000)]
There does not appear to be much of a relationship between the years of experience and the amount
of exposure to ethics in medical school.
o.
Using MINITAB, a boxplot of the amount of exposure to ethics in medical school is:
[Boxplot of EDHRS; axis: EDHRS (0 to 1000)]
The one data point that is an extreme outlier is the value of 1000.
p.
After removing this data point, the scatterplot of the data is:
[Scatterplot of YRSPRAC2 vs EDHRS2; vertical axis: YRSPRAC2 (0 to 40); horizontal axis: EDHRS2 (0 to 90)]
With the data point removed, there now appears to be a negative trend in the data. As the amount of
exposure to ethics in medical school increases, the years of experience tend to decrease.
2.167
a.
Both the height and width of the bars (peanuts) change. Thus, some readers may tend to equate the
area of the peanuts with the frequency for each year.
b.
Using MINITAB, the frequency bar chart is:
[Bar chart of Peanut; vertical axis: Peanut (0 to 5); horizontal axis: Year (1975 to 2010)]
2.168
a.
Clinic A claims to have a mean weight loss of 15 pounds during the first month and Clinic B claims to have a
median weight loss of 10 pounds in the first month. With no other information, I would choose
Clinic B. It is very likely that the distributions of weight losses will be skewed to the right: most
people lose in the neighborhood of 10 pounds, but a couple might lose much more. If a few people
lost much more than 10 pounds, then the mean will be pulled in that direction.
b.
For Clinic A, the median is 10 and the standard deviation is 20. For Clinic B, the mean is 10 and the
standard deviation is 5.
For Clinic A:
The mean is 15 and the median is 10. This would indicate that the data are skewed to the right.
Thus, we will have to use Chebyshev's Rule to describe the distribution of weight losses.
$\bar{x} \pm 2s \Rightarrow 15 \pm 2(20) \Rightarrow 15 \pm 40 \Rightarrow (-25, 55)$
Using Chebyshev's Rule, we know that at least 75% of all weight losses will be between -25 and 55
pounds. This means that at least 75% of the people will have weight changes ranging from a loss of 55
pounds to a gain of 25 pounds. This is a very large range.
For Clinic B:
The mean is 10 and the median is 10. This would indicate that the data are symmetrical. Thus, the
Empirical Rule can be used to describe the distribution of weight losses.
$\bar{x} \pm 2s \Rightarrow 10 \pm 2(5) \Rightarrow 10 \pm 10 \Rightarrow (0, 20)$
Using the Empirical Rule, we know that approximately 95% of all weight losses will be between 0
and 20 pounds. This is a much smaller range than in Clinic A.
I would still recommend Clinic B. Using Clinic A, a person has the potential to lose a large amount
of weight, but also has the potential to gain a relatively large amount of weight. In Clinic B, a person
would be very confident that he/she would lose weight.
c.
One would want the clients selected for the samples in each clinic to be representative of all clients in
that clinic. One would hope that the clinic would not choose those clients for the sample who lost the
most weight just to promote their clinic.
2.169
First we make some preliminary calculations.
Of the 20 engineers at the time of the layoffs, 14 are 40 or older. Thus, the probability that a randomly
selected engineer will be 40 or older is 14/20 = .70. A very high proportion of the engineers is 40 or over.
In order to determine if the company is vulnerable to a disparate impact claim, we will first find the median
age of all the engineers. Ordering all the ages, we get:
29, 32, 34, 35, 38, 39, 40, 40, 40, 40, 40, 41, 42, 42, 44, 46, 47, 52, 55, 64
The median of all 20 engineers is $\frac{40 + 40}{2} = \frac{80}{2} = 40$.
Now, we will compute the median age of those engineers who were not laid off. The ages underlined
above correspond to the engineers who were not laid off. The median of these is $\frac{40 + 40}{2} = \frac{80}{2} = 40$.
The median age of all engineers is the same as the median age of those who were not laid off. The median
age of those laid off is $\frac{40 + 41}{2} = \frac{81}{2} = 40.5$, which is not that much different from the median age of those
not laid off. In addition, 70% of all the engineers are 40 or older. Thus, it appears that the company would
not be vulnerable to a disparate impact claim.
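The preliminary calculations for all 20 engineers can be verified with Python's standard library. The sketch below covers only the full list of ages given above; which engineers were laid off is not recoverable from the list alone, so the two subgroup medians are not recomputed here.

```python
from statistics import median

# Ordered ages of all 20 engineers, as listed in the solution.
ages = [29, 32, 34, 35, 38, 39, 40, 40, 40, 40,
        40, 41, 42, 42, 44, 46, 47, 52, 55, 64]

# With an even number of values, the median is the mean of the two middle ones.
overall_median = median(ages)

# Proportion of engineers aged 40 or older.
share_40_plus = sum(1 for age in ages if age >= 40) / len(ages)

print(overall_median, share_40_plus)
```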
2.170
Answers will vary. The graph is made to look like the amount of money spent on education has risen
dramatically from 1980 to 2000, but the 4th grade reading scores have not increased at all. The graph does
not take into account that the number of school children has also increased dramatically in the last 20 years.
A better portrayal would be to look at the per capita spending rather than total spending.
2.171
There is evidence to support this claim. The graph peaks at the interval above 1.002. The heights of the
bars decrease in order as the intervals get further and further from the peak interval. This is true for all bars
except the one above 1.000. This bar is greater than the bar to its right. This would indicate that there are
more observations in this interval than one would expect, suggesting that some inspectors might be passing
rods with diameters that were barely below the lower specification limit.