MGMT 650 Summer 2022

Week 2 Homework Questions

(Last updated 3/13/2022)

9/6/2022

You are a statistician contracted to sample and analyze weights of the 5 lb bags of coffee. You collect the following data:

5.30

4.67

4.93

5.51

5.99

5.34

5.04

4.42

6.14

4.28

4.84

5.61

5.08

5.15

5.28

5.78

4.47

5.08

4.51

4.60

5.15

5.45

4.39

4.86

5.38

5.53

4.14

4.54

4.57

5.82

3.83

4.90

6.00

5.32

5.59

4.26

5.18

5.30

5.40

4.67

5.56

4.85

4.46

5.74

4.90

5.13

5.60

4.77

5.34

4.25

4.77

5.40

5.03

5.73

5.08

6.18

For all questions, write your answers in the yellow cells.

1)

Compute the following statistics using Excel functions so that Excel calculates for you:

Mean

Median

2)

First quartile using QUARTILE.EXC

Third quartile using QUARTILE.EXC

Interquartile range (IQR)

3)

Maximum

Minimum

Range

4)

Variance

Standard deviation

Coeficient of Variation (CV)

5)

For the standard deviation, when do you use STDEV.S and when do you use STDEV.P?

6)

Explain why the standard deviation is a better measure to use than the variance.

The coefficient of variation (CV) is a measure of relative variability equal to the ratio between the standard deviation

divided by the mean, and formatted to %. It is regularly used to compare risk (volatility) in investing, and is especially

useful in to compare data on different scales or with different units of measure. Consider weights of 10 oz, 13.4 oz, 15.1

oz with SD of 2.597 oz. The corresponding weights in pounds 0.625 lb, 0.838 lb, 0.944 lb have SD 0.162 lb. The SDs

are not equal, yet the sample and its variability are the same. How can we compare variabilities? The CV, unlike other

measures of variability, does not depend on the units of measure. The units are divided out in dividing the standard

deviation by the mean. For a “rule of thumb”, a CV of greater than 5% is considered significant. So, the CV is also used

to assess data with no prior history to compare to evaluate any trends.

When is the Coefficient of Variation (CV) especially useful?

9/6/2022

7)

As a contracted statistician, you have no prior experience with the company’s product.

The company has no prior process history and data with which to compare the sample to evaluate any trends.

The mode is not especially useful for this data, but explain why there is or is not

a concern with bag weight in terms of the other measures of central tendency.

8)

What statistic should you use to assess variability of the product?

Explain why there is or is not a concern with the variability of the product.

9/6/2022

coffee. You collect the following data:

4.75

5.28

4.99

4.96

4.98

5.04

5.19

5.52

4.69

4.32

4.82

5.49

4.18

5.16

4.34

4.67

between the standard deviation

ty) in investing, and is especially

sider weights of 10 oz, 13.4 oz, 15.1

44 lb have SD 0.162 lb. The SDs

variabilities? The CV, unlike other

ded out in dividing the standard

significant. So, the CV is also used

5.13

4.49

4.86

5.03

4.53

4.29

5.29

5.92

9/6/2022

Pivot tables and other descriptive statistics are developed from raw data. In this problem, you must obtain or create a raw da

to obtain pivot tables. They must be used to understand the data. Pick data that interests you. Make sure it is available to th

data spreadsheet must have at least 50 rows. It must contain at least two categorical variables and two numerical variables.

See the tab ‘Sample Data for Pivot Table’ for a sample file and and explanation of one way and two way tables

9) Show the data spreadsheet and briefly describe its source and what it contains.

Copy the data into this file, preferably as a new worksheet

10) Create a one way pivot table or frequency distribution for a categorical variable. Use counts for the variable. In one para

table to explain the significance of your data. Use proportions, bar and pie charts to reinforce your conclusions.

11) Create a one way table for a categorical variable and a numerical variable. Use sum for the numerical variable. In one pa

the significance of your data. Use any appropriate charts to make your points.

12) Create a two way table for both categorical variables. In one paragraph, explain the significance of your data. If the table

there may be relationships between the variables, explain why.

problem, you must obtain or create a raw data file and use it

interests you. Make sure it is available to the public. The

rical variables and two numerical variables.

one way and two way tables

t contains.

ble. Use counts for the variable. In one paragraph, use the

s to reinforce your conclusions.

se sum for the numerical variable. In one paragraph, explain

lain the significance of your data. If the table shows that

Anticipated

Employment Expected Salary in 5

ID Num Gender

Status

Salary

Years

Spending

ID01

M

PT

45

75

200

ID02

M

FT

120

250

150

ID03

M

FT

85

120

65

ID04

M

FT

100

150

150

ID05

M

FT

80

90

300

ID06

F

FT

100

150

250

ID07

M

FT

100

125

400

ID08

M

FT

60

100

60

ID09

F

FT

50

60

180

ID10

M

PT

50

80

700

ID11

M

FT

60

100

100

ID12

M

FT

65

100

200

ID13

M

FT

80

110

300

ID14

M

FT

100

150

100

ID15

F

UN

50

100

1000

ID16

M

FT

40

65

300

ID17

F

FT

70

100

200

ID18

F

FT

60

75

150

ID19

M

FT

105

120

150

ID20

F

FT

60

100

130

ID21

F

UN

45

65

300

ID22

F

FT

60

80

200

ID23

M

FT

80

140

90

ID24

F

FT

65

85

170

ID25

M

PT

60

150

320

ID26

M

FT

65

135

200

ID27

F

FT

55

90

600

ID28

M

FT

90

130

200

ID29

M

FT

70

100

100

ID30

F

FT

45

80

160

ID31

F

PT

80

110

230

ID32

M

FT

75

150

500

ID33

M

FT

60

85

250

ID34

F

FT

75

95

150

ID35

F

FT

65

90

120

ID36

M

FT

50

85

160

ID37

F

FT

75

100

100

ID38

F

FT

60

90

160

ID39

M

FT

90

160

130

ID40

M

PT

55

85

110

This is a sample raw data file resulting fro

survey. We can create some tables usin

Excel/Pivot Table. Letâ€™s construct a one

table for gender and a two way table for g

and employment status.

For the one way table, click on Insert/Pivo

Table. Select range and highlight all the

columns. Select new worksheet. Drag g

which is to the far right in the pivot table f

down to rows. Next drag gender again d

values. It should show â€œcountâ€. If not, us

drop down arrow to get value field setting

select â€œcountâ€.

If you want to show a bar chart, click on I

and look for the picture of a bar chart and

it. Also for a pie chart.

For a two way table, same initial steps as

Drag gender down to rows, drag employm

status down to columns, and drag either

down to values. Again make sure count

A two way table can be used to examine

two categorical variables may be related.

Next letâ€™s construct a one way table that s

the total of all the expected salary of male

females. Same initial steps, drag gender

to rows, drag expected salary down to va

and make sure it shows sum. If not chan

the value field setting.

mple raw data file resulting from a

e can create some tables using

Table. Letâ€™s construct a one way

nder and a two way table for gender

ment status.

way table, click on Insert/Pivot

ect range and highlight all the

Select new worksheet. Drag gender

he far right in the pivot table field list

ws. Next drag gender again down to

hould show â€œcountâ€. If not, use the

arrow to get value field settings and

ntâ€.

to show a bar chart, click on Insert

r the picture of a bar chart and select

a pie chart.

ay table, same initial steps as before.

er down to rows, drag employment

n to columns, and drag either variable

ues. Again make sure count shows.

able can be used to examine whether

ical variables may be related.

onstruct a one way table that shows

all the expected salary of males and

ame initial steps, drag gender down

ag expected salary down to values

ure it shows sum. If not change it in

eld setting.

One way table

Count of Gender

Gender

F

M

Grand Total

Total

16

24

40

Two way table

Count of Gender

Gender

F

M

Grand Total

Employment Status

FT

PT

UN

13

1

20

4

33

5

Summary table

sum

Gender

F

M

Grand Total

Total

1015

1785

2800

2

2

Grand Total

16

24

40

13

Frequency Distribution:

Here is a hypothetical list of the number of caramel popcorn

cans sold in 70 scout troops in Maryland. Use Excel’s

capability (=FREQUENCY()) to create a frequency distribution

Number of Caramel popcorn cans sold in each troop

174

105 103

105

148 158 121

153

147 132

110

115 192 158

183

199 174

107

179 183 129

198

171 120

163

163 108 184

180

114 107

107

152 137 184

190

175 156

143

142 152 182

183

119 165

134

172 145 184

MIN

MAX

Use the =MIN() function to find the lowest num

Use the =MAX() function to find the highest nu

Subtract the lowest number of cans sold from the highest number of

The range of number of cans sold is:

14

Decide how many bins to use. Study the data and pick a number bet

See https://www.statisticshowto.com/choose-bin-sizes-statistics/

In one paragraph below explain why you selected the numbers of bin

15

Now divide the range of the cans sold by the number of bins to find t

Size of each bin is

16

Create your bins:

Start with the minimum number of cans sold and add the size of the

Therefore,

plus

is the highest number that the

The =FREQUENCY() function uses the highest number of each bin wh

The next bin highest number starts with the first bin’s highest numbe

Therefore, the second bin begins with

and adds the bin size

Continue adding to get the Bins array for the =FREQUENCY() functio

Follow the instructions in the youtube videos to use the =FREQUENC

You know that you have correctly used the =FREQUENCY() function i

Min

Max

Bins:

Frequency:

Note: Some versions of Excel treat the last bin a bit differently, so the FREQUENCY function may

If you get that, don’t worry about it, we will accept either answer.

mel popcorn

y distribution

old in each troop

130

118

196

149

194

119

134

186

200

189

126

142

168

170

157

140

150

175

103

160

113

o find the lowest number of popcorn cans sold.

to find the highest number of popcorn cans sold.

e highest number of cans sold:

nd pick a number between 5 and 10

n-sizes-statistics/

d the numbers of bins (between 5 and 10)

mber of bins to find the size that each bin should be

est number that the first bin should be.

umber of each bin when calculating the number in each bin.

t bin’s highest number and adds the size of the bins.

to get

use the =FREQUENCY() array function.

EQUENCY() function if Excel automatically puts {} around the function.

he FREQUENCY function may show the last bin as one higher than it should be.

18

As part of the marketing group of a film company, you a

You ask questions of customers who exit the theatre. Fr

165 are 10 to 14, 25 are 15 to 21, and 174 are older tha

a) Make a frequency table of these categorical data.

b) Make a relative frequency table.

c) Make a bar chart using counts in the frequency table.

d) Would a bar chart of relative frequencies look any different?

e) Make a pie chart.

Write a few sentences summarizing the distribution demonstrated by

19

In addition to age grouping information, the audiences interviewed were a

17

never

once

more than once

under 6

42

4

3

6 to 9

62

21

4

10 to 14 15 to 21 over 21

90

20

152

41

5

17

34

0

5

a) Find the marginal distributions of their previous viewing of the mov

b) Verify that the marginal distribution of the ages is the same as that

20

Continue with the data from question 17

c) Find column percentages.

d) Looking at these percentages, does the distribution of how many tim

e) Make a stacked bar chart showing the distribution of viewings for e

f) What is the percentage of all audience members who are over 21? A

g) If these proportions are not the same, what does that mean with r

m company, you are asked to find out the age distribution of the audience of the latest

xit the theatre. From 500 responses, you find that 49 are younger than 6 years old, 87

nterviewed were also asked if they had seen the movie before (Never, Once, More than

viewing of the movie. Margin means total. Create a new column labeled “total”.

s the same as that given previously. Create a bottom row labeled “total”.

on of how many times someone has seen the movie look the same for each age group?

who are over 21? And what percentage of audience members who saw the movie more

s that mean with respect to the variables “age” and “frequency of viewing movies”?

me for each age group? What can you assume from these percentages?

Purchase answer to see full

attachment