Description

Statistic Homework Sheet…MUST HAVE EXCEL SHEET. NO HANDWRITTEN WORK ACCEPTED

For the module 05 Homework, please download the Microsoft Word document using the link below and follow the directions included in the assignment.

Module 05 Homework assignment

Use downloaded documents as a guide to complete assignment

Name:______________________________________

Module 5 Homework Assignment

The paired data below consists of test scores and hours of preparation for 5 randomly selected

students. Use this data set to answer the questions below:

x Hours of preparation 5 2 9 6 10

y Test score

64 48 72 73 80

1. Use the given data to find the correlation coefficient r, regression equation and scatter plot in MS

Excel.

Solution:

Instructor Comments:

2. Based on the linear correlation coefficient r, is this a good model? Explain.

Solution:

Instructor Comments:

3. What is the best predicted test score for a student who spent 7 hours preparing for the test?

Solution:

Instructor Comments:

4. Find the standard error se . Use formula or MS Excel.

Solution:

Instructor Comments:

5. Find the 99% prediction interval for the test score of a person who spent 7 hours preparing for the

test given that E=34.677. Interpret the result.

Solution:

Instructor Comments:

6. Find the explained variation.

Solution:

Instructor Comments:

7. Find the unexplained variation.

Solution:

Instructor Comments:

8. Find the total variation.

Solution:

Instructor Comments:

9. Find the value of r2 and explain its meaning.

Solution:

Instructor Comments:

10. If the data point ( 3, 100) is added to the data set, how would this effect the results of the regression

analysis? Is this data point an outlier, influential point or both? Explain.

Solution:

Instructor Comments:

Name:______________________________________

Module 5 Homework Assignment

The paired data below consists of test scores and hours of preparation for 5 randomly selected

students. Use this data set to answer the questions below:

x Hours of preparation 6 3 8 5 9

y Test score

69 50 68 65 80

1. Use the given data to find the correlation coefficient r, regression equation and scatter plot in MS

Excel.

Solution:

Instructor

Comments:

Scatter Plot:

Hours of Study v Test Score

90

80

70

60

50

40

30

20

10

0

0

2

4

6

8

10

Correlation coefficient uses the Excel formula =CORREL(y-values, x-values)

Ã°Ââ€˜Å¸ = 0.9186

Regression equation:

Ã°Ââ€˜Â¦ÃŒâ€š = Ã°Ââ€˜Â0 + Ã°Ââ€˜Â1 Ã°Ââ€˜Â¥

The y-intercept of the regression equation (Ã°Ââ€˜Â0 ) is found using the Excel formula

=INTERCEPT(y-values, x-values).

The slope of the regression equation (Ã°Ââ€˜Â1 ) is found using the Excel formula

=SLOPE(y-values, x-values).

Ã°Ââ€˜Â¦ÃŒâ€š = 40.6754 + 4.1491Ã°Ââ€˜Â¥

2. Based on the linear correlation coefficient r, is this a good model? Explain.

Solution:

Instructor Comments:

This question depends on what value of alpha we

choose. LetÃ¢â‚¬â„¢s see what happens with both Ã°Ââ€ºÂ¼ =

0.05 and Ã°Ââ€ºÂ¼ = 0.01:

Ã°Ââ€ºÂ¼ = 0.05:

Using the table of critical values of Ã°Ââ€˜Å¸ from page

692 of the textbook we get a critical value of

0.878 for a sample size of 5. Since the correlation

coefficient is greater than the critical value we

conclude that there is evidence of linear

correlation between hours of study and test

scores and therefore the regression equation is a

good model and can be used for predicting test

scores based of hours of study.

Ã°Ââ€ºÂ¼ = 0.01:

Using the table of critical values of Ã°Ââ€˜Å¸ from page

692 of the textbook we get a critical value of

0.959 for a sample size of 5. Since the correlation

coefficient is less than the critical value we

conclude that there is not evidence of linear

correlation between hours of study and test

scores, and therefore the regression equation is

not a good model and cannot be used for

predicting test scores based of hours of study.

3. What is the best predicted test score for a student who spent 4 hours preparing for the test?

Solution:

Again this question depends on what value of

alpha we choose. LetÃ¢â‚¬â„¢s see what happens with

both Ã°Ââ€ºÂ¼ = 0.05 and Ã°Ââ€ºÂ¼ = 0.01:

Instructor Comments:

Ã°Ââ€ºÂ¼ = 0.05:

At this level of significance we are OK with using

the regression equation to predict test scores, so

we plug in Ã°Ââ€˜Â¥ = 4 into the regression equation to

get the predicted value Ã°Ââ€˜Â¦ÃŒâ€š:

Ã°Ââ€˜Â¦ÃŒâ€š = 40.6754 + 4.1491(4)

Ã°Ââ€˜Â¦ÃŒâ€š = 57.2719

Ã°Ââ€ºÂ¼ = 0.01:

At this significance level we are saying that there

is no correlation between hours of study and test

scores, so the regression equation is not used. In

this case our best predicted value is simply the

average of the test scores, Ã°Ââ€˜Â¦ÃŒâ€¦:

Ã°Ââ€˜Â¦ÃŒâ€¦ = 66.4

4. Find the standard error se . Use formula or MS Excel.

Solution:

Instructor Comments:

Use the Regression command from the Data

Analysis ToolPak add-in for Excel to find the

standard error (as well as explained variation,

unexplained variation, and total variation from

#Ã¢â‚¬â„¢s 6, 7, and 8).

Ã°Ââ€˜Â Ã°Ââ€˜â€™

5. Find the 99% prediction interval for the test score of a person who spent 7 hours preparing for the

test given that E=34.677. Interpret the result.

Solution:

Instructor Comments:

With the margin of error already given to us in

the problem all we need to do to construct the

confidence interval is subtract this from and add

this to the point estimate. Since we are using a

99% confidence level that would give Ã°Ââ€ºÂ¼ = 0.01 so

our point estimate is Ã°Ââ€˜Â¦ÃŒâ€¦ = 66.4:

Ã°Ââ€˜Â¦ÃŒâ€¦ Ã¢Ë†â€™ Ã°ÂÂÂ¸ < Ã°Ââ€˜Â¦ < Ã°Ââ€˜Â¦ÃŒâ€¦ + Ã°ÂÂÂ¸
66.4 Ã¢Ë†â€™ 34.677 < Ã°Ââ€˜Â¦ < 66.4 + 34.677
31.723 < Ã°Ââ€˜Â¦ < 101.077
This tells us that we can be 99% confident that a
student who studies for 4 hours will score
between 31.723 and 101.077.
6. Find the explained variation.
Solution:
Instructor Comments:
Explained Variation = 392.507
7. Find the unexplained variation.
Solution:
Instructor Comments:
Unexplained variation = 72.693
8. Find the total variation.
Solution:
Instructor Comments:
Total variation = 465.2
9. Find the value of r2 and explain its meaning.
Solution:
Instructor Comments:
Ã°Ââ€˜Å¸ 2 = 0.8437
This is called the coefficient of determination. It
is the proportion of the explained variation out of
total variation. It tells us the amount of change in
the y variable that correlates to a change in the x
variable.
10. If the data point ( 3, 100) is added to the data set, how would this effect the results of the regression
analysis? Is this data point an outlier, influential point or both? Explain.
Solution:
Instructor
Comments:
Hours of Study v Test Score
120
100
80
60
40
20
0
0
2
4
6
8
10
Adding in this point appears to significantly affect the regression line. The line
would have to move up quite a bit to accommodate for the new point added to
the scatter-plot. Shifting the regression line like this make it an influential point.
However, this point is still within the range of x values weÃ¢â‚¬â„¢re already using so it
is not an outlier. An outlier would need to be significantly outside of the current
range of values.
SUMMARY OUTPUT
Regression Statistics
Multiple R 0.918552
R Square
0.843738
Adjusted R Square
0.791651
Standard Error
4.922499 Standard Error
Observations
5
ANOVA
df
Regression
Residual
Total
SS
MS
F
Significance F
1 392.507 392.507 16.19855 0.02756
3 72.69298 24.23099
4
465.2
Explained variation
Unexplained variation
Total variation
Coefficients
Standard Error t Stat
P-value Lower 95% Upper 95%Lower 95.0%
Upper 95.0%
Intercept 40.67544 6.760089 6.016997 0.009199 19.16182 62.18906 19.16182 62.18906
X Variable 1 4.149123 1.030904 4.024742 0.02756 0.868326 7.429919 0.868326 7.429919
Data
Hours
Score
6
3
8
5
9
3
1r
b_0
b_1
69
50
68
65
80
100
0.918552
correlation coefficient
40.67544
4.149123
y-intercept
slope
90
80
70
60
50
40
30
20
10
0
0
3x
y-hat
4
57.27193
y-bar
66.4
5E
lower bd
upper bd
9 r^2
hours of study
best predicted value using regression equation
average test score
34.677
31.723
101.077
0.843738
coefficient of determination
1
2
Hours of Study v Test Score
Hours of Study v Test S
120
100
80
60
40
20
2
3
4
5
6
7
8
9
10
0
0
1
2
3
Hours of Study v Test Score
4
5
6
7
8
9
10
Purchase answer to see full
attachment