MAT 243 Project Three Summary Report

Southern New Hampshire University

1. Introduction

Discuss the statement of the problem in terms of the statistical analyses that are being performed. Be sure to address the following:

What is the data set that you are exploring?

How will your results be used?

What type of analyses will you be running in this project?

2. Data Preparation

There are some important variables that are used in this project. Identify and explain these variables.

See the introductory section and Step 1 of the Python script to address the following:

What does the variable avg_pts_differential represent? How would you explain it to someone who does not understand the data?

What does the variable avg_elo_n represent? How would you explain it to someone who does not understand the data?

3. Scatterplot and Correlation for the Total Number of Wins and Average Relative Skill

You constructed a scatterplot of the total number of wins and the average relative skill to study their correlation. You also calculated the Pearson correlation coefficient along with its P-value.

See Step 2 in the Python script to address the following items:

In general, how are data visualization techniques used to study relationship trends between two variables?

How is the correlation coefficient used to measure the strength and direction of the association between two variables?

In this activity, you generated a scatterplot of the total number of wins and the average relative skill. Include a screenshot of this plot in your report.

What do the scatterplot and the Pearson correlation coefficient tell you about the association between total number of wins and average relative skill?

Is the correlation coefficient statistically significant based on the P-value? Use a 1% level of significance.
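As an illustration of how the correlation coefficient captures strength and direction, the sketch below computes Pearson's r from its definition, together with the t statistic that underlies its P-value. The five data points are invented for illustration and are not taken from the project data set, and `pearson_r` and `t_statistic` are hypothetical helper names rather than functions from the course script.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient computed from its definition."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def t_statistic(r, n):
    """Test statistic for H0: rho = 0, with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1 - r ** 2))

# Made-up values for illustration only (not the project data).
avg_elo = [1400, 1450, 1500, 1550, 1600]
total_wins = [30, 36, 41, 44, 52]

r = pearson_r(avg_elo, total_wins)
t = t_statistic(r, len(avg_elo))
print(f"r = {r:.4f}, t = {t:.2f}")
```

The sign of r gives the direction of the association and its magnitude gives the strength. The P-value reported by the script is then compared with the 0.01 level of significance.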

4. Simple Linear Regression: Predicting the Total Number of Wins using Average Relative Skill

You created a simple linear regression model for the total number of wins in a regular season using the average relative skill as the predictor variable.

See Step 3 in the Python script to address the following items:

In general, how is a simple linear regression model used to predict the response variable using the predictor variable?

What is the equation for your model?

What are the results of the overall F-test? Summarize all important steps of this hypothesis test.

This includes:

a. Null Hypothesis (statistical notation and its description in words)

b. Alternative Hypothesis (statistical notation and its description in words)

c. Level of Significance

d. Report the test statistic and the P-value in a formatted table as shown below:

Table 1: Hypothesis Test for the Overall F-Test

Statistic         Value
Test Statistic    X.XX
P-value           X.XXXX

e. Conclusion of the hypothesis test and its interpretation based on the P-value
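For a simple linear regression with a single predictor, the overall F-test hypotheses are commonly written as below. The symbols are conventional notation rather than anything taken from the course script, with \beta_1 standing for the slope on average relative skill.

```latex
\begin{align*}
\text{Model: } & \text{total wins} = \beta_0 + \beta_1 \cdot \text{avg\_elo\_n} + \varepsilon \\
H_0: & \; \beta_1 = 0 \\
H_a: & \; \beta_1 \neq 0
\end{align*}
```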

Based on the results of the overall F-test, can average relative skill predict the total number of wins in the regular season?

What is the predicted total number of wins in a regular season for a team that has an average relative skill of 1550? Round your answer down to the nearest integer.

What is the predicted number of wins in a regular season for a team that has an average relative skill of 1450? Round your answer down to the nearest integer.
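The prediction questions above amount to evaluating the fitted line at a given skill level and rounding down. The sketch below is a minimal illustration that fits the line by ordinary least squares on invented data; the values and the helper names `fit_simple_regression` and `predicted_wins` are assumptions, and the course script may fit the model differently.

```python
import math

def fit_simple_regression(x, y):
    """Least-squares intercept and slope for y = b0 + b1 * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

def predicted_wins(intercept, slope, avg_elo):
    """Evaluate the fitted line and round down to a whole number of wins."""
    return math.floor(intercept + slope * avg_elo)

# Made-up values for illustration only (not the project data).
avg_elo = [1400, 1450, 1500, 1550, 1600]
total_wins = [30, 36, 41, 44, 52]

b0, b1 = fit_simple_regression(avg_elo, total_wins)
print(f"total_wins = {b0:.4f} + {b1:.4f} * avg_elo_n")
print(predicted_wins(b0, b1, 1550), predicted_wins(b0, b1, 1450))
```

`math.floor` is used instead of `round` because the prompt asks for the answer to be rounded down.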

5. Scatterplot and Correlation for the Total Number of Wins and Average Points Scored

You constructed a scatterplot of total number of wins and average points scored. You also calculated the Pearson correlation coefficient along with its P-value.

See Step 4 in the Python script to answer the following questions:

In this activity, you generated a scatterplot of the total number of wins and average points scored. Include a screenshot of this plot in your report.

What do the scatterplot and the Pearson correlation coefficient tell you about the association between total number of wins and average points scored?

Is the correlation coefficient statistically significant based on the P-value? Use a 1% level of significance.

6. Multiple Regression: Predicting the Total Number of Wins using Average Points Scored and Average Relative Skill

You created a multiple regression model with the total number of wins as the response variable, with average points scored and average relative skill as predictor variables.

See Step 5 in the Python script to answer the following questions:

In general, how is a multiple linear regression model used to predict the response variable using predictor variables?

What is the equation for your model?

What are the results of the overall F-test? Summarize all important steps of this hypothesis test.

This includes:

a. Null Hypothesis (statistical notation and its description in words)

b. Alternative Hypothesis (statistical notation and its description in words)

c. Level of Significance

d. Report the test statistic and the P-value in a formatted table as shown below:

Table 2: Hypothesis Test for the Overall F-Test

Statistic         Value
Test Statistic    X.XX
P-value           X.XXXX

e. Conclusion of the hypothesis test and its interpretation based on the P-value
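The overall F statistic can be related to the coefficient of determination through F = (R^2 / k) / ((1 - R^2) / (n - k - 1)), where k is the number of predictors and n the number of observations. The sketch below applies that identity and the usual decision rule; the numbers are placeholders rather than project results, and the function names are hypothetical.

```python
def overall_f_statistic(r_squared, n_obs, n_predictors):
    """Overall F statistic computed from R^2.

    F = (R^2 / k) / ((1 - R^2) / (n - k - 1)), with k and n - k - 1
    degrees of freedom.
    """
    df_model = n_predictors
    df_resid = n_obs - n_predictors - 1
    return (r_squared / df_model) / ((1.0 - r_squared) / df_resid)

def reject_null(p_value, alpha=0.01):
    """Decision rule: reject H0 when the P-value is below alpha."""
    return p_value < alpha

# Placeholder numbers for illustration only (not project results).
f_stat = overall_f_statistic(r_squared=0.80, n_obs=100, n_predictors=2)
print(f"F = {f_stat:.2f}")
print(reject_null(p_value=0.0004))
```

The P-value itself comes from the F distribution with those degrees of freedom and is normally read from the regression output rather than computed by hand.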

Based on the results of the overall F-test, is at least one of the predictors statistically significant in predicting the total number of wins in the season?

What are the results of individual t-tests for the parameters of each predictor variable? Is each of the predictor variables statistically significant based on its P-value? Use a 1% level of significance.

Report and interpret the coefficient of determination.

What is the predicted total number of wins in a regular season for a team that is averaging 75 points per game with an average relative skill level of 1350?

What is the predicted total number of wins in a regular season for a team that is averaging 100 points per game with an average relative skill level of 1600?
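To make the mechanics of a two-predictor model concrete, the sketch below fits one by solving the normal equations in plain Python, then reports R^2 and a prediction. It is an illustration only: the six rows are generated from a known plane (wins = -75 + 0.4 * points + 0.06 * skill) so the fit can be checked by eye, they are not project data, and the course script presumably relies on a statistics library instead of this hand-rolled solver.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are not modified.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x

def fit_ols(rows, y):
    """Return [intercept, b1, b2, ...] from the normal equations."""
    x = [[1.0] + list(row) for row in rows]
    k = len(x[0])
    xtx = [[sum(xi[i] * xi[j] for xi in x) for j in range(k)] for i in range(k)]
    xty = [sum(xi[i] * yi for xi, yi in zip(x, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)

def predict(coeffs, row):
    """Evaluate intercept + sum of coefficient * predictor value."""
    return coeffs[0] + sum(c * v for c, v in zip(coeffs[1:], row))

def r_squared(coeffs, rows, y):
    """Coefficient of determination: 1 - SSE / SST."""
    mean_y = sum(y) / len(y)
    sse = sum((yi - predict(coeffs, row)) ** 2 for row, yi in zip(rows, y))
    sst = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - sse / sst

# Made-up (average points, average relative skill) pairs and noise-free wins.
rows = [(75, 1350), (80, 1400), (90, 1450), (95, 1550), (100, 1600), (85, 1500)]
wins = [36, 41, 48, 56, 61, 49]

coeffs = fit_ols(rows, wins)
r2 = r_squared(coeffs, rows, wins)
print([round(c, 4) for c in coeffs], round(r2, 4))
print(round(predict(coeffs, (100, 1600)), 2))
```

Because the illustrative data contain no noise, R^2 comes out essentially equal to 1; real data will give a smaller value, interpreted as the share of variation in total wins explained by the predictors.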

7. Multiple Regression: Predicting the Total Number of Wins using Average Points Scored, Average Relative Skill, Average Points Differential, and Average Relative Skill Differential

You created a multiple regression model with the total number of wins as the response variable, with average points scored, average relative skill, average points differential, and average relative skill differential as predictor variables.

See Step 6 in the Python script to answer the following questions:

In general, how is a multiple linear regression model used to predict the response variable using predictor variables?

What is the equation for your model?

What are the results of the overall F-test? Summarize all important steps of this hypothesis test.

This includes:

a. Null Hypothesis (statistical notation and its description in words)

b. Alternative Hypothesis (statistical notation and its description in words)

c. Level of Significance

d. Report the test statistic and the P-value in a formatted table as shown below:

Table 3: Hypothesis Test for the Overall F-Test

Statistic         Value
Test Statistic    X.XX
P-value           X.XXXX

e. Conclusion of the hypothesis test and its interpretation based on the P-value

Based on the results of the overall F-test, is at least one of the predictors statistically significant in predicting the total number of wins in the season?

What are the results of individual t-tests for the parameters of each predictor variable? Is each of the predictor variables statistically significant based on its P-value? Use a 1% level of significance.

Report and interpret the coefficient of determination.

What is the predicted total number of wins in a regular season for a team that is averaging 75 points per game with an average relative skill level of 1350, an average points differential of -5, and an average relative skill differential of -30?

What is the predicted total number of wins in a regular season for a team that is averaging 100 points per game with an average relative skill level of 1600, an average points differential of +5, and an average relative skill differential of +95?
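Once the four-predictor equation has been estimated, each prediction is the intercept plus the sum of every coefficient times the corresponding team average. The sketch below shows that arithmetic for the two scenarios above. The coefficient values are hypothetical placeholders, not estimates from the project data, and the names `avg_pts` and `avg_elo_differential` are assumed by analogy with `avg_pts_differential` and `avg_elo_n`.

```python
def predict_total_wins(coefficients, team):
    """Evaluate a fitted regression equation for one team's averages."""
    return coefficients["Intercept"] + sum(
        coefficients[name] * value for name, value in team.items()
    )

# Hypothetical placeholder coefficients; replace with the Step 6 estimates.
coefficients = {
    "Intercept": 10.0,
    "avg_pts": 0.2,
    "avg_elo_n": 0.01,
    "avg_pts_differential": 1.5,
    "avg_elo_differential": 0.02,
}

# The two scenarios described in the questions above.
team_a = {
    "avg_pts": 75,
    "avg_elo_n": 1350,
    "avg_pts_differential": -5,
    "avg_elo_differential": -30,
}
team_b = {
    "avg_pts": 100,
    "avg_elo_n": 1600,
    "avg_pts_differential": 5,
    "avg_elo_differential": 95,
}

print(round(predict_total_wins(coefficients, team_a), 2))
print(round(predict_total_wins(coefficients, team_b), 2))
```

Keeping the coefficients in a dictionary keyed by variable name makes it harder to pair a coefficient with the wrong predictor when there are four of them.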

8. Conclusion

Describe the results of the statistical analyses clearly, using proper descriptions of statistical terms and concepts. Fully describe what these results mean for your scenario.

Briefly summarize your findings in plain language.

What is the practical importance of the analyses that were performed?

9. Citations

