Project Goals

The main goal of this project is to help students to build skills in statistical analysis by applying the

descriptive statistics tools to estimate the mean COVID-19 Cases per 100,000 people

(C19CP100000) and the mean COVID-19 Proportion of Total Deaths in Total Cases (C19PTDITC)

for each of your two selected US states, and then use those estimates and inferential

statistics to test the difference in COVID-19 incidences across the two selected states. Students are

expected to write their final research report which must describe the population of interest to the

analysis, the data collection procedure, the implementation of the statistical procedure to estimate

the population parameters (mean C19CP100000 and the mean C19PTDITC) using the sample data,

the interpretation of the results, and the policy recommendations.

Learning objectives

Upon completing this research project, the student will be able to:

– Collect and use data in the decision-making process;

– Calculate descriptive statistics;

– Use the Central Limit Theorem to identify the probability distributions of statistics;

– Conduct statistical inference to determine behaviors of population parameters using sample data;

– Interpret the results of analysis; and

– Make policy recommendations.

Problem Statement

The coronavirus disease 2019 (COVID-19), which appeared first in China in late 2019, has spread

quickly across the world, causing in its wake significant health, economic, demographic, and social

disruptions. What was initially seen as a largely China-centric shock has ballooned into a full-blown

global crisis. On March 11, 2020, the World Health Organization (WHO) declared COVID-19 a global

pandemic. COVID-19 has brought forth new challenges such as social distancing, requirements to

wear masks in public places, teleworking, prohibition of large-scale social events, travel restrictions,

and others. Overcoming those challenges has proved to be the best way to contain the spread of the

pandemic and protect lives. In the particular case of the United States, each state has set forth

strategies to contain the spread of the disease and to reduce the number of deaths.

Project Description

You are tasked with determining whether there exists a difference in COVID-19 incidences

across two US states of your choice using COVID-19 data, namely, Cases per 100,000 people,

Total Deaths, and Total Cases.

To complete your project, you will use secondary data from the CDC COVID Data Tracker – 2020

(https://covid.cdc.gov/covid-data-tracker/#county-map) to estimate the difference in COVID-19

incidences across two states. You will also have to test the hypothesis of no difference in

COVID-19 incidences across two states.

Steps for conducting the statistical analysis are described below.

1. Data collection and visualization

For this project, you need to download COVID-19 data using the link provided above. Once on

the data page, you will be prompted to enter your state. Data on counties of the state will be

displayed. Your variables of interest are Cases per 100,000, Total Cases and Total Deaths. Select

a simple random sample whose size is one third of the total number of counties. If one third of the

counties is fewer than 20, increase the sample to 20 counties by randomly selecting the additional

counties needed. If the state has fewer than 20 counties in total, please choose a different state.

Please follow the same procedure to select the sample for the other state. Next, plot the two

samples in the same chart (visualization) to detect whether or not there exist differences in Cases

per 100,000 people, Total Cases, and Total Deaths across the two states. The visualizations

should be presented using SPSS visualizations.
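The sample-size rule described above, and the derived variable C19PTDITC, can be sketched in Python. This is only a minimal illustration and does not replace the required SPSS work. Rounding one third up to the next whole county is an assumption (the assignment does not say how to round), and the function names `sample_size`, `draw_sample`, and `death_proportion`, as well as the placeholder county names, are hypothetical.

```python
import random


def sample_size(total_counties: int, minimum: int = 20) -> int:
    """Counties to sample: one third of the total, but never fewer than `minimum`."""
    if total_counties < minimum:
        raise ValueError("Fewer than 20 counties: choose a different state.")
    one_third = (total_counties + 2) // 3  # assumption: round one third up
    return max(one_third, minimum)


def draw_sample(counties, seed=None):
    """Simple random sample (without replacement) of county records."""
    rng = random.Random(seed)  # fixing the seed makes the draw reproducible
    return rng.sample(list(counties), sample_size(len(counties)))


def death_proportion(total_deaths: float, total_cases: float) -> float:
    """C19PTDITC for one county: Total Deaths divided by Total Cases."""
    if total_cases <= 0:
        raise ValueError("Total Cases must be positive.")
    return total_deaths / total_cases


# Usage with placeholder county names (not real data).
counties = [f"County {i}" for i in range(1, 68)]  # a state with 67 counties
sample = draw_sample(counties, seed=1)
print(len(sample))  # 23, because one third of 67 rounds up to 23
```

Recording the seed in the report would let a reader reproduce the same random selection of counties.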

To complete the SPSS visualization, each student must complete five modules of Statistics 101

from the following link https://cognitiveclass.ai/courses/statistics101/

Upon the completion of Statistics 101, each student must print the certificate of completion and

attach it as an appendix to the written project report.

2. Estimation of the mean, variance and standard deviation for each of the two COVID-19 variables

The estimates of the mean C19CP100000, their standard deviations, and their sample sizes

are the inputs needed to calculate the point estimate and the interval estimate of the C19CP100000

differentials (use the confidence level of your choice, preferably between 95% and 99%).

Likewise, the estimates of the mean C19PTDITC, their standard deviations, and their sample

sizes are the inputs needed to calculate the point estimate and the interval estimate of the C19PTDITC

differentials (use the confidence level of your choice, preferably between 95% and 99%). If the

sample size of each state is 30 or more, assume that the standard deviation from the sample is the

same as the population standard deviation and use the Z distribution to construct the confidence

interval. But if the sample size of either state is less than 30, use the t distribution to construct

the confidence interval.
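The estimation step above can be sketched as follows. This is a hedged illustration, not the required procedure: it assumes the unpooled (unequal-variance) standard error for the difference of two means, and for samples smaller than 30 it expects the t critical value to be looked up in a t table and passed in as `t_critical`. The names `summarize` and `diff_confidence_interval` and the input numbers are hypothetical.

```python
import math
import statistics
from statistics import NormalDist


def summarize(values):
    """Return (mean, sample variance, sample standard deviation, n)."""
    return (
        statistics.mean(values),
        statistics.variance(values),  # divides by n - 1
        statistics.stdev(values),
        len(values),
    )


def diff_confidence_interval(sample_a, sample_b, confidence=0.95, t_critical=None):
    """Point estimate, margin of error, and interval for mean(A) - mean(B)."""
    mean_a, var_a, _, n_a = summarize(sample_a)
    mean_b, var_b, _, n_b = summarize(sample_b)
    point_estimate = mean_a - mean_b
    # Assumption: unpooled standard error (variances not assumed equal).
    standard_error = math.sqrt(var_a / n_a + var_b / n_b)
    if min(n_a, n_b) >= 30:
        # Large samples: use the Z distribution, as the assignment instructs.
        critical = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    elif t_critical is not None:
        # Small samples: critical value taken from a t table by the analyst.
        critical = t_critical
    else:
        raise ValueError("Sample below 30: supply a t critical value.")
    margin = critical * standard_error
    return point_estimate, margin, (point_estimate - margin, point_estimate + margin)


# Usage with made-up numbers standing in for C19CP100000 values of two states.
state_a = [float(i) for i in range(1, 31)]
state_b = [float(i) for i in range(11, 41)]
estimate, margin, interval = diff_confidence_interval(state_a, state_b, confidence=0.95)
print(estimate, margin, interval)
```

The same function applies unchanged to the C19PTDITC values of the two samples.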

Next, reduce the margin of error by 75% and calculate the sample size needed to achieve such a

target. Finally, reconstruct the confidence intervals of the estimates of the C19CP100000 differential

that would result from such a sample. Repeat the same procedure for the C19PTDITC

differentials.
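One way to reason about the sample-size target above: the margin of error is proportional to 1/sqrt(n), so cutting it by 75% (leaving 25% of the original) multiplies the required sample size by 1/0.25^2 = 16. The sketch below assumes the standard deviations and the critical value stay unchanged and that both state samples grow by the same factor; the function name `required_sample_size` is hypothetical.

```python
import math


def required_sample_size(current_n: int, reduction: float = 0.75) -> int:
    """Sample size that shrinks the margin of error by `reduction` (0.75 = 75%)."""
    if not 0 <= reduction < 1:
        raise ValueError("reduction must be in [0, 1).")
    remaining = 1.0 - reduction  # fraction of the original margin that is left
    # Margin of error scales with 1 / sqrt(n), so n scales with 1 / remaining**2.
    return math.ceil(current_n / remaining ** 2)


print(required_sample_size(30))  # 480, i.e. 16 times the original 30
```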

3. Hypothesis testing of the non-existence of COVID-19 Incidences differentials

In this step, the hypothesis testing procedure will be implemented to test the nonexistence of

COVID-19 incidences differentials for each of the two variables. The hypothesis of nonexistence of COVID-19 incidences differentials will be tested against the alternative hypothesis

of existence of COVID-19 incidences differentials. This step is crucial since it helps to determine

whether or not the observed estimate of the COVID-19 incidences differentials is due to

random error. Choose a confidence level between 95% and 99% to conduct your hypothesis

testing. Also, follow the same guidelines highlighted in point 2 to determine the type of

distribution to be used in hypothesis testing. The hypothesis testing procedure is summarized

below.

– Determine the null and alternative hypotheses.

– Choose the level of significance (preferably, set α = 0.05).

– Validate the assumptions of the hypothesis test, identify the appropriate test statistic, and

compute its value (compute the P-value)

– Use the graphs to determine whether you should conduct a two-sample test of the means with

equal or unequal variances.

– Compare the value of your statistic to the theoretical value (from the statistical tables)

– Make a decision to reject or fail to reject the null hypothesis

– State the conclusion
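The procedure summarized above can be sketched for the null hypothesis of no difference in means (two-sided alternative). This sketch assumes unequal variances; if the graphs suggest equal variances, a pooled-variance test would be used instead. For samples of 30 or more it uses the Z distribution and reports a P-value; for smaller samples it returns the Welch-Satterthwaite degrees of freedom so the critical value can be read from a t table and passed in as `t_critical`. The function name `two_sample_test` and the input numbers are hypothetical.

```python
import math
import statistics
from statistics import NormalDist


def two_sample_test(sample_a, sample_b, alpha=0.05, t_critical=None):
    """Two-sided test of H0: mean(A) - mean(B) = 0, unequal variances assumed."""
    n_a, n_b = len(sample_a), len(sample_b)
    term_a = statistics.variance(sample_a) / n_a
    term_b = statistics.variance(sample_b) / n_b
    standard_error = math.sqrt(term_a + term_b)
    statistic = (statistics.mean(sample_a) - statistics.mean(sample_b)) / standard_error
    # Welch-Satterthwaite degrees of freedom, useful for the t table lookup.
    df = (term_a + term_b) ** 2 / (term_a ** 2 / (n_a - 1) + term_b ** 2 / (n_b - 1))
    if min(n_a, n_b) >= 30:
        critical = NormalDist().inv_cdf(1 - alpha / 2)
        p_value = 2 * (1 - NormalDist().cdf(abs(statistic)))
    elif t_critical is not None:
        critical = t_critical
        p_value = None  # read the P-value from a t table or software
    else:
        raise ValueError(f"Sample below 30: supply a t critical value (df is about {df:.1f}).")
    return {
        "statistic": statistic,
        "critical": critical,
        "df": df,
        "p_value": p_value,
        "reject_null": abs(statistic) > critical,
    }


# Usage with made-up numbers standing in for one variable in two states.
state_a = [float(i) for i in range(1, 31)]
state_b = [float(i) for i in range(11, 41)]
result = two_sample_test(state_a, state_b, alpha=0.05)
print(result["statistic"], result["p_value"], result["reject_null"])
```

Rejecting the null hypothesis here means the observed differential is unlikely to be explained by random error alone at the chosen significance level.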

4. Interpretation of results

Describe the meaning of your results and how they can be used for policy recommendations.

Project Grading/Evaluation

– This project will be graded out of 100 points and will contribute 10% to your final grade in this course.

– The key success factor for this project is to use correct, cleaned data and demonstrate a

systematic approach to data analysis by using the appropriate tools.

– This project should be completed in Excel or SPSS (or Tableau). A free version of SPSS (or Tableau) is

available for Statistics 101 on IBM Cognitive Class. You should complete the course and prove its

completion by attaching your certificate of completion to the final report. You should also explain the

rationale for adopting a particular method of analysis.

– The report must be a typed paper with multiple line spacing (at 1.15) that contains an introduction, a section describing your

methodology, a data analysis section and a conclusion section that summarizes the results of your

analysis. The formulas used should be shown in detail, and the calculations shown clearly. All cited work

and sources of information must be listed in the reference list.

– You should each keep a log of what you have been assigned to do and what you have accomplished.

– The project will be evaluated by me and you will receive a discounted grade if there are significant

discrepancies.

– The assessment rubric is attached.

Format

Each project will be 5 pages maximum (appendix not included) and must be written using the following

guidelines and contents:

– Title page (Include project title and your name)

– Introduction: Problem of the proposed study, purpose and justification of the study

– Methodology

– Data Collection and Cleaning

– Data analysis

– Interpretation of results

– Findings and conclusion.

– Appendices: Tables, Figures, Certificate of completion (Statistics 101)

– References

Font must be Times New Roman (or Calibri) and font size must be 12. The line spacing must be multiple

at 1.15. The spacing before must be 6 pt and the spacing after must be 6 pt.

The project must be written using the MLA style.
