1.1
1. The data shown below consists of the price (in dollars) of 7 events at a local venue and
the number of people who attended. Determine if there is a significant negative linear
correlation between ticket price and number of attendees. Use a significance level of
0.01 and round all values to 4 decimal places.

Ticket Price   Attendance
6              151
10             144
14             148
18             143
22             140
26             140
30             138
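As a cross-check on calculator work, the table above can be run through a short script. This is a minimal sketch in plain Python and is not part of the original assignment: the function names `pearson_r`, `t_pdf`, and `left_tail_p` are illustrative, and the p-value is obtained by numerically integrating the Student t density with n - 2 degrees of freedom (Simpson's rule) instead of calling a statistics library.

```python
import math

# Ticket price and attendance data from the table above.
price = [6, 10, 14, 18, 22, 26, 30]
attendance = [151, 144, 148, 143, 140, 140, 138]


def pearson_r(x, y):
    """Sample linear correlation coefficient r."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + t * t / df) ** (-(df + 1) / 2)


def left_tail_p(r, n, steps=2000):
    """P(T <= t) for t = r*sqrt(n-2)/sqrt(1-r^2); assumes |r| < 1 and even steps."""
    df = n - 2
    t = r * math.sqrt(df) / math.sqrt(1 - r * r)
    h = abs(t) / steps
    area = t_pdf(0, df) + t_pdf(abs(t), df)
    for i in range(1, steps):
        area += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area *= h / 3  # Simpson's rule: integral of the density from 0 to |t|
    return 0.5 - area if t < 0 else 0.5 + area


r = pearson_r(price, attendance)
p = left_tail_p(r, len(price))
print(f"r = {r:.4f}, left-tailed p-value = {p:.4f}")
print("Reject Ho" if p <= 0.01 else "Do Not Reject Ho")
```

The left tail is used because the alternative hypothesis is ρ < 0; for a two-sided alternative the smaller tail area would be doubled.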
Ho: ρ = 0
Ha: ρ < 0

Find the Linear Correlation Coefficient
r = ____

Find the p-value
p-value = ____

The p-value is
• Less than (or equal to) α
• Greater than α

The p-value leads to a decision to
• Do Not Reject Ho
• Accept Ho
• Reject Ho

The conclusion is
• There is a significant positive linear correlation between ticket price and attendance.
• There is insufficient evidence to make a conclusion about the linear correlation between ticket price and attendance.
• There is a significant negative linear correlation between ticket price and attendance.
• There is a significant linear correlation between ticket price and attendance.

2. A study was done to look at the relationship between number of movies people watch at the theater each year and the number of books that they read each year. The results of the survey are shown below.

Movies   1   8   7   8   10   0   5    7   8
Books    8   3   7   4   4    9   11   6   5

• Find the correlation coefficient: r = ____ (Round to 2 decimal places.)
• The null and alternative hypotheses for correlation are:
  H0: ____ = 0
  H1: ____ ≠ 0
  The p-value is: ____ (Round to 4 decimal places.)
• Use a level of significance of α = 0.05 to state the conclusion of the hypothesis test in the context of the study.
  a. There is statistically significant evidence to conclude that a person who watches fewer movies will read fewer books than a person who watches fewer movies.
  b. There is statistically significant evidence to conclude that a person who watches more movies will read fewer books than a person who watches fewer movies.
  c. There is statistically significant evidence to conclude that there is a correlation between the number of movies watched per year and the number of books read per year. Thus, the regression line is useful.
  d. There is statistically insignificant evidence to conclude that there is a correlation between the number of movies watched per year and the number of books read per year. Thus, the use of the regression line is not appropriate.
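The remaining parts of this problem rest on the least-squares line ŷ = a + bx, where b = Sxy / Sxx and a = ȳ - b·x̄, and on r² = Sxy² / (Sxx · Syy). The following is a minimal sketch in plain Python, offered as a way to check hand calculations and not as the assignment's intended method; the function name `least_squares` is illustrative.

```python
# Movies and books data from the survey table above.
movies = [1, 8, 7, 8, 10, 0, 5, 7, 8]
books = [8, 3, 7, 4, 4, 9, 11, 6, 5]


def least_squares(x, y):
    """Return (intercept, slope, r_squared) for the line y-hat = intercept + slope * x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    return intercept, slope, sxy ** 2 / (sxx * syy)


intercept, slope, r_squared = least_squares(movies, books)
print(f"y-hat = {intercept:.2f} + ({slope:.2f})x, r^2 = {r_squared:.2f}")

# Prediction for someone who watches 2 movies per year.
predicted = intercept + slope * 2
print(f"predicted books per year: {predicted:.0f}")
```

The same function applies unchanged to the other paired data sets in this section by swapping in their two lists.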
• r² = ____ (Round to two decimal places)
• Interpret r²:
  a. There is a large variation in the number books people read each year, but if you only look at people who watch a fixed number of movies each year, this variation on average is reduced by 52%.
  b. Given any fixed number of movies watched per year, 52% of the population reads the predicted number of books per year.
  c. 52% of all people watch about the same number of movies as they read books each year.
  d. There is a 52% chance that the regression line will be a good predictor for the number of books people read based on the number of movies they watch each year.
• The equation of the linear regression line is: ŷ = ____ + ____ x (Please show your answers to two decimal places)
• Use the model to predict the number of books read per year for someone who watches 2 movies per year.
  Books per year = ____ (Please round your answer to the nearest whole number.)
• Interpret the slope of the regression line in the context of the question:
  a. For every additional movie that people watch each year, there tends to be an average decrease of 0.57 books read.
  b. As x goes up, y goes down.
  c. The slope has no practical meaning since people cannot read a negative number of books.
• Interpret the y-intercept in the context of the question:
  a. The best prediction for a person who doesn't watch any movies is that they will read 10 books each year.
  b. The average number of books read per year is predicted to be 10 books.
  c. If someone watches 0 movies per year, then that person will read 10 books this year.
  d. The y-intercept has no practical meaning for this study.

3. A study was done to look at the relationship between number of vacation days employees take each year and the number of sick days they take each year. The results of the survey are shown below.

Vacation Days   1   11   4   8   1   4   3   6   4
Sick Days       4   1    5   2   9   2   6   2   4

a. Find the correlation coefficient: r = ____ (Round to 2 decimal places.)
b.
The null and alternative hypotheses for correlation are:
   H0: ____ = 0
   H1: ____ ≠ 0
   The p-value is: ____ (Round to four decimal places)
c. Use a level of significance of α = 0.05 to state the conclusion of the hypothesis test in the context of the study.
   o There is statistically insignificant evidence to conclude that there is a correlation between the number of vacation days taken and the number of sick days taken. Thus, the use of the regression line is not appropriate.
   o There is statistically significant evidence to conclude that an employee who takes more vacation days will take fewer sick days than an employee who takes fewer vacation days.
   o There is statistically significant evidence to conclude that an employee who takes more vacation days will take more sick days than an employee who takes fewer vacation days.
   o There is statistically significant evidence to conclude that there is a correlation between the number of vacation days taken and the number of sick days taken. Thus, the regression line is useful.
d. r² = ____ (Round to two decimal places)
e. Interpret r²:
   o There is a large variation in the number of sick days employees take, but if you only look at employees who take a fixed number of vacation days, this variation on average is reduced by 57%.
   o 57% of all employees will take the average number of sick days.
   o Given any group with a fixed number of vacation days taken, 57% of all of those employees will take the predicted number of sick days.
   o There is a 57% chance that the regression line will be a good predictor for the number of sick days taken based on the number of vacation days taken.
f. The equation of the linear regression line is: ŷ = ____ + ____ x (Please show your answers to two decimal places)
g. Use the model to predict the number of sick days taken for an employee who took 8 vacation days this year.
   Sick Days = ____ (Please round your answer to the nearest whole number.)
h.
Interpret the slope of the regression line in the context of the question:
   o As x goes up, y goes down.
   o For every additional vacation day taken, employees tend to take on average 0.59 fewer sick days.
   o The slope has no practical meaning since a negative number cannot occur with vacation days and sick days.
i. Interpret the y-intercept in the context of the question:
   o The average number of sick days is predicted to be 7.
   o The best prediction for an employee who doesn't take any vacation days is that the employee will take 7 sick days.
   o The y-intercept has no practical meaning for this study.
   o If an employee takes no vacation days, then that employee will take 7 sick days.

4. What is the relationship between the number of minutes per day a woman spends talking on the phone and the woman's weight? The time on the phone and weight for 7 women are shown in the table below.

Time     75    79    15   40    72    39    80
Pounds   152   151   99   144   156   136   166

a. Find the correlation coefficient: r = ____ (Round to 2 decimal places.)
b. The null and alternative hypotheses for correlation are:
   H0: ____ = 0
   H1: ____ ≠ 0
   The p-value is: ____ (Round to four decimal places)
c. Use a level of significance of α = 0.05 to state the conclusion of the hypothesis test in the context of the study.
   o There is statistically insignificant evidence to conclude that a woman who spends more time on the phone will weigh more than a woman who spends less time on the phone.
   o There is statistically significant evidence to conclude that a woman who spends more time on the phone will weigh more than a woman who spends less time on the phone.
   o There is statistically insignificant evidence to conclude that there is a correlation between the time women spend on the phone and their weight. Thus, the use of the regression line is not appropriate.
   o There is statistically significant evidence to conclude that there is a correlation between the time women spend on the phone and their weight. Thus, the regression line is useful.
d. r² = ____ (Round to two decimal places)
e. Interpret r²:
   o Given any group of women who all weigh the same amount, 82% of all of these women will weigh the predicted amount.
   o There is an 82% chance that the regression line will be a good predictor for women's weight based on their time spent on the phone.
   o There is a large variation in women's weight, but if you only look at women with a fixed weight, this variation on average is reduced by 82%.
   o 82% of all women will have the average weight.
f. The equation of the linear regression line is: ŷ = ____ + ____ x (Please show your answers to two decimal places)
g. Use the model to predict the weight of a woman who spends 42 minutes on the phone.
   Weight = ____ (Please round your answer to the nearest whole number.)
h. Interpret the slope of the regression line in the context of the question:
   o As x goes up, y goes up.
   o The slope has no practical meaning since you cannot predict a woman's weight.
   o For every additional minute women spend on the phone, they tend to weigh on average 0.77 additional pounds.
i. Interpret the y-intercept in the context of the question:
   o The average woman's weight is predicted to be 100.
   o The best prediction for the weight of a woman who does not spend any time talking on the phone is 100 pounds.
   o The y-intercept has no practical meaning for this study.
   o If a woman does not spend any time talking on the phone, then that woman will weigh 100 pounds.

1.2

2. Here is a bivariate data set.

x   23    38   36    24   -17   11    13    27
y   -75   30   129   38   7     110   -48   38

Find the correlation coefficient and report it accurate to four decimal places.
r = ____

3. Here is a bivariate data set.

x    y
30   26
47   20
22   -48
40   17
4    104
21   21
30   -15
23   15
14   4

Find the correlation coefficient and report it accurate to four decimal places.
r = ____

6. The following table shows retail sales in drug stores in billions of dollars in the U.S. for years since 1995.

Year           0        3         6         9         12        15
Retail Sales   85.851   108.426   141.781   169.256   202.297   222.266

Let S(t) be the retail sales in billions of dollars in t years since 1995. A linear model for the data is F(t) = 9.44t + 84.182.

Estimate the retail sales in the U.S. in 2015.
____ billions of dollars.

Use the model to predict the year that corresponds to retail sales of $243 billion.
____

8. A regression analysis was performed to determine if there is a relationship between hours of TV watched per day (x) and number of sit ups a person can do (y). The results of the regression were:

y = ax + b
a = -1.386
b = 23.093
r² = 0.571536
r = -0.756

Use this to predict the number of sit ups a person who watches 9.5 hours of TV can do, and please round your answer to a whole number.

9. A regression was run to determine if there is a relationship between hours of study per week (x) and the final exam scores (y). The results of the regression were:

y = ax + b
a = 6.309
b = 29.15
r² = 0.763876
r = 0.874

Use this to predict the final exam score of a student who studies 3.5 hours per week, and please round your answer to a whole number.

10. Statistics students in Oxnard College sampled 11 textbooks in the Condor bookstore and recorded the number of pages in each textbook and its cost. The bivariate data are shown below:

Number of Pages (x)   817      551     951      452     794      528     300   423     854      373     792
Cost (y)              122.04   92.12   128.12   75.24   119.28   78.36   45    71.76   128.48   53.76   115.04

A student calculates a linear model
y = ____ x + ____ . (Please show your answers to two decimal places)

Use the model to estimate the cost when number of pages is 702.
Cost = $ ____ (Please show your answer to 2 decimal places.)

1.3

1. A researcher wishes to examine the relationship between years of schooling completed and the number of pregnancies in young women.
Her research discovers a linear relationship, and the least squares line is: ŷ = 3 − 5x, where x is the number of years of schooling completed and y is the number of pregnancies. The slope of the regression line can be interpreted in the following way:
• When amount of schooling increases by one year, the number of pregnancies tends to increase by 5.
• When amount of schooling increases by one year, the number of pregnancies tends to decrease by 3.
• When amount of schooling increases by one year, the number of pregnancies tends to decrease by 5.
• When amount of schooling increases by one year, the number of pregnancies tends to increase by 3.

2. Here is a bivariate data set.

x   12   38    26   29   -4    38   19
y   32   108   37   33   -32   90   79

Find the correlation coefficient and report it accurate to four decimal places.
r = ____

3. Choose the most appropriate completion of the sentence. In order to indicate a strong correlation between variables, the correlation coefficient will be
• near 1
• near -1
• near -1 or 1
• near 1/2
• near 10
• near 0

4. A study was done asking people how much money they spend per month on their natural gas bill and how much money per month they spend on their electric bill. The correlation r was found to be 0.94 and the p-value for correlation was 0.0003. Then a person with a high natural gas bill will also have a high electric bill.
• false
• true

5. A study was done on smoking and lung capacity. 200 smokers took part in a study that asked them how many cigarettes a day they smoked and then measured their lung capacity. The correlation was found to be r = −0.992. Based solely on this study it can be concluded that smoking causes lung cancer.
• true
• false

6. A study was done that looked at how much red meat people consumed and how long they lived. The correlation r was found to be 0.98 and the p-value for correlation was 0.0005.
Then a person who does not eat red meat will live longer than a person who has an 18 ounce steak every day.
• false
• true

7. A researcher found the correlation between age of death and number of cigarettes smoked per day to be -0.95. Based just on this information, the researcher can justly conclude that smoking causes early death.
• true
• false

8. If the equation of the regression line that relates percent blood alcohol, x, to reaction time in milliseconds, y, is ŷ = 36 − 1.3x, then the slope tells us that for every percent increase in blood alcohol, we can predict reaction time to go down by 1.3 milliseconds.
• true
• false

9. The table below shows the number of state-registered automatic weapons and the murder rate for several Northwestern states.

x   11.7   8.5    6.7   3.3   2.3   2.2   2.1   0.6
y   14     11.5   9.7   6.9   5.7   6.2   6.1   4.6

x = thousands of automatic weapons
y = murders per 100,000 residents

This data can be modeled by the equation y = 0.85x + 4.12. Use this equation to answer the following:
Special Note: I suggest you verify this equation by performing linear regression on your calculator.

A) How many murders per 100,000 residents can be expected in a state with 10.9 thousand automatic weapons?
Answer = ____ (Round to 3 decimal places.)

B) How many murders per 100,000 residents can be expected in a state with 8 thousand automatic weapons?
Answer = ____ (Round to 3 decimal places.)

10. The following table shows retail sales in drug stores in billions of dollars in the U.S. for years since 1995.

Year           0        3         6         9         12        15
Retail Sales   85.851   108.426   141.781   169.256   202.297   222.266

Let S(t) be the retail sales in billions of dollars in t years since 1995. A linear model for the data is F(t) = 9.44t + 84.182.

Use the above scatter plot to decide whether the linear model fits the data well.
• The function is not a good model for the data.
• The function is a good model for the data.

Estimate the retail sales in the U.S. in 2015.
____ billions of dollars.

Use the model to predict the year that corresponds to retail sales of $244 billion.
____

11. You wish to determine if there is a negative linear correlation between the age of a driver and the number of driver deaths. The following table represents the age of a driver and the number of driver deaths per 100,000. Use a significance level of 0.01 and round all values to 4 decimal places.

Driver Age   Number of Driver Deaths per 100,000
56           19
45           23
45           33
78           24
64           31
56           24
34           25
63           35
30           34

Ho: ρ = 0
Ha: ρ < 0

Find the Linear Correlation Coefficient
r = ____

Find the p-value
p-value = ____

The p-value is
• Greater than α
• Less than (or equal to) α

The p-value leads to a decision to
• Do Not Reject Ho
• Accept Ho
• Reject Ho

The conclusion is
• There is a significant positive linear correlation between driver age and number of driver deaths.
• There is a significant negative linear correlation between driver age and number of driver deaths.
• There is insufficient evidence to make a conclusion about the linear correlation between driver age and number of driver deaths.
• There is a significant linear correlation between driver age and number of driver deaths.

12. A biologist looked at the relationship between number of seeds a plant produces and the percent of those seeds that sprout. The results of the survey are shown below.

Seeds Produced   63     59     69     56   66   65     60   57
Sprout Percent   45.5   55.5   41.5   58   40   43.5   44   45.5

a. Find the correlation coefficient: r = ____ (Round to 2 decimal places.)
b. The null and alternative hypotheses for correlation are:
   H0: ____ = 0
   H1: ____ ≠ 0
   The p-value is: ____ (Round to four decimal places)
c. Use a level of significance of α = 0.05 to state the conclusion of the hypothesis test in the context of the study.
   o There is statistically insignificant evidence to conclude that a plant that produces more seeds will have seeds with a lower sprout rate than a plant that produces fewer seeds.
   o There is statistically insignificant evidence to conclude that there is a correlation between the number of seeds that a plant produces and the percent of the seeds that sprout. Thus, the use of the regression line is not appropriate.
   o There is statistically significant evidence to conclude that there is a correlation between the number of seeds that a plant produces and the percent of the seeds that sprout. Thus, the regression line is useful.
   o There is statistically significant evidence to conclude that a plant that produces more seeds will have seeds with a lower sprout rate than a plant that produces fewer seeds.
d. r² = ____ (Round to two decimal places)
e. Interpret r²:
   o 56% of all plants produce seeds whose chance of sprouting is the average chance of sprouting.
   o There is a large variation in the percent of seeds that sprout, but if you only look at plants that produce a fixed number of seeds, this variation on average is reduced by 56%.
   o There is a 56% chance that the regression line will be a good predictor for the percent of seeds that sprout based on the number of seeds produced.
   o Given any group of plants that all produce the same number of seeds, 56% of all of these plants will produce seeds with the same chance of sprouting.
f. The equation of the linear regression line is: ŷ = ____ + ____ x (Please show your answers to two decimal places)
g. Use the model to predict the percent of seeds that sprout if the plant produces 58 seeds.
   Percent sprouting = ____ (Please round your answer to the nearest whole number.)
h. Interpret the slope of the regression line in the context of the question:
   o For every additional seed that a plant produces, the chance for each of the seeds to sprout tends to decrease by 1.05 percent.
   o As x goes up, y goes down.
   o The slope has no practical meaning since it makes no sense to look at the percent of the seeds that sprout since you cannot have a negative number.
i.
Interpret the y-intercept in the context of the question:
   o The average sprouting percent is predicted to be 111.86.
   o If a plant produces no seeds, then that plant's sprout rate will be 111.86.
   o The best prediction for a plant that has 0 seeds is 111.86 percent.
   o The y-intercept has no practical meaning for this study.

1. Determine whether the following is an example of a sampling error or a non-sampling error. A sociologist surveyed 300 people about their level of anxiety on a scale of 1 to 100. Unfortunately, the person inputting the data into the computer accidentally transposed six of the numbers, causing the statistics to have errors.
• Non-Sampling Error
• Sampling Error

2. Suppose you want to estimate the percentage of videos on YouTube that are cat videos. It is impossible for you to watch all videos on YouTube, so you use a random video picker to select 1000 videos for you. You find that 2% of these videos are cat videos. Determine which of the following is an observation, a variable, a sample statistic, or a population parameter.

Whether or not a video is a cat video: a/an ____
• observation
• sample statistic
• variable
• population parameter
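
Several questions in sections 1.2 and 1.3 give a fitted line and ask either for a prediction at a given x or for the x that produces a given y. Both directions are one line of arithmetic. The sketch below uses the retail sales model F(t) = 9.44t + 84.182 from the text; the helper names `predict` and `solve_for_x` are illustrative only and are not part of the assignment.

```python
def predict(slope, intercept, x):
    """Evaluate the line y = slope * x + intercept at x."""
    return slope * x + intercept


def solve_for_x(slope, intercept, y):
    """Invert the line: the x at which the model reaches the value y."""
    return (y - intercept) / slope


# Retail sales model from the text: F(t) = 9.44t + 84.182, t = years since 1995.
sales_2015 = predict(9.44, 84.182, 2015 - 1995)
t_243 = solve_for_x(9.44, 84.182, 243)

print(f"estimated 2015 sales: {sales_2015:.3f} billion dollars")
print(f"$243 billion is reached about {t_243:.1f} years after 1995")
```

Adding the solved t to 1995 gives the calendar year; how to round a fractional year is left to the conventions of the course.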

  