# statistics

1. The U.S. Census Bureau needs to estimate the median income of females in the U.S. They collect incomes from 3500 females. Choose the best answer. Justify for full credit.

(a) Which of the following is the variable?

(i) Female in the US (ii) Income of a female in the US (iii) Set of income responses from all females in the US (iv) Median income of set of all females in the US

(b) Which of the following is the parameter?

(i) Female in the US (ii) Income of a female in the US (iii) Set of income responses from all females in the US (iv) Median income of set of all females in the US

2. Choose the best answer. Justify for full credit.

(a) Hotel ratings are usually on a scale from 0 stars to 5 stars. The level of this measurement is

(i) interval (ii) nominal (iii) ordinal (iv) ratio

(b) In a career readiness study, 100 students were randomly selected from the psychology program, 150 students were randomly selected from the communications program, and 120 students were randomly selected from the cybersecurity program. This type of sampling is called:

(i) cluster (ii) convenience (iii) systematic (iv) stratified

STAT 200: Introduction to Statistics Final Examination, Fall 2018 OL1/US1 Page 3 of 10

3. The midterm exam scores in a statistics class are shown in the following table:

67 92 76 47 85 70 87 76 67 72 84 85 65 82 84 98 81 85 87 83

(a) Complete the following frequency distribution table using 6 classes: 40-49, 50-59, 60-69, 70-79, 80-89, and 90-99. Express the cumulative relative frequency to two decimal places. (Show all work. Just the answer, without supporting work, will receive no credit.)

| Scores | Frequency | Relative Frequency | Cumulative Relative Frequency |
|---|---|---|---|
| 40 – 49 | | | |
| 50 – 59 | | | |
| 60 – 69 | | | |
| 70 – 79 | | | |
| 80 – 89 | | | |
| 90 – 99 | | | |

(b) What percentage of the midterm exam scores was at least 70?
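
For anyone checking the Question 3 table with technology, the tally can be sketched in Python. This is a minimal sketch, not a required method; the names `scores`, `classes`, and `rows` are illustrative and not part of the exam.

```python
# Hypothetical sketch: tally the Question 3 scores into the six stated classes.
scores = [67, 92, 76, 47, 85, 70, 87, 76, 67, 72,
          84, 85, 65, 82, 84, 98, 81, 85, 87, 83]
classes = [(40, 49), (50, 59), (60, 69), (70, 79), (80, 89), (90, 99)]

n = len(scores)
rows = []
cumulative = 0
for low, high in classes:
    freq = sum(low <= s <= high for s in scores)  # class frequency
    cumulative += freq                            # running total of frequencies
    rows.append((f"{low}-{high}", freq, freq / n, round(cumulative / n, 2)))

for label, freq, rel, cum in rows:
    print(f"{label:>6} {freq:>3} {rel:.2f} {cum:.2f}")

# Share of scores that are at least 70, as a percentage.
pct_at_least_70 = 100 * sum(s >= 70 for s in scores) / n
print(pct_at_least_70)
```

The printed columns are class, frequency, relative frequency, and cumulative relative frequency, in that order.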

4. Answer the following questions based on the midterm exam score data given in Question 3. (Show all work. Just the answer, without supporting work, will receive no credit.)

(a) What is the range of the midterm exam scores? (b) What is the median of the midterm exam scores? (c) What is the mode of the midterm exam scores?
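
One way to verify hand calculations for Question 4 is Python's standard `statistics` module. The sketch below assumes Python 3.8 or later (for `multimode`); the variable names are illustrative.

```python
import statistics

# Midterm exam scores from Question 3.
scores = [67, 92, 76, 47, 85, 70, 87, 76, 67, 72,
          84, 85, 65, 82, 84, 98, 81, 85, 87, 83]

score_range = max(scores) - min(scores)   # range = largest minus smallest
median = statistics.median(scores)        # average of the two middle values when n is even
modes = statistics.multimode(scores)      # list of the most frequent value(s)

print(score_range, median, modes)
```

`multimode` returns a list so that ties for the most frequent value are not hidden.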

5. A STAT 200 professor took a sample of 10 midterm exam scores from a class of 30 students. The 10 scores are shown in the table below:

95 67 76 47 85 70 87 80 67 72

(a) What is the sample mean?

(b) What is the sample standard deviation? (Round your answer to two decimal places.)

(c) If you used technology to get the answers for part (a) and/or part (b), what technology did you use? If an online applet was used, please list the URL and describe the steps. If a calculator or Excel was used, please write out the function.
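
As an illustration of the kind of technology part (c) has in mind, the sample statistics can be computed with Python's standard library. This is a sketch under the assumption that Python is an acceptable tool; `statistics.stdev` uses the n − 1 (sample) divisor.

```python
import statistics

# The 10 sampled midterm scores from Question 5.
sample = [95, 67, 76, 47, 85, 70, 87, 80, 67, 72]

sample_mean = statistics.mean(sample)
sample_sd = statistics.stdev(sample)   # sample standard deviation (divides by n - 1)

print(sample_mean, round(sample_sd, 2))
```

If the population formula were wanted instead, `statistics.pstdev` divides by n.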

6. There are 4 suits (hearts, diamonds, clubs, and spades) in a 52-card deck, and each suit has 13 cards. Suppose your experiment is to draw one card from a deck and observe what suit it is. Express the probability in fraction format. (Show all work. Just the answer, without supporting work, will receive no credit.)

(a) Find the probability of drawing a heart or diamond. (b) Find the probability that the card is not a spade.

7. There are 2 white balls and 8 red balls in an urn. Consider selecting one ball at a time from the urn. What is the probability that the first ball is red and the second ball is also red? Express the probability in fraction format. (Show all work. Just the answer, without supporting work, will receive no credit.)

(a) Assuming the ball selection is with replacement. (b) Assuming the ball selection is without replacement.
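
The difference between the two sampling schemes in Question 7 can be sketched with exact fractions. The code below is illustrative only; it assumes each ball in the urn is equally likely to be drawn.

```python
from fractions import Fraction

white, red = 2, 8
total = white + red

# With replacement: the urn is unchanged, so both draws have the same probability.
with_replacement = Fraction(red, total) * Fraction(red, total)

# Without replacement: after a red ball is removed, 7 red remain among 9 balls.
without_replacement = Fraction(red, total) * Fraction(red - 1, total - 1)

print(with_replacement, without_replacement)
```

`Fraction` reduces automatically, which matches the request to express probabilities in fraction format.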

8. A grocery chain has twenty stores in the Mid-Atlantic region. The regional executive wants to visit five of the twenty stores. She asks her assistant to choose five stores and arrange the visit schedule. (Show all work. Just the answer, without supporting work, will receive no credit.)

(a) Does the order matter in the scheduling? (b) Based on your answer to part (a), should you use permutation or combination to find the different schedules that the assistant may arrange? (c) How many different schedules can the assistant recommend?

9. Mimi has seven books from the Statistics is Fun series. She plans on bringing two of the seven books with her on a road trip. (Show all work. Just the answer, without supporting work, will receive no credit.)

(a) Does the order matter in the book selection? (b) Based on your answer to part (a), should you use permutation or combination to find the number of the different ways the two books can be selected? (c) How many different ways can the two books be selected?
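
Questions 8 and 9 contrast ordered and unordered selection. A minimal sketch of both counts follows, assuming that a visit schedule in Question 8 is an ordered arrangement and that the book selection in Question 9 ignores order; `math.perm` and `math.comb` require Python 3.8 or later.

```python
import math

# Ordered arrangements of 5 stores chosen from 20: nPr = n! / (n - r)!
schedules = math.perm(20, 5)

# Unordered selections of 2 books chosen from 7: nCr = n! / (r! (n - r)!)
selections = math.comb(7, 2)

print(schedules, selections)
```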

10. Let the random variable x represent the number of heads when a fair coin is tossed two times.

(a) Construct a table describing the probability distribution.

| x | P(x) |
|---|---|
| 0 | |
| 1 | |
| 2 | |

(b) Determine the mean and standard deviation of x. Show all work. Just the answer, without supporting work, will receive no credit.
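
The distribution in Question 10 can be checked by enumerating the four equally likely outcomes. This is a sketch, not the expected written work; the names `pmf`, `mean`, and `variance` are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product
from math import sqrt

# All equally likely outcomes of two fair coin tosses: HH, HT, TH, TT.
outcomes = list(product("HT", repeat=2))

# Count how many outcomes give each number of heads, then convert to probabilities.
counts = Counter(outcome.count("H") for outcome in outcomes)
pmf = {x: Fraction(c, len(outcomes)) for x, c in counts.items()}

mean = sum(x * p for x, p in pmf.items())                    # E[x]
variance = sum((x - mean) ** 2 * p for x, p in pmf.items())  # E[(x - mean)^2]
sd = sqrt(variance)

for x in sorted(pmf):
    print(x, pmf[x])
print(mean, variance, round(sd, 4))
```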

11. Mimi plans to make a random guess on each of 10 true-or-false questions. Answer the following questions:

(a) Let X be the number of correct answers Mimi gets. As we know, the distribution of X is a binomial probability distribution. What are the number of trials (n), the probability of success (p), and the probability of failure (q), respectively?

(b) Find the probability that she gets at most 5 correct answers. (Round the answer to 3 decimal places.)

(c) To get the answer for part (b), what technology did you use? If an online applet was used, list the URL and describe the steps. If a calculator or Excel was used, write out the function.
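
A cumulative binomial probability like the one in part (b) can be sketched with the standard library alone. The helper `binom_pmf` is a hypothetical name introduced here, and the sketch assumes each guess is independent with a 0.5 chance of being correct.

```python
from math import comb

n = 10    # number of trials (questions)
p = 0.5   # probability of a correct guess on a true-or-false question
q = 1 - p # probability of an incorrect guess

def binom_pmf(k, n, p):
    """P(X = k) for a binomial random variable with n trials and success probability p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# "At most 5" means X = 0, 1, ..., 5.
p_at_most_5 = sum(binom_pmf(k, n, p) for k in range(0, 6))

print(round(p_at_most_5, 3))
```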

12. The heights of pecan trees are normally distributed with a mean of 10 feet and a standard deviation of 2 feet. Show all work. Just the answer, without supporting work, will receive no credit.

(a) What is the probability that a randomly selected pecan tree is between 9 and 12 feet tall? (Round the answer to 4 decimal places.)

(b) Find the 75th percentile of the pecan tree height distribution. (Round the answer to 2 decimal places.)

(c) To get the answers for part (a) and part (b), what technology did you use? If an online applet was used, list the URL and describe the steps. If a calculator or Excel was used, write out the function.
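
Normal probabilities and percentiles such as those in Question 12 can be sketched with `statistics.NormalDist` (Python 3.8 or later). The variable names are illustrative.

```python
from statistics import NormalDist

# Pecan tree heights: mean 10 feet, standard deviation 2 feet.
heights = NormalDist(mu=10, sigma=2)

# P(9 < height < 12) is the difference of two cumulative probabilities.
p_between = heights.cdf(12) - heights.cdf(9)

# The 75th percentile is the height below which 75% of trees fall.
p75 = heights.inv_cdf(0.75)

print(round(p_between, 4), round(p75, 2))
```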

13. Based on the performance of all individuals who tested between July 1, 2014 and June 30, 2017, the GRE Verbal Reasoning scores are normally distributed with a mean of 150.05 and a standard deviation of 8.43 (https://www.ets.org/s/gre/pdf/gre_guide_table1a.pdf). Show all work. Just the answer, without supporting work, will receive no credit.

(a) Consider all random samples of 36 test scores. What is the standard deviation of the sample means? (Round your answer to three decimal places.)

(b) What is the probability that 36 randomly selected test scores will have a mean test score that is between 150 and 155? (Round your answer to four decimal places.)

(c) To get the answer for part (b), what technology did you use? If an online applet was used, list the URL and describe the steps. If a calculator or Excel was used, write out the function.
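
For Question 13, the sampling distribution of the mean can be sketched as a normal distribution whose standard deviation is σ/√n. This is a minimal illustration using the stated mean and standard deviation; the variable names are assumptions of the sketch.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 150.05, 8.43, 36

# Standard deviation of the sample means (the standard error).
standard_error = sigma / sqrt(n)

# Sample means of size-36 samples are normal with mean mu and this standard error.
sample_mean_dist = NormalDist(mu, standard_error)
p_between = sample_mean_dist.cdf(155) - sample_mean_dist.cdf(150)

print(round(standard_error, 3), round(p_between, 4))
```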

14. A survey showed that 1200 of the 1600 adult respondents believe in global warming.

(a) Construct a 95% confidence interval estimate of the proportion of adults believing in global warming. Show all work. Just the answer, without supporting work, will receive no credit. Include a description of how the confidence interval was constructed.

(b) Describe the confidence interval in everyday language.
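
A confidence interval for a proportion, as in Question 14, is commonly built as p̂ ± z*·√(p̂(1 − p̂)/n). The sketch below assumes this normal-approximation (Wald) interval is the intended method; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

successes, n = 1200, 1600
confidence = 0.95

p_hat = successes / n                                    # sample proportion
z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # critical value, about 1.96
margin = z_star * sqrt(p_hat * (1 - p_hat) / n)          # margin of error

lower, upper = p_hat - margin, p_hat + margin
print(round(lower, 4), round(upper, 4))
```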

15. A city built a new parking garage in a business district. For a random sample of 100 days, daily fees collected averaged $2,000, with a standard deviation of $500.

(a) Construct a 90% confidence interval estimate of the mean daily income this parking garage generates. Show all work. Just the answer, without supporting work, will receive no credit. Include a description of how the confidence interval was constructed.

(b) Describe the confidence interval in everyday language.
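
A confidence interval for a mean, as in Question 15, has the form x̄ ± (critical value)·s/√n. The sketch below assumes a normal (z) critical value, which is a common large-sample shortcut for n = 100; a t critical value with 99 degrees of freedom would give a slightly wider interval.

```python
from math import sqrt
from statistics import NormalDist

x_bar, s, n = 2000, 500, 100
confidence = 0.90

# Assumption: z critical value used in place of t because the sample is large.
z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
margin = z_star * s / sqrt(n)

lower, upper = x_bar - margin, x_bar + margin
print(round(lower, 2), round(upper, 2))
```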

16. A researcher claims the proportion of auto accidents that involve teenage drivers is greater than 10%. ABC Insurance Company checks police records on 200 randomly selected auto accidents and notes that teenagers were at the wheel in 25 of them. Assume the company wants to use a 0.10 significance level to test the researcher’s claim.

(a) What is the appropriate hypothesis test to use for this analysis: one-sample z-test for the population proportion, one-sample t-test for population proportion, one-sample z-test for population mean, or one-sample t-test for population mean? Please identify and explain why it is appropriate.

(b) Identify the null hypothesis and the alternative hypothesis.

(c) Determine the test statistic. Round your answer to two decimal places. Show all work; writing the correct test statistic, without supporting work, will receive no credit.

(d) Determine the P-value for this test. Round your answer to three decimal places. Show all work; writing the correct P-value, without supporting work, will receive no credit.

(e) Compare the P-value and the significance level α. What decision should be made regarding the null hypothesis (e.g., reject or fail to reject) and why?

(f) Is there sufficient evidence to support the researcher’s claim that the proportion of auto accidents that involve teenage drivers is greater than 10%? Explain.
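
The arithmetic behind a one-sample test of a proportion can be sketched as follows. The sketch assumes a right-tailed z-test with the standard error computed from the hypothesized proportion 0.10; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

x, n = 25, 200   # accidents with a teenage driver, total accidents sampled
p0 = 0.10        # proportion stated in the null hypothesis
alpha = 0.10     # significance level

p_hat = x / n
standard_error = sqrt(p0 * (1 - p0) / n)   # uses p0, not p_hat, under the null
z = (p_hat - p0) / standard_error

# Right-tailed test: P-value is the area to the right of z.
p_value = 1 - NormalDist().cdf(z)

print(round(z, 2), round(p_value, 3), p_value <= alpha)
```

The last printed value is `True` only when the P-value is at or below α, which is the condition for rejecting the null hypothesis.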

17. Mimi was curious whether regular exercise really helps weight loss, so she decided to perform a hypothesis test. A random sample of 5 UMUC students was chosen. The students exercised for 30 minutes every day for 6 months. The weight was recorded for each individual before and after the exercise regimen. Does the data below suggest that regular exercise helps weight loss? Assume Mimi wants to use a 0.05 significance level to test the claim.

| Subject | Weight before (pounds) | Weight after (pounds) |
|---|---|---|
| 1 | 190 | 180 |
| 2 | 170 | 160 |
| 3 | 185 | 190 |
| 4 | 160 | 160 |
| 5 | 200 | 190 |

(a) What is the appropriate hypothesis test to use for this analysis: z-test for two proportions, t-test for two proportions, t-test for two dependent samples (matched pairs), or t-test for two independent samples? Please identify and explain why it is appropriate.

(b) Let μ1 = mean weight before the exercise regimen. Let μ2 = mean weight after the exercise regimen. Which of the following statements correctly defines the null hypothesis?

(i) μ1 – μ2 > 0 (μd > 0) (ii) μ1 – μ2 = 0 (μd = 0) (iii) μ1 – μ2 < 0 (μd < 0)

(c) Let μ1 = mean weight before the exercise regimen. Let μ2 = mean weight after the exercise regimen. Which of the following statements correctly defines the alternative hypothesis?

(i) μ1 – μ2 > 0 (μd > 0) (ii) μ1 – μ2 = 0 (μd = 0) (iii) μ1 – μ2 < 0 (μd < 0)

(d) Determine the test statistic. Round your answer to three decimal places. Show all work; writing the correct test statistic, without supporting work, will receive no credit.

(e) Determine the p-value. Round your answer to three decimal places. Show all work; writing the correct p-value, without supporting work, will receive no credit.

(f) Compare the p-value and the significance level α. What decision should be made regarding the null hypothesis (e.g., reject or fail to reject) and why?

(g) Is there sufficient evidence to support the claim that regular exercise helps weight loss? Justify your conclusion.
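
The test statistic for paired data is built from the per-subject differences: t = d̄ / (s_d/√n). The sketch below computes only that statistic, taking each difference as before minus after (an assumption of this sketch); the p-value would additionally require the t distribution with n − 1 degrees of freedom, which Python's standard library does not provide.

```python
from math import sqrt
import statistics

before = [190, 170, 185, 160, 200]
after = [180, 160, 190, 160, 190]

# Per-subject weight change, defined here as before minus after.
differences = [b - a for b, a in zip(before, after)]

n = len(differences)
d_bar = statistics.mean(differences)   # mean difference
s_d = statistics.stdev(differences)    # sample standard deviation of the differences
t_stat = d_bar / (s_d / sqrt(n))
degrees_of_freedom = n - 1

print(differences, d_bar, round(t_stat, 3), degrees_of_freedom)
```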

18. The UMUC Daily News reported that the color distribution for plain M&M’s was: 40% brown, 20% yellow, 20% orange, 10% green, and 10% tan. Each piece of candy in a random sample of 100 plain M&M’s was classified according to color, and the results are listed below. Use a 0.05 significance level to test the claim that the published color distribution is correct. Show all work and justify your answer.

| Color | Brown | Yellow | Orange | Green | Tan |
|---|---|---|---|---|---|
| Number | 42 | 18 | 15 | 7 | 18 |

(a) What is the appropriate hypothesis test: z-test for sample proportion, t-test for sample mean, chi-square goodness of fit test, or F-test for ANOVA? Please identify and explain why it is appropriate for analyzing this data.

(b) Identify the null hypothesis and the alternative hypothesis.

(c) Determine the test statistic. Round your answer to two decimal places. Show all work; writing the correct test statistic, without supporting work, will receive no credit.

(d) Determine the P-value. Round your answer to two decimal places. Show all work; writing the correct P-value, without supporting work, will receive no credit.

(e) Compare the P-value and the significance level α. What decision should be made regarding the null hypothesis (e.g., reject or fail to reject) and why?

(f) Is there sufficient evidence to support the claim that the published color distribution is correct? Justify your answer.
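
A goodness-of-fit statistic compares observed counts with the counts expected under the published distribution: χ² = Σ (O − E)²/E. The sketch below is illustrative; for the p-value it relies on the closed form of the chi-square upper tail for exactly 4 degrees of freedom, e^(−x/2)·(1 + x/2), so it does not generalize to other degrees of freedom.

```python
from math import exp

observed = {"Brown": 42, "Yellow": 18, "Orange": 15, "Green": 7, "Tan": 18}
claimed = {"Brown": 0.40, "Yellow": 0.20, "Orange": 0.20, "Green": 0.10, "Tan": 0.10}

n = sum(observed.values())
expected = {color: n * share for color, share in claimed.items()}

# Chi-square statistic: sum of (observed - expected)^2 / expected over all colors.
chi_square = sum((observed[c] - expected[c]) ** 2 / expected[c] for c in observed)
degrees_of_freedom = len(observed) - 1

# Upper-tail probability; this formula is valid only for 4 degrees of freedom.
p_value = exp(-chi_square / 2) * (1 + chi_square / 2)

print(round(chi_square, 2), degrees_of_freedom, round(p_value, 2))
```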

19. A STAT 200 instructor believes that the average quiz score is a good predictor of final exam score. A random sample of 10 students produced the following data where x is the average quiz score and y is the final exam score.

| x | 80 | 95 | 50 | 60 | 100 | 55 | 85 | 70 | 75 | 85 |
|---|---|---|---|---|---|---|---|---|---|---|
| y | 70 | 96 | 50 | 63 | 96 | 60 | 83 | 60 | 77 | 87 |

(a) Find an equation of the least squares regression line. Round the slope and y-intercept values to two decimal places. Describe the method used to obtain the results. Show all work; writing the correct equation, without supporting work, will receive no credit.

(b) Based on the equation from part (a), what is the predicted final exam score if the average quiz score is 65? Show all work and justify your answer.

(c) Based on the equation from part (a), what is the predicted final exam score if the average quiz score is 40? Show all work and justify your answer.

(d) Which of the predicted final exam scores that you calculated for (b) and (c) do you think is closer to the true final exam score, and why?
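
The least squares line ŷ = a + bx uses b = Sxy/Sxx and a = ȳ − b·x̄. The sketch below computes these directly from the Question 19 data; the helper `predict` is a hypothetical name, and it uses the unrounded coefficients, so results can differ slightly from predictions made with the rounded equation.

```python
import statistics

x = [80, 95, 50, 60, 100, 55, 85, 70, 75, 85]   # average quiz scores
y = [70, 96, 50, 63, 96, 60, 83, 60, 77, 87]    # final exam scores

x_bar, y_bar = statistics.mean(x), statistics.mean(y)

# Sums of squares and cross-products about the means.
s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

slope = s_xy / s_xx
intercept = y_bar - slope * x_bar

def predict(quiz_score):
    """Predicted final exam score from the fitted line (unrounded coefficients)."""
    return intercept + slope * quiz_score

print(round(slope, 2), round(intercept, 2))
print(predict(65), predict(40))
print(min(x), max(x))   # observed range of x, useful when judging a prediction
```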

20. What is the appropriate statistical analysis to use: t-test for two independent samples, t-test for dependent samples, ANOVA, or chi-square test of independence? Please identify and explain why it is appropriate.

(a) A study was conducted to see whether monetary incentives to use less water during times of drought had an effect on water usage. Sixty single-family homeowners were randomly assigned to one of two groups: 1) monetary incentives and 2) no monetary incentives. At the end of three months, the total amount of water usage for each household, in gallons, was measured.

(b) A study was conducted to see whether the mean weight loss is the same for 10 different weight loss programs. Each of the 10 programs had 50 subjects in it. The subjects were followed for 12 months. Weight change for each subject was recorded.