## Hypothesis Testing and Scatter Plots

I need help (step-by-step) with the following questions for my upcoming test. (please see attached). Please explain the process of figuring these questions out. I have an upcoming final and I need to know how to do these problems.

A random sample of eight quarterbacks listed in The Sports Encyclopedia: Pro Football gave the following information.

Let x = height of a quarterback in inches, and let y = weight of a quarterback in pounds.

| x | 75 | 78 | 74 | 73 | 72 | 75 | 76 | 73 |
|---|---|---|---|---|---|---|---|---|
| y | 205 | 230 | 210 | 210 | 195 | 215 | 203 | 196 |

a) Draw a scatter plot.

b) Calculate the correlation coefficient.

c) Make a conclusion based upon the correlation coefficient.

d) Calculate the equation of the regression line and draw the line in your scatter plot.

e) If a quarterback is x = 76 inches tall, what can you predict is the weight of this quarterback?

f) Calculate the coefficient of determination.

g) Interpret the coefficient of determination in terms of your data.

h) Calculate the standard error of the estimate.

i) Interpret the standard error of the estimate in terms of your data.

j) Construct a 95% prediction interval for the weight of the quarterback when the height is 70 inches and interpret the results.
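Parts (b) through (j) can be checked numerically. The following is a minimal sketch using only the Python standard library, not an official solution. The critical value `T_CRIT` (t for a two-sided 95% interval with n − 2 = 6 degrees of freedom) is taken from a standard t-table, and all variable names are illustrative.

```python
import math

heights = [75, 78, 74, 73, 72, 75, 76, 73]          # x, inches
weights = [205, 230, 210, 210, 195, 215, 203, 196]  # y, pounds

n = len(heights)
x_bar = sum(heights) / n
y_bar = sum(weights) / n

# Sums of squares and cross-products about the means
s_xx = sum((x - x_bar) ** 2 for x in heights)
s_yy = sum((y - y_bar) ** 2 for y in weights)
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(heights, weights))

r = s_xy / math.sqrt(s_xx * s_yy)    # (b) correlation coefficient
slope = s_xy / s_xx                  # (d) regression slope
intercept = y_bar - slope * x_bar    # (d) regression intercept
r_squared = r ** 2                   # (f) coefficient of determination
sse = s_yy - slope * s_xy            # residual sum of squares
std_error = math.sqrt(sse / (n - 2)) # (h) standard error of the estimate


def predict(x):
    """Predicted weight (lb) for a quarterback of height x (in)."""
    return intercept + slope * x


# (j) 95% prediction interval at x = 70; T_CRIT is a t-table lookup (df = 6)
T_CRIT = 2.447
x0 = 70
margin = T_CRIT * std_error * math.sqrt(1 + 1 / n + (x0 - x_bar) ** 2 / s_xx)
interval = (predict(x0) - margin, predict(x0) + margin)

print(f"r = {r:.4f}, r^2 = {r_squared:.4f}")
print(f"y-hat = {intercept:.3f} + {slope:.4f} x")
print(f"predicted weight at 76 in: {predict(76):.1f} lb")
print(f"standard error of estimate: {std_error:.3f}")
print(f"95% prediction interval at 70 in: ({interval[0]:.1f}, {interval[1]:.1f})")
```

Note that x = 70 lies below every height in the sample (72 to 78 inches), so the interval in part (j) is an extrapolation and should be interpreted with caution.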

## Predictors of Salary

Suppose you went to the Bureau of Labor Statistics web site and found several predictors of your salary. State three possible predictors and a simulated multilinear regression equation along with their p-values. What do your p-values tell you about your predictors?

## Hypothesis Testing on Proportion: Eagle Outfitters Example

Eagle Outfitters is a chain of stores specializing in outdoor apparel and camping gear. It is considering a promotion that involves mailing discount coupons to all its credit card customers. This promotion will be considered a success if more than 10% of those receiving the coupons use them. Before going national with the promotion, coupons were sent to a sample of 100 credit card customers.

a. Develop hypotheses that can be used to test whether the population proportion of those who will use the coupons is sufficient to go national.

b. Use α = 0.05 to conduct your hypothesis test. Should Eagle go national with the promotion?

## Hypothesis Testing for Mean Difference

A market research firm used a sample of individuals to rate the purchase potential of a particular product before and after the individuals saw a new television commercial about the product. The purchase potential ratings were based on a 0 to 10 scale, with higher values indicating a higher purchase potential. The null hypothesis stated that the mean rating “after” would be less than or equal to the mean rating “before.” Rejection of this hypothesis would show that the commercial improved the mean purchase potential rating. Use α = .05 and the following data to test the hypothesis and comment on the value of the commercial.

| Individual | Purchase Rating After | Purchase Rating Before |
|---|---|---|
| 1 | 6 | 5 |
| 2 | 6 | 4 |
| 3 | 7 | 7 |
| 4 | 4 | 3 |
| 5 | 3 | 5 |
| 6 | 9 | 8 |
| 7 | 7 | 5 |
| 8 | 6 | 6 |
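Because each individual is rated twice, this is a matched-pairs design, and the test is a one-sample t-test on the differences d = after − before with H0: μd ≤ 0 and Ha: μd > 0. A minimal sketch follows; the critical value `T_CRIT` (upper-tail t at α = .05 with 7 degrees of freedom) is a t-table lookup, and the variable names are illustrative.

```python
import math
import statistics

after = [6, 6, 7, 4, 3, 9, 7, 6]
before = [5, 4, 7, 3, 5, 8, 5, 6]

# Work with the per-individual differences (after minus before)
diffs = [a - b for a, b in zip(after, before)]
n = len(diffs)

d_bar = statistics.mean(diffs)   # mean difference
s_d = statistics.stdev(diffs)    # sample standard deviation of the differences
t_stat = d_bar / (s_d / math.sqrt(n))

T_CRIT = 1.895                   # t_{0.05, 7} from a t-table (upper tail)
reject_h0 = t_stat > T_CRIT

print(f"d-bar = {d_bar:.3f}, s_d = {s_d:.3f}, t = {t_stat:.3f}")
print("Reject H0" if reject_h0 else "Do not reject H0")
```

With these data the test statistic falls below the critical value, so the sample does not show at α = .05 that the commercial improved the mean rating.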

## Test a claim and STATDISK

Assume that head injury measurements are not affected by an interaction between type of car (foreign, domestic) and size of car (small, medium, large). Is there sufficient evidence to support the claim that size of the car has an effect on head injury measurements?

Use display 1 which results from the head injury measurements from car crash dummies. The measurements are in hic (head injury criterion) units, and they are from the same cars used in the display 2. Use a 0.05 significance level to test the given claim.

DISPLAY 1

See attached

DISPLAY 2

See attached

(2)

Assume that self-esteem measurements are not affected by an interaction between subject self-esteem and target self-esteem. Is there sufficient evidence to support the claim that the category of the target (low, high) has an effect on measures of self-esteem?

Use the STATDISK display, which results from measures of self-esteem listed in the table below. The data is from Richard Lowry and is based on a student project at Vassar College supervised by Jannay Morrow. The objective of the project was to study how levels of self-esteem in subjects related to their perceived self-esteem in other target people who were described in writing. Self-esteem levels were measured using the Coopersmith Self-Esteem Inventory, and the test here works well even though the data is at the ordinal level of measurement. Use a 0.05 significance level to test the given claim.

Subject’s Self-Esteem

## Webcredible

Webcredible, a UK-based consulting firm specializing in websites, intranets, mobile devices, and applications, conducted a survey of 1,132 mobile phone users between February and April 2009. The survey found that 52% of mobile phone users are now using the mobile Internet. (Data extracted from “Email and Social Networking Most Popular Mobile Internet Activities”). The authors of the article imply that the survey proves that more than half of all mobile phone users are now using the mobile Internet.

a. Use the five-step p-value approach to hypothesis testing and a 0.05 level of significance to try to prove that more than half of all mobile phone users are now using the mobile Internet.

b. Based on your result in (a), is the claim implied by the authors valid?

c. Suppose the survey found that 53% of mobile phone users are now using the mobile Internet. Repeat parts (a) and (b).

d. Compare the results of (b) and (c).
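Parts (a) through (d) amount to an upper-tailed one-proportion z-test of H0: π ≤ 0.50 against H1: π > 0.50, run once with a sample proportion of 0.52 and once with 0.53. The sketch below uses the sample proportions exactly as stated in the problem. The helper name `upper_tail_proportion_test` is hypothetical.

```python
import math
from statistics import NormalDist


def upper_tail_proportion_test(p_hat, n, p0=0.5):
    """Return (z, p-value) for H0: pi <= p0 versus H1: pi > p0."""
    std_err = math.sqrt(p0 * (1 - p0) / n)  # standard error under H0
    z = (p_hat - p0) / std_err
    p_value = 1 - NormalDist().cdf(z)       # area in the upper tail
    return z, p_value


N_USERS = 1132
ALPHA = 0.05

z_52, p_52 = upper_tail_proportion_test(0.52, N_USERS)
z_53, p_53 = upper_tail_proportion_test(0.53, N_USERS)

for label, z, p in [("52%", z_52, p_52), ("53%", z_53, p_53)]:
    decision = "reject H0" if p < ALPHA else "do not reject H0"
    print(f"{label}: z = {z:.3f}, p-value = {p:.4f} -> {decision}")
```

The two runs lead to opposite decisions at the 0.05 level, which is the comparison part (d) asks for: with a sample this large, one percentage point in the sample proportion changes the conclusion.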

## Average salary of blue-collar workers

While negotiating a labor contract, the president of a company argues that the mean annual earnings of blue-collar workers is less than $56,000. The labor union argues that the salary is more than $56,000. A random sample of the annual earnings of 350 blue-collar workers is taken and contained in the file LaborDispute.xls.

a. Assume the population standard deviation is $22,000 and use it to test the following hypothesis at a .01 level of significance:

H0: μ ≤ $56,000

Ha: μ > $56,000

Do the test results support the company president or the union?

b. Repeat the above hypothesis test. However, this time use the sample standard deviation in place of the population standard deviation.

c. Does replacing the population standard deviation with the sample standard deviation affect the outcome of the hypothesis test?

## Hypothesis Testing for a Single Mean: Fat Content of Hamburgers

1. The fat content was determined for 40 hamburgers sold by a fast food restaurant. The results in grams appear below.

20.0 18.7 21.6 20.9 21.8 20.2 19.7 18.9 19.5 21.5

19.3 21.2 18.4 21.0 21.6 20.6 20.7 21.9 20.1 17.1

18.1 21.1 19.3 21.5 20.1 16.5 18.9 17.4 20.8 18.5

21.6 23.1 20.5 22.0 20.6 17.5 16.1 20.1 21.8 19.4

a. Find the sample mean and standard deviation for this data set. Put these values in the space below. Please be sure to label which value is which statistic.

b. In order to perform this hypothesis test, what assumption(s) must be met?

c. Are the assumption(s) met? Please provide evidence if necessary. If you create a plot, please make sure to include it below or attach it to the back of this packet.

d. The fast food restaurant claims that their hamburgers have 19.35 grams of fat or less. A consumer group claims that in fact, their fat content is more than 19.35 grams. What are the null and alternate hypotheses for this test? Why?

e. Calculate the test statistic or the p-value. Please be sure to show some work.

f. What is the critical value for this hypothesis test using the significance level associated with a 90% confidence interval?

g. State the decision for this test.

h. What conclusions can you draw about the fat content of the hamburgers?

2. Using the fat content data above, do the following.

a. Create a 90% confidence interval. Please be sure to show your work.

b. Interpret this interval.

c. Does this interval agree with your conclusions from number 1 above? Explain/support your answer.

3. Should the results from hypothesis tests always agree with the confidence interval at the same confidence level? Why or why not?
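The numerical parts of questions 1 and 2 can be sketched as follows, assuming H0: μ ≤ 19.35 and Ha: μ > 19.35 with α = 0.10. Both critical values are t-table lookups for 39 degrees of freedom: `T_CRIT_ONE_SIDED` for the upper-tailed test and `T_CRIT_TWO_SIDED` for the two-sided 90% interval. The variable names are illustrative.

```python
import math
import statistics

fat = [
    20.0, 18.7, 21.6, 20.9, 21.8, 20.2, 19.7, 18.9, 19.5, 21.5,
    19.3, 21.2, 18.4, 21.0, 21.6, 20.6, 20.7, 21.9, 20.1, 17.1,
    18.1, 21.1, 19.3, 21.5, 20.1, 16.5, 18.9, 17.4, 20.8, 18.5,
    21.6, 23.1, 20.5, 22.0, 20.6, 17.5, 16.1, 20.1, 21.8, 19.4,
]

n = len(fat)
mean = statistics.mean(fat)   # 1a: sample mean
sd = statistics.stdev(fat)    # 1a: sample standard deviation
std_err = sd / math.sqrt(n)

MU_0 = 19.35
t_stat = (mean - MU_0) / std_err  # 1e: test statistic

T_CRIT_ONE_SIDED = 1.304  # t_{0.10, 39}, upper-tailed test at alpha = 0.10
T_CRIT_TWO_SIDED = 1.685  # t_{0.05, 39}, two-sided 90% confidence interval

reject_h0 = t_stat > T_CRIT_ONE_SIDED
ci = (mean - T_CRIT_TWO_SIDED * std_err, mean + T_CRIT_TWO_SIDED * std_err)

print(f"mean = {mean:.3f} g, sd = {sd:.3f} g, t = {t_stat:.3f}")
print("Reject H0" if reject_h0 else "Do not reject H0")
print(f"90% CI for the mean: ({ci[0]:.2f}, {ci[1]:.2f})")
```

One point relevant to question 3: a two-sided 90% interval leaves 5% in each tail, so it corresponds to a one-sided test at α = 0.05 rather than α = 0.10. Agreement between the test and the interval depends on matching the tails as well as the confidence level.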

## Statistics Practice Problems

1) The health of employees is monitored by periodically weighing them in. A sample of 54 employees has a mean weight of 183.9 lb. Assuming that σ is known to be 121.2 lb, use a 0.10 significance level to test the claim that the population mean of all such employees’ weights is less than 200 lb. Identify the null hypothesis, alternative hypothesis, test statistic, P-value, conclusion about the null hypothesis, and final conclusion that addresses the original claim.

2) A researcher wants to check the claim that convicted burglars spend an average of 18.7 months in jail. She takes a random sample of 11 such cases from court files and finds that the sample mean = 20.5 months and s = 7.9 months. Test the null hypothesis that μ = 18.7 at the 0.05 significance level. Test the given claim using the traditional method of hypothesis testing. Assume that the sample has been randomly selected from a population with a normal distribution. Identify the null hypothesis, alternative hypothesis, test statistic, P-value, conclusion about the null hypothesis, and final conclusion that addresses the original claim.

3) A researcher wishes to determine whether people with high blood pressure can reduce their blood pressure by following a particular diet. Use the sample data below to test the claim that the treatment population’s mean μ1 is smaller than the control population’s mean μ2. Test the claim using a significance level of 0.01. Test the indicated claim about the means of two populations. Assume that the two samples are independent and that they have been randomly selected. Identify the null hypothesis, alternative hypothesis, test statistic, P-value, conclusion about the null hypothesis, and final conclusion that addresses the original claim.

(see attached file for data)

4) Use the sample data above to construct a 99% confidence interval for μ1 – μ2 where μ1 and μ2 represent the mean for the treatment group and the control group respectively. Interpret the confidence interval in the context of the study described in problem 3 above.

5) Five students took a math test before and after tutoring. Their scores were as follows.

Using a 0.01 level of significance, test the claim that the tutoring improves the math scores. Use the traditional method of hypothesis testing to test the given claim about the means of two populations. Assume that two dependent samples have been randomly selected from normally distributed populations. Identify the null hypothesis, alternative hypothesis, test statistic, P-value, conclusion about the null hypothesis, and final conclusion that addresses the original claim.
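Problem 1 is the only one in this set whose data are fully stated in the text, so it serves as a worked case. Since σ is given, the test is a left-tailed z-test with H0: μ = 200 and H1: μ < 200. A minimal sketch with illustrative variable names:

```python
import math
from statistics import NormalDist

n = 54          # sample size
x_bar = 183.9   # sample mean weight, lb
sigma = 121.2   # population standard deviation, lb (given as known)
mu_0 = 200.0    # claimed upper bound on the mean, lb
alpha = 0.10

z = (x_bar - mu_0) / (sigma / math.sqrt(n))
p_value = NormalDist().cdf(z)   # left-tailed test: area below z
reject_h0 = p_value < alpha

print(f"z = {z:.3f}, P-value = {p_value:.4f}")
print("Reject H0" if reject_h0 else "Fail to reject H0")
```

Here the P-value exceeds 0.10, so the sample does not provide sufficient evidence to support the claim that the mean weight is less than 200 lb.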

## Hypothesis Testing for Two Samples

I need help solving the following problems:

In this problem set you will get some practice performing hypothesis tests for two samples. If you use Statdisk to perform any portion of these analyses, please include the results, label them, and refer to them accordingly in your interpretations. Good luck and enjoy!

1. Ten infants are involved in a study to compare the effectiveness of two medications for the treatment of diaper rash. For each baby, two areas of approximately the same size and rash severity were selected and one area was treated with medication A and the other with medication B. The number of hours for the rash to disappear was recorded for each medication and each infant. The times appear below. The goal of the study is to determine whether there is enough evidence to conclude that a significant difference exists in the average time required to eliminate the rash using both hypothesis testing and a confidence interval.

A 46 50 46 51 43 45 47 48 46 48

B 43 49 48 47 40 40 47 48 46 48

1. What are the null and alternate hypotheses for this test? Why?

2. What is the critical value for this hypothesis test using a 1% significance level?

3. Calculate the test statistic and the p-value using a 1% significance level.

4. State the decision for this test.

5. Determine the confidence interval level that would be associated with this hypothesis test at the 1% significance level. Explain why the confidence interval level is appropriate for this hypothesis test at the 1% significance level.

6. Create a confidence interval at the confidence level associated with the hypothesis test above.

7. Interpret this confidence interval.

8. Is there sufficient evidence to conclude that a difference exists in the average time required to eliminate the rash at the 1% significance level? Explain this using the results from both the confidence interval and the hypothesis test.
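Both medications were applied to the same infant, so the samples are dependent and the analysis runs on the differences d = A − B, with H0: μd = 0 and Ha: μd ≠ 0. The sketch below focuses on the link between the two-sided test at α = 0.01 and the 99% confidence interval. The critical value `T_CRIT` (df = 9) is a t-table lookup, and the variable names are illustrative.

```python
import math
import statistics

med_a = [46, 50, 46, 51, 43, 45, 47, 48, 46, 48]  # hours, medication A
med_b = [43, 49, 48, 47, 40, 40, 47, 48, 46, 48]  # hours, medication B

diffs = [a - b for a, b in zip(med_a, med_b)]
n = len(diffs)
d_bar = statistics.mean(diffs)
std_err = statistics.stdev(diffs) / math.sqrt(n)

t_stat = d_bar / std_err
T_CRIT = 3.250  # t_{0.005, 9} from a t-table (two-sided, alpha = 0.01)

reject_h0 = abs(t_stat) > T_CRIT
ci = (d_bar - T_CRIT * std_err, d_bar + T_CRIT * std_err)

# A two-sided test at alpha rejects H0 exactly when the (1 - alpha)
# confidence interval for the mean difference excludes zero.
zero_in_ci = ci[0] <= 0 <= ci[1]

print(f"d-bar = {d_bar:.2f}, t = {t_stat:.3f}, reject H0: {reject_h0}")
print(f"99% CI for mean difference: ({ci[0]:.2f}, {ci[1]:.2f})")
```

Because the same critical value drives both calculations, the interval and the test necessarily agree: zero lies inside the 99% interval, and the test does not reject H0 at the 1% level.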

## Some Applications of Chi-Square Test

I need to know which equation to use and detailed steps to answer the following question by hand.

The following data were collected in a clinical trial evaluating a new compound designed to improve wound healing in trauma patients. The new compound is compared against a placebo. After treatment for 5 days with the new compound or placebo the extent of wound healing is measured and the data are shown below.

Percent Wound Healing

| Treatment | 0-25% | 26-50% | 51-75% | 76-100% |
|---|---|---|---|---|
| New Compound (n=125) | 15 | 37 | 32 | 41 |
| Placebo (n=125) | 36 | 45 | 34 | 10 |

Is there a difference in the extent of wound healing by treatment? (Hint: Are treatment and the percent wound healing independent?) Run the appropriate test at a 5% level of significance.

df= _____

alpha= ________

Reject H0 if χ2> ________

Z =

Compute expected frequencies

Expected Frequency = (Row Total * Column Total)/N.

Complete the expected frequencies for each cell.

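Percent Wound Healing (observed counts, with blanks for the expected frequencies)

| Treatment | 0-25% | 26-50% | 51-75% | 76-100% | Total |
|---|---|---|---|---|---|
| New Compound | 15 ( ) | 37 ( ) | 32 ( ) | 41 ( ) | 125 |
| Placebo | 36 ( ) | 45 ( ) | 34 ( ) | 10 ( ) | 125 |
| Total | 51 | 82 | 66 | 51 | 250 |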

Compute χ2.

χ2= ________

Based on comparing the computed χ2 to the rejection level χ2 which of the following is (are) true?

A. We have statistically significant evidence at alpha=0.05 to show that H0 is false.

B. Treatment and percent wound healing are not independent.

C. Treatment and percent wound healing are independent.

D. A and B.
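The hand calculation can be checked with a short script. This is a sketch of the chi-square test of independence using the expected-frequency formula given above. The critical value `CHI_SQ_CRIT` (α = 0.05, df = 3) is a chi-square table lookup, and the variable names are illustrative.

```python
observed = {
    "New Compound": [15, 37, 32, 41],
    "Placebo": [36, 45, 34, 10],
}

row_totals = {name: sum(counts) for name, counts in observed.items()}
col_totals = [sum(col) for col in zip(*observed.values())]
grand_total = sum(col_totals)

# Expected frequency = (row total * column total) / N
expected = {
    name: [row_totals[name] * c / grand_total for c in col_totals]
    for name in observed
}

# Chi-square statistic: sum of (O - E)^2 / E over all cells
chi_sq = sum(
    (o - e) ** 2 / e
    for name in observed
    for o, e in zip(observed[name], expected[name])
)

df = (len(observed) - 1) * (len(col_totals) - 1)  # (rows - 1)(columns - 1)
CHI_SQ_CRIT = 7.815  # chi-square table value for alpha = 0.05, df = 3

print("expected:", expected)
print(f"df = {df}, chi-square = {chi_sq:.2f}")
print("Reject H0" if chi_sq > CHI_SQ_CRIT else "Do not reject H0")
```

Since the two treatment groups are the same size, each column's expected count is simply half its column total, which is a quick sanity check on the table.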

## Various Hypothesis Testing Problems

Diabetes

The insulin pump is a device that delivers insulin to a diabetic patient at regular intervals. It presumably regulates insulin better than standard injections. However, data to establish this point are few, especially in children. The following study was set up to assess the effect of use of the insulin pump on HgbA1c, which is a long-term marker of compliance with insulin protocols. In general, a normal range for HgbA1c is < 7%. Data were collected on 256 diabetic patients for 1 year before and after using the insulin pump. A subset of the data for 10 diabetic patients is given in Table 8.40 (see attachment).

8.163 What test can be used to compare the mean HgbA1c 1 year before vs. mean HgbA1c 1 year after use of the insulin pump?

8.164 Perform the test in Problem 8.163, and report a two-tailed p-value.

8.165 Provide a 95% CI for the mean difference in HgbA1c before minus the mean HgbA1c after use of the insulin pump.

Cardiology

Example 31: Cardiovascular Disease. The Physicians’ Health Study was a randomized clinical trial, one goal of which was to assess the effect of aspirin in preventing myocardial infarction (MI). Participants were 22,000 male physicians ages 40-84 and free of cardiovascular disease in 1982. The physicians were randomized to either active aspirin (one white pill containing 325 mg of aspirin taken every other day) or aspirin placebo (one white placebo pill taken every other day). As the study progressed, it was estimated from self-report that 10% of the participants in the aspirin group were not complying (that is, were not taking their study [aspirin] capsules). Thus the dropout rate was 10%. Also, it was estimated from self-report that 5% of the participants in the placebo group were taking aspirin regularly on their own outside the study protocol. Thus the drop-in rate was 5%. The issue is: How does this lack of compliance affect the sample size and power estimates for the study?

(No need to answer anything here. Ex 31 is provided because Ex 32 refers to it)

Example 32: Refer to Example 10.31. Suppose we assume that the incidence of MI is .005 per year among participants who actually take placebo and that aspirin prevents 20% of MIs (i.e., relative risk = p1/p2 = 0.8). We also assume that the duration of the study is 5 years and that the dropout rate in the aspirin group = 10% and the drop-in rate in the placebo group = 5%. How many participants need to be enrolled in each group to achieve 80% power using a two-sided test with significance level = .05?

(No need to answer anything here. Ex 32 is provided because problem 10.1 refers to it)

Consider the Physicians’ Health Study data presented in Example 10.32.

10.1 How many participants need to be enrolled in each group to have a 90% chance of detecting a significant difference using a two-sided test with α = .05 if compliance is perfect?
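One way to approach Problem 10.1 is the standard large-sample formula for comparing two binomial proportions with equal group sizes. The sketch below rests on two assumptions that are not stated in the problem: the 5-year placebo incidence is approximated as 5 × .005 = .025, and the pooled-variance form of the sample-size formula is the one intended. The function name `n_per_group` is hypothetical.

```python
import math
from statistics import NormalDist


def n_per_group(p1, p2, alpha=0.05, power=0.90):
    """Per-group sample size for a two-sided test of p1 versus p2 (equal groups)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    p_bar = (p1 + p2) / 2            # pooled proportion under H0
    q_bar = 1 - p_bar
    numerator = (
        z_alpha * math.sqrt(2 * p_bar * q_bar)
        + z_beta * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    ) ** 2
    return math.ceil(numerator / (p1 - p2) ** 2)


# Assumption: 5-year incidence is approximated by 5 times the annual incidence
p_placebo = 5 * 0.005
p_aspirin = 0.8 * p_placebo   # relative risk of 0.8 with perfect compliance

n_needed = n_per_group(p_aspirin, p_placebo)
print(f"participants needed per group: {n_needed}")
```

Rounding the normal quantiles to 1.96 and 1.28, as printed tables do, gives a slightly smaller figure than the unrounded quantiles used here, so a hand calculation may differ by a few dozen participants.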

Pulmonary Disease

One important aspect of medical diagnosis is its reproducibility. Suppose that two different doctors examine 100 patients for dyspnea in a respiratory disease clinic and that doctor A diagnosed 15 patients as having dyspnea, doctor B diagnosed 10 patients as having dyspnea, and both doctor A and doctor B diagnosed 7 patients as having dyspnea.

10.16 Compute the Kappa statistic and its standard error regarding reproducibility of the diagnosis of dyspnea in this clinic.
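For Problem 10.16, the counts imply a 2×2 agreement table: 7 patients positive by both doctors, 8 by A only, 3 by B only, and 82 by neither. The sketch below computes Cohen's kappa. The standard error shown is the large-sample formula under the null hypothesis of chance agreement; whether this is the formula the textbook intends is an assumption. Variable names are illustrative.

```python
import math

n = 100
a_pos, b_pos, both_pos = 15, 10, 7
both_neg = n - a_pos - b_pos + both_pos   # patients neither doctor diagnosed

p_obs = (both_pos + both_neg) / n         # observed agreement

# Marginal proportions for each category (dyspnea, no dyspnea)
a_marg = [a_pos / n, 1 - a_pos / n]
b_marg = [b_pos / n, 1 - b_pos / n]
p_exp = sum(a * b for a, b in zip(a_marg, b_marg))  # agreement expected by chance

kappa = (p_obs - p_exp) / (1 - p_exp)

# Large-sample standard error of kappa under H0: kappa = 0 (assumed formula)
correction = sum(a * b * (a + b) for a, b in zip(a_marg, b_marg))
se_kappa = math.sqrt((p_exp + p_exp ** 2 - correction) / (n * (1 - p_exp) ** 2))

print(f"p_obs = {p_obs:.2f}, p_exp = {p_exp:.2f}")
print(f"kappa = {kappa:.3f}, se = {se_kappa:.4f}")
```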

Hepatic Disease

Refer to Data Set HORMONE. DAT, provided as an attachment.

10.37 What test procedure can be used to compare the percentage of hens whose pancreatic secretions increased (post − pre) among the five treatment regimens?

10.38 Implement the test procedure in Problem 10.37, and report a p-value.

Cancer

A topic of current interest is whether abortion is a risk factor for breast cancer. One issue is whether women who have had abortions are comparable to women who have not had abortions in terms of other breast cancer risk factors. One of the best-known breast cancer risk factors is parity (i.e., number of children), with parous women with many children having about a 30% lower risk of breast cancer than nulliparous women (i.e., women with no children). Hence it is important to assess whether the parity distribution of women with and without previous abortions is comparable. The data in Table 10.32 (see attachment) were obtained from the Nurses’ Health Study on this issue.

10.61 What test can be performed to compare the parity distribution of women with and without induced abortions?

10.62 Implement the test in Problem 10.61, and report a two-tailed p-value.

Ophthalmology

A 5-year study among 601 participants with retinitis pigmentosa assessed the effects of high-dose vitamin A (15,000 IU per day) and vitamin E (400 IU per day) on the course of their disease. One issue is to what extent supplementation with vitamin A affected their serum-retinol levels. The serum-retinol data in Table 10.33 (see attachment) were obtained over 3 years of follow-up among 73 males taking 15,000 IU/day of vitamin A (vitamin A group) and among 57 males taking 75 IU/day of vitamin A (the trace group; this is a negligible amount compared with usual dietary intake of 3000 IU/day).

10.64 What test can be used to assess whether mean serum retinol has increased over 3 years among subjects in the vitamin A group?

One interesting aspect of the study described in Problem 10.64 is to assess changes in other parameters as a result of supplementation with vitamin A. One quantity of interest is the level of serum triglycerides. Researchers found that among 133 participants in the vitamin A group (males and females combined) who were in the normal range at baseline (< 2.13 µmol/L), 15 were above the upper limit of normal at each of their last 2 consecutive study visits. Similarly, among 138 participants in the trace group who were in the normal range at baseline (< 2.13 µmol/L), 2 were above the upper limit of normal at each of their last two consecutive study visits.

10.68 What test can be performed to compare the percentage of participants who developed abnormal triglyceride levels between the vitamin A group and the trace group?

10.69 Implement the test in Problem 10.68, and report a two-tailed p-value.

Obstetrics

The standard screening test for Down’s syndrome is based on a combination of maternal age and the level of serum alpha- fetoprotein. Using this test, 80% of Down’s syndrome cases can be identified, while 5% of normals are detected as positive.

10.121 What are the sensitivity and specificity of the test? Suppose that 1 out of 500 infants are born with Down’s syndrome.
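Sensitivity and specificity follow directly from the stated detection rates. The stated prevalence of 1 in 500 suggests a follow-up about predictive value, so the sketch below also applies Bayes' rule to get the positive predictive value. That extra step is an assumption about where the problem is heading, not something asked in 10.121 itself.

```python
sensitivity = 0.80          # P(test positive | Down's syndrome)
false_positive_rate = 0.05  # P(test positive | normal)
specificity = 1 - false_positive_rate

prevalence = 1 / 500        # P(Down's syndrome)

# Bayes' rule: P(Down's syndrome | test positive)
true_pos = sensitivity * prevalence
false_pos = false_positive_rate * (1 - prevalence)
ppv = true_pos / (true_pos + false_pos)

print(f"sensitivity = {sensitivity:.2f}, specificity = {specificity:.2f}")
print(f"positive predictive value = {ppv:.4f}")
```

With a condition this rare, only about 3% of positive screens correspond to true cases even though the test detects 80% of them.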

## Testing the Null Hypothesis and Interpreting the P-Value

Southside Hospital in Bay Shore, New York, commonly conducts stress tests to study the heart muscle after a person has a heart attack. Members of the diagnostic imaging department conducted a quality improvement project with the objective of reducing the turnaround time for stress tests. Turnaround time is defined as the time from when a test is ordered to when the radiologist signs off on the test results. Initially, the mean turnaround time for a stress test was 68 hours. After incorporating changes into the stress-test process, the quality improvement team collected a sample of 50 turnaround times. In this sample, the mean turnaround time was 32 hours, with a standard deviation of 9 hours.

a. If you test the null hypothesis at the 0.01 level of significance, is there evidence that the new process has reduced turnaround time?

b. Interpret the meaning of the p-value in this problem.
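Only summary statistics are given, so the test statistic can be computed directly from them. This sketch assumes H0: μ ≥ 68 and H1: μ < 68 and uses a lower-tailed t-test with 49 degrees of freedom. The critical value `T_CRIT` is a t-table lookup, and the variable names are illustrative.

```python
import math

n = 50        # sampled turnaround times
x_bar = 32.0  # sample mean, hours
s = 9.0       # sample standard deviation, hours
mu_0 = 68.0   # mean turnaround time before the changes, hours

t_stat = (x_bar - mu_0) / (s / math.sqrt(n))

T_CRIT = -2.405  # lower-tail t_{0.01, 49} from a t-table
reject_h0 = t_stat < T_CRIT

print(f"t = {t_stat:.2f}, critical value = {T_CRIT}")
print("Reject H0" if reject_h0 else "Do not reject H0")
```

A statistic this far below the critical value corresponds to a p-value that is essentially zero. For part (b), that means a sample mean of 32 hours or less would be extremely unlikely if the true mean were still 68 hours.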

## Designing a Hypothesis Test

How do you design a pilot test, and what would it look like once finished? The situation I have involves a local clinical practice for individuals who show signs of depression. The doctor has noticed that some of her clients do better when they participate in group interventions, and others make more progress when she sees them individually. She has developed a 20-minute intake interview that can be conducted by a graduate assistant or caseworker in her office. The purpose of this interview is to notice and identify the type of treatment (group or individual) that is likely to work best for each client.

Do I need research software such as ANOVA or something of the sort? Do I run those numbers and develop the pilot test?

## Six Sigma Tools for Testing Statistical Significance

“Suppose you want to know if a new design of a product is actually better than the current product. For example, your design team is working on increasing the speed of the KX Speed Drill. You have produced a small batch of the new design, the KX2, and you want to know if this speed is faster than the current speed of 17.5 revs per second.

To test the new design, QA has taken a sample of 13 KX2’s and clocked their speed. The results of this test are shown below.

Is it safe to conclude that the new design for the KX2 has a significantly higher speed with a confidence of 1%?”

DATA

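| Unit | Speed (revs per second) |
|---|---|
| 1 | 18.4 |
| 2 | 19.3 |
| 3 | 20.5 |
| 4 | 18.6 |
| 5 | 17.8 |
| 6 | 21.0 |
| 7 | 19.4 |
| 8 | 19.2 |
| 9 | 18.9 |
| 10 | 15.9 |
| 11 | 17.7 |
| 12 | 20.5 |
| 13 | 19.1 |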