## Baby Birth Weights and Mother Cocaine Use Study

A random sample of the birth weights of 186 babies has a mean of 3103 g and a standard deviation of 696 g (based on data from “Cognitive Outcomes of Preschool Children with Prenatal Cocaine Exposure,” by Singer et al., Journal of the American Medical Association, Vol. 291, No. 20). These babies were born to mothers who did not use cocaine during their pregnancies. Further, a random sample of the birth weights of 190 babies born to mothers who used cocaine during their pregnancies has a mean of 2700 g and a standard deviation of 645 g. Does cocaine use appear to affect the birth weight of a baby? Substantiate your conclusion.
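One way to approach this comparison is a two-sample t statistic computed directly from the summary figures. The sketch below assumes Welch's unequal-variance form and, because both samples are large, uses the standard normal tail as an approximation to the p-value. The function name `welch_t_from_summary` is illustrative, not part of the problem.

```python
import math
from statistics import NormalDist


def welch_t_from_summary(mean1, sd1, n1, mean2, sd2, n2):
    """Return (t, df) for Welch's unequal-variance two-sample t-test."""
    v1 = sd1 ** 2 / n1          # squared standard error of the first mean
    v2 = sd2 ** 2 / n2          # squared standard error of the second mean
    t = (mean1 - mean2) / math.sqrt(v1 + v2)
    # Welch-Satterthwaite approximation to the degrees of freedom
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


# No-cocaine group first, cocaine group second (grams)
t_stat, df = welch_t_from_summary(3103, 696, 186, 2700, 645, 190)

# Assumption: with df in the hundreds, the t distribution is close to the
# standard normal, so the normal tail approximates the two-sided p-value.
p_two_sided = 2 * (1 - NormalDist().cdf(abs(t_stat)))

print(f"t = {t_stat:.2f}, df = {df:.0f}, approx. two-sided p = {p_two_sided:.1e}")
```

A statistic this far from zero would point toward a difference in mean birth weight, though the written conclusion should still state the hypotheses and the significance level chosen.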

## Standard Deviation, Average Number & P-Value Questions

1) The club professional at a difficult public course boasts that his course is so tough that the average golfer loses a dozen or more golf balls during a round of golf. A dubious golfer sets out to show that the pro is fibbing. He asks a random sample of 15 golfers who just completed their rounds to report the number of golf balls each lost. Assuming that the number of golf balls lost is normally distributed with a standard deviation of 3, can we infer at the 10% significance level that the average number of golf balls lost is less than 12?

1 14 8 15 17 10 12 6

14 21 15 9 11 4 4 8

2) A random sample of 12 second-year university students enrolled in a business statistics course was drawn. At the course’s completion, each student was asked how many hours he or she spent doing homework in statistics. The data are listed here. It is known that the population standard deviation is σ = 8.0. The instructor has recommended that students devote 3 hours per week for the duration of the 12-week semester, for a total of 36 hours. Test to determine whether there is evidence that the average student spent less than the recommended amount of time. Compute the p-value of the test.

31 40 26 30 36 38 29 40 38 30 35 38

3) Spam e-mail has become a serious and costly nuisance. An office manager believes that the average amount of time spent by office workers reading and deleting spam exceeds 25 minutes per day. To test this belief, he takes a random sample of 18 workers and measures the amount of time each spends reading and deleting spam. The results are listed here. If the population of times is normal with a standard deviation of 12 minutes, can the manager infer at the 1% significance level that he is correct?

35 48 29 44 17 21 32 28 34

23 13 9 11 30 42 37 43 48
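Because each of these problems gives a known population standard deviation, a one-sample z-test is a natural fit. The following is a minimal sketch applied to the homework-hours data (problem 2) and the spam-time data (problem 3). The helper `z_test` and its `tail` argument are illustrative names.

```python
import math
from statistics import NormalDist, mean


def z_test(sample, mu0, sigma, tail):
    """One-sample z-test with known population sigma.

    tail is "lower" for H1: mu < mu0 or "upper" for H1: mu > mu0.
    Returns (z, p_value).
    """
    z = (mean(sample) - mu0) / (sigma / math.sqrt(len(sample)))
    cdf = NormalDist().cdf(z)
    p_value = cdf if tail == "lower" else 1 - cdf
    return z, p_value


homework_hours = [31, 40, 26, 30, 36, 38, 29, 40, 38, 30, 35, 38]
spam_minutes = [35, 48, 29, 44, 17, 21, 32, 28, 34,
                23, 13, 9, 11, 30, 42, 37, 43, 48]

# Problem 2: H1 is mu < 36 hours, sigma = 8.0
z_hw, p_hw = z_test(homework_hours, mu0=36, sigma=8.0, tail="lower")

# Problem 3: H1 is mu > 25 minutes, sigma = 12, to be judged at alpha = 0.01
z_spam, p_spam = z_test(spam_minutes, mu0=25, sigma=12, tail="upper")

print(f"homework: z = {z_hw:.3f}, p = {p_hw:.4f}")
print(f"spam:     z = {z_spam:.3f}, p = {p_spam:.4f}, reject at 1%: {p_spam < 0.01}")
```

The same helper would apply to problem 1 with `tail="lower"`, `mu0=12`, and `sigma=3`.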

## Hypothesis Testing – Milk Volume

Quart cartons of milk should contain at least 32 ounces. A sample of 22 cartons contained the following amounts in ounces.

31.5 32.2 31.9 31.8 31.7 32.1 31.5 31.6 32.4 31.6 31.8

32.2 32.1 32.1 31.6 32.0 31.6 31.7 32.0 31.5 31.9 32.8

a) What set of hypotheses should be tested if we want to demonstrate the mean amount of milk in all cartons of this brand is actually less than 32 ounces?

b) Select the distribution to use. Explain briefly why you selected it.

c) Assuming that they wish to test the claim at α = 0.025, determine the rejection and non-rejection regions based on your hypotheses in a). State the critical value.

d) Calculate the value of the test statistic. What does the p-value mean for this problem? Explain.

e) Applying the hypothesis test, can we conclude that there is sufficient evidence to claim that the mean amount is less than 32 ounces in all cartons of this brand?
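The arithmetic behind parts (c) through (e) can be sketched as follows, assuming a one-sample t-test, since the population standard deviation is not given and the sample is small. The critical value is hard-coded from a standard t-table rather than computed, and is labeled as such.

```python
import math
from statistics import mean, stdev

ounces = [31.5, 32.2, 31.9, 31.8, 31.7, 32.1, 31.5, 31.6, 32.4, 31.6, 31.8,
          32.2, 32.1, 32.1, 31.6, 32.0, 31.6, 31.7, 32.0, 31.5, 31.9, 32.8]

n = len(ounces)
x_bar = mean(ounces)
s = stdev(ounces)                       # sample standard deviation (n - 1 divisor)
t_stat = (x_bar - 32) / (s / math.sqrt(n))

# Assumption: 2.080 is the usual t-table entry for a 0.025 upper tail with
# n - 1 = 21 degrees of freedom, so the lower-tail rejection region is t < -2.080.
T_CRITICAL = -2.080
reject_h0 = t_stat < T_CRITICAL

print(f"n = {n}, mean = {x_bar:.4f}, s = {s:.4f}, t = {t_stat:.3f}, reject H0: {reject_h0}")
```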

## Statistics Problem: Mean Wait Time

A bank manager has developed a new system to reduce the time customers spend waiting for teller service during peak hours. The manager hopes that the new system will reduce waiting times from the current 9 to 10 minutes to less than 6 minutes. Suppose that the manager wishes to use 100 waiting times to support the claim that the mean waiting time under the new system is shorter than six minutes. The random sample of 100 waiting times yields a sample mean of 5.46 minutes. Further, let’s assume that the population standard deviation is 2.475.

a) State the null and alternative hypotheses, letting μ represent the mean waiting time under the new system.

b) Select the distribution to use. Explain briefly why you selected it.

c) Assuming that she wishes to test the claim at alpha = 0.05, determine the rejection and non-rejection regions based on your hypotheses in (a). State the critical value.

d) Calculate the value of the test statistic.

e) What do you conclude about whether the new system has reduced the mean waiting time to below six minutes? Explain your conclusion in words.
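For parts (c) and (d), the critical-value approach can be sketched with the standard library, assuming a lower-tailed z-test because the population standard deviation is given. Variable names are illustrative.

```python
import math
from statistics import NormalDist

n, x_bar, sigma, mu0, alpha = 100, 5.46, 2.475, 6.0, 0.05

# Test statistic for H0: mu = 6 against H1: mu < 6
z_stat = (x_bar - mu0) / (sigma / math.sqrt(n))

# Lower-tail cutoff: reject H0 when z_stat falls below this value
z_critical = NormalDist().inv_cdf(alpha)

p_value = NormalDist().cdf(z_stat)
reject_h0 = z_stat < z_critical

print(f"z = {z_stat:.3f}, critical value = {z_critical:.3f}, "
      f"p = {p_value:.4f}, reject H0: {reject_h0}")
```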

## SPSS Experimental Analysis – Diet of Rats

A researcher interested in studying the effects of three experimental diets with varying fat contents on the total lipid (fat) level in plasma collected the data in the file FatInDiet.txt.

Fifteen male subjects who were within 20 percent of their ideal body weight were grouped into five blocks according to age. Within each block the experimental diets were randomly assigned to the three subjects. The outcome measure was the total reduction in lipid level after the subjects were on the diet for a fixed period of time.

The columns of the data set correspond to (1) reduction in lipid level; (2) block (1=15-24 years; 2=25-34 years; 3=35-44 years; 4=45-54 years; and 5=55-64 years); and (3) fat content of diet (1=extremely low; 2=fairly low; and 3=moderately low).

Perform a complete analysis of the completely randomized block experiment to determine if fat content affects the mean reduction in lipid level of male subjects. As part of your analysis determine if there is evidence that the effect of diet varies by age of the subject. Report in your analysis confidence intervals comparing the difference in mean lipid reduction between all pairs of treatment conditions for the oldest group of patients in the study.

## Interpreting the P-Value

In 280 trials with a professional touch therapist, correct responses to a question were obtained 123 times. The P-value of 0.979 is obtained when testing the claim that p > 0.5 (the proportion of correct responses is greater than the proportion of 0.5 that would be expected with random chance). What is the value of the sample proportion? Based on the P-value of 0.979, what should we conclude about the claim that p > 0.5?

a. Refer to Exercise 3 and distinguish between the value of p and the P-value

b. If the P is low the null must go. If the p is high, the null will fly. What does this mean?
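The distinction between the sample proportion, the hypothesized proportion p, and the P-value can be made concrete with a short calculation. This sketch assumes 123 correct responses, the count consistent with the stated P-value of 0.979 (a count larger than the 280 trials is not possible), and uses the normal approximation for a one-proportion z-test.

```python
import math
from statistics import NormalDist

trials = 280
correct = 123          # assumed count of correct responses; see the note above
p0 = 0.5               # proportion expected under random chance (H0)

p_hat = correct / trials                                   # sample proportion
z_stat = (p_hat - p0) / math.sqrt(p0 * (1 - p0) / trials)
p_value = 1 - NormalDist().cdf(z_stat)                     # right tail: H1 is p > 0.5

print(f"sample proportion = {p_hat:.3f}, z = {z_stat:.2f}, P-value = {p_value:.3f}")
```

Here `p_hat` is a summary of the data, `p0` is the claimed population value, and `p_value` is the probability of a result at least this favorable to the claim when the null hypothesis is true.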

## Using Statistics to Analyze a Scenario

Scenario: You are interested in researching a new ambulance dispatching method. The new method is supposed to be more efficient and uses half the communications personnel of the old method, saving over $300,000 annually. You want to know how the new method compares to the old method in terms of how many minutes it takes ambulances to get to dispatched calls, since you are not willing to switch if the new method increases the dispatch time significantly. For the research study, calls are randomly assigned to one dispatch method or the other. The Old and the New methods are then to be compared. Your level of significance is set at α = .05. You measure the dispatch time on the next 500 calls. At the completion of the study, you obtain the following results for the comparison of the two methods.

Results:

– The p value you obtain is p = .061.

– Mean dispatch time “Old” method=2.5 minutes.

– Mean dispatch time “New” method=2.8 minutes.

– Mean difference .3 minutes.

– 95% CI of difference -.5 to + 1.1 minutes.

Complete the following exercises based on this scenario. Use only 1-2 sentences to answer each question!

1. Write a potential null hypothesis for the study.

2. Write a potential non-directional alternative hypothesis for the study.

3. Write a potential directional alternative hypothesis for the study.

4. Do the two dispatch methods differ significantly? How do you know?

5. Based on your result in #4, which dispatch method should you use? Why?

6. If the true situation is that the two methods do not differ significantly (null is true) and your research shows that the “Old” method is significantly faster, (reject null), what type of error have you made? What are the potential consequences in this case?

7. If the true situation is that the “Old” method is significantly faster (null is false) and your research shows that the two methods do not differ significantly (accept null), what type of error have you made? What are the potential consequences in this case?
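The decision logic behind questions 4, 6, and 7 can be written out explicitly. The sketch below assumes the reported confidence interval is for the difference New minus Old. The helper `error_type` is an illustrative name, not part of the exercise.

```python
alpha = 0.05
p_value = 0.061
ci_low, ci_high = -0.5, 1.1      # 95% CI for (New - Old), in minutes

# Two equivalent-in-spirit checks for a significant difference
significant_by_p = p_value < alpha
significant_by_ci = not (ci_low <= 0 <= ci_high)   # CI excluding 0 would signal a difference


def error_type(null_is_true, null_rejected):
    """Name the decision error, if any, for a truth/decision pair."""
    if null_is_true and null_rejected:
        return "Type I"
    if not null_is_true and not null_rejected:
        return "Type II"
    return "no error"


print(significant_by_p, significant_by_ci)
print(error_type(null_is_true=True, null_rejected=True))     # situation in question 6
print(error_type(null_is_true=False, null_rejected=False))   # situation in question 7
```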

## Statistics Problem Set: Confidence Level

1) For the following sequence: 2, 4, 7, 3, 9, 4, 7, 11, find the range, mode, median, mean, IQR, standard deviation, and variance, and state whether it is normally distributed (with explanation).

2) An investor wants to assess at a confidence level of 98% whether a medicament can improve the marks of students. He notices that for the following daily doses: 2, 2.5, 3, 4.5, 7 mg, the improvements over students who studied the same amount but took no mental boosters were 3, 4, 4.6, 6, 8.5 percent. At a confidence interval of 99%, what results would you expect for the students who take a daily dose of 6.5 mg? Show full calculations.

3) A T 95 tank weighs 75 tons. A businessman wants to buy a couple of thousand tanks for a hunting expedition in Texas. He will buy only if the tanks are really as specified and will refuse to buy them if, statistically, they weigh less. He takes a sample of 8 tanks and finds their average weight to be 74.7 tons with a standard deviation of 0.3 tons. He wants to use a significance level of 2%. What will be his best course of action? Show all the work and justify your answer.

4) A car has a gas mileage of 500 miles per gallon of gasoline, with a standard deviation of 10 miles per gallon. What would be the gas mileage of the top 40% of such cars? What would be the mileage range for 70% of the cars? What % of cars will have a mileage of 530 miles per gallon or less?

5) The probability that a student answers a multiple-choice question correctly is 27%, in an exam with 6 questions. What would be the probability that the student answers correctly at random: no question, all questions, 6 questions, less than two and more than 4 questions? Answer all these questions individually, show all the work and explain your logic.

6) Mr. G. just won the jackpot with one ticket. He correctly chose 6 numbers out of 52 numbers and 2 stars out of 12 stars. What was his probability of winning when he bought the ticket? Show full work and explain your answer.
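Problems 5 and 6 both reduce to counting with binomial coefficients. The sketch below assumes the six exam answers are independent with the same 27% success probability, and that the lottery draws numbers and stars without regard to order. The names `binom_pmf` and `jackpot_combinations` are illustrative.

```python
from math import comb


def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)


# Problem 5: six questions, each answered correctly with probability 0.27
n, p = 6, 0.27
pmf = [binom_pmf(k, n, p) for k in range(n + 1)]

p_none = pmf[0]
p_all = pmf[6]
p_fewer_than_two = pmf[0] + pmf[1]
p_more_than_four = pmf[5] + pmf[6]

# Problem 6: one ticket among all equally likely (6 of 52, 2 of 12) combinations
jackpot_combinations = comb(52, 6) * comb(12, 2)
p_jackpot = 1 / jackpot_combinations

print(f"P(none) = {p_none:.4f}, P(all six) = {p_all:.6f}")
print(f"P(fewer than two) = {p_fewer_than_two:.4f}, P(more than four) = {p_more_than_four:.4f}")
print(f"jackpot combinations = {jackpot_combinations}, P(jackpot) = {p_jackpot:.3e}")
```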

## Conducting an Econometrics Analysis

An investigator analysing the relationship between food expenditure, disposable income and prices in the US using annual data over the period 1959-83 computes the following regression:

log(FOOD) = 4.7377 + 0.1069TIME + 0.3506log(PDI) – 0.5086log(PRICE)

(0.6805) (0.0033) (0.0899) (0.1010)

FOOD Total household expenditure on food

TIME A time trend

PDI Personal disposable income

PRICE The price of food deflated by a general price index

Figures in parentheses are standard errors

(i) Give an economic interpretation of the coefficients on log(PDI) and log(PRICE)

(ii) Test the hypothesis (using a 5% significance level) that the coefficient of log(PRICE) is equal to zero against the alternative that it is nonzero.

(iii) Test the hypothesis (using a 5% significance level) that the coefficient of log(PDI) is equal to 1 against the alternative that it is significantly different from 1.

You are now given the following extra information

SST = sum((y_t - mean(y))^2) = 0.53876

SSR = sum(e_t^2) = 0.0046276

(iv) Compute the SSE and R^2 for the above regression

(v) Test the joint hypothesis (at the 5% level) that the three ‘slope’ coefficients are all equal to zero against the alternative that at least one ‘slope’ coefficient is non-zero.
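The computations for parts (ii) through (v) can be sketched as below. This assumes 25 annual observations (1959 through 1983) and follows the problem's notation, in which SSR is the residual sum of squares and SSE the explained sum of squares. Critical values are left to a table lookup and are not computed here.

```python
n_obs = 25            # annual observations, 1959 through 1983 (assumed inclusive)
k_params = 4          # intercept plus three slope coefficients
df_resid = n_obs - k_params

b_pdi, se_pdi = 0.3506, 0.0899
b_price, se_price = -0.5086, 0.1010

# (ii) H0: coefficient on log(PRICE) equals 0
t_price_zero = (b_price - 0) / se_price

# (iii) H0: coefficient on log(PDI) equals 1
t_pdi_one = (b_pdi - 1) / se_pdi

# (iv) explained sum of squares and R^2
sst = 0.53876
ssr = 0.0046276       # residual sum of squares, in the problem's notation
sse = sst - ssr       # explained sum of squares
r_squared = sse / sst

# (v) overall F statistic with (3, df_resid) degrees of freedom
n_slopes = k_params - 1
f_stat = (sse / n_slopes) / (ssr / df_resid)

print(f"df = {df_resid}, t(PRICE = 0) = {t_price_zero:.3f}, t(PDI = 1) = {t_pdi_one:.3f}")
print(f"SSE = {sse:.7f}, R^2 = {r_squared:.4f}, F = {f_stat:.1f}")
```

Each t statistic would be compared with the two-sided 5% critical value for 21 degrees of freedom, and the F statistic with the 5% critical value for (3, 21) degrees of freedom.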

## Statistics Problem: Bacteria in Carpeted Rooms

Researchers wanted to determine if carpeted rooms contained more bacteria than uncarpeted rooms. To determine the amount of bacteria in a room, researchers pumped the air from the room over a Petri dish at a rate of 1 cubic foot per minute for eight carpeted rooms and eight uncarpeted rooms. Colonies of bacteria were allowed to form in the 16 Petri dishes. The results were collected. A normal probability plot and box plot indicate the data are approximately normally distributed with no outliers. The data is as follows in bacteria per cubic foot:

Carpeted: 11.8, 10.8, 8.2, 10.1, 7.1, 14.6, 13.0, 14.0

Uncarpeted: 12.1, 12.0, 8.3, 11.1, 3.8, 10.1, 7.2, 13.7

Determine using the appropriate hypothesis testing technique if carpeted rooms have more bacteria than uncarpeted rooms at the .05 level of significance.
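Since the two sets of rooms are independent samples and the population standard deviations are unknown, a two-sample t-test is one reasonable choice. The sketch below assumes Welch's unequal-variance statistic with the conservative degrees of freedom min(n1, n2) - 1. The critical value is taken from a standard t-table rather than computed.

```python
import math
from statistics import mean, variance

carpeted = [11.8, 10.8, 8.2, 10.1, 7.1, 14.6, 13.0, 14.0]
uncarpeted = [12.1, 12.0, 8.3, 11.1, 3.8, 10.1, 7.2, 13.7]

# Standard error of the difference in sample means (unequal variances)
se = math.sqrt(variance(carpeted) / len(carpeted)
               + variance(uncarpeted) / len(uncarpeted))
t_stat = (mean(carpeted) - mean(uncarpeted)) / se

# Assumption: conservative df = min(n1, n2) - 1 = 7, for which the usual
# t-table entry for a 0.05 upper tail is 1.895 (H1: carpeted mean is larger).
T_CRITICAL = 1.895
reject_h0 = t_stat > T_CRITICAL

print(f"mean carpeted = {mean(carpeted):.2f}, mean uncarpeted = {mean(uncarpeted):.4f}")
print(f"t = {t_stat:.3f}, reject H0 at 0.05: {reject_h0}")
```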

## Do actively managed funds fail to outperform the overall market?

It is believed that most actively managed funds fail to outperform the overall market. To empirically test this statement, an analyst has collected the following secondary data to examine whether or not the mean return of an index which represents all the actively managed funds in the market is statistically different from the average return of the overall market.

| Index | # Observations | Average Annualized Return | Annualized Standard Deviation |
|---|---|---|---|
| Active Fund Index | 101 | 12.10% | 15.73% |
| Benchmark Market Index | 101 | 4.60% | 9.98% |

a. Conduct an appropriate test to examine whether the two samples have the same population variances at the 2% level of significance.

b. Assume that the two samples have the same variances; conduct an appropriate test to examine whether the two indexes have the same mean at the level of 0.01.

c. Conduct an appropriate test to examine whether the annualized mean return for the active fund index is different from 9% at the 0.01 significance level.

d. Construct a 95% confidence interval for the mean monthly returns of benchmark market index.

e. Compare and contrast the use of the test statistic/critical value approach and the p-value approach to conduct a hypothesis test.
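The test statistics for parts (a) through (c) follow directly from the summary figures above. This is a sketch under the assumptions stated in each part (a variance-ratio F test, a pooled-variance t-test, and a one-sample t-test). Critical values would still need to be looked up for the stated significance levels.

```python
import math

n_fund, mean_fund, sd_fund = 101, 12.10, 15.73      # active fund index, percent
n_mkt, mean_mkt, sd_mkt = 101, 4.60, 9.98           # benchmark market index, percent

# (a) variance-ratio F statistic, larger sample variance in the numerator,
#     with (100, 100) degrees of freedom
f_stat = sd_fund ** 2 / sd_mkt ** 2

# (b) pooled two-sample t statistic, assuming equal population variances
pooled_var = ((n_fund - 1) * sd_fund ** 2 + (n_mkt - 1) * sd_mkt ** 2) / (n_fund + n_mkt - 2)
t_pooled = (mean_fund - mean_mkt) / math.sqrt(pooled_var * (1 / n_fund + 1 / n_mkt))

# (c) one-sample t statistic for H0: active fund mean return equals 9%
t_vs_9 = (mean_fund - 9.0) / (sd_fund / math.sqrt(n_fund))

print(f"F = {f_stat:.3f}, pooled t = {t_pooled:.3f}, t vs 9% = {t_vs_9:.3f}")
```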

## Statistics Problem Set: Buena School District Bus

1. In a market test of a new chocolate raspberry coffee, a poll of 400 people from Dobbs Ferry showed 250 preferred the new coffee. In Irvington, 170 out of 350 people preferred the new coffee. To test the hypothesis that there is no difference in preferences between the two villages, what is the alternate hypothesis?

a. H1: p1 < p2

b. H1: p1 > p2

c. H1: p1 = p2

d. H1: p1 ≠ p2

2. The regression equation is Y = 29.29 – 0.96X, the sample size is 8, and the standard error of the slope is 0.22. What is the test statistic to test the significance of the slope?

a. z = -4.364

b. z = 4.364

c. t = -4.364

d. t = -0.96

3. Which condition must be met to conduct a test for the difference in two sample means using a z-statistic?

a. Data must be at least of nominal scale

b. Populations must be normal

c. Standard deviations of the two populations must be known

d. Samples are dependent

4. What chart helps to identify the relatively few factors that impact the performance of a manufacturing or service process?

a. SPC

b. Pareto analysis

c. Fishbone chart analysis

d. Diagnostic chart

5. Using a 5% level of significance and a sample size of 25, what is the critical value for a one-tailed hypothesis test?

a. 1.708

b. 1.711

c. 2.060

d. 2.064

6. Assuming the population variances are known, the population variance of the difference between two sample means is

a. The sums of the two means

b. The sum of the variances for each population

c. The sum of the standard deviations for each population

d. The sum of the sample sizes for each population

7. Which of the following can be used to test the hypothesis that two nominal variables are related?

a. A contingency table

b. A chi-square table

c. An ANOVA table

d. A scatter diagram

Homework help:

Refer to Buena School District bus data.

a. Find the median maintenance cost and the median age of the buses. Organize the data into a two-by-two contingency table, with buses above and below the median of each variable. Determine whether the age of the bus is related to the amount of the maintenance cost. Use the .05 significance level.

b. Is there a relationship between the maintenance costs and the manufacturer of the bus? Use the breakdown in part (a) for the buses above and below the median maintenance costs and the bus manufacturers to create a contingency table. Use the .05 significance level.

c. Use statistical software and the .05 significance level to determine whether it is reasonable to assume that the distributions of bus age, maintenance cost, and miles traveled last month follow a normal distribution.
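Although question 1 only asks for the alternate hypothesis, the corresponding test is easy to carry out. The sketch below assumes a two-sided two-proportion z-test with a pooled proportion, and also shows the slope test statistic from question 2. Variable names are illustrative.

```python
import math
from statistics import NormalDist

# Question 1: coffee preference in the two villages
x1, n1 = 250, 400     # Dobbs Ferry
x2, n2 = 170, 350     # Irvington

p1, p2 = x1 / n1, x2 / n2
p_pooled = (x1 + x2) / (n1 + n2)
se = math.sqrt(p_pooled * (1 - p_pooled) * (1 / n1 + 1 / n2))
z_stat = (p1 - p2) / se
p_value = 2 * (1 - NormalDist().cdf(abs(z_stat)))    # two-sided alternative

# Question 2: slope divided by its standard error, with n - 2 = 6 degrees of freedom
t_slope = -0.96 / 0.22

print(f"z = {z_stat:.3f}, two-sided p = {p_value:.5f}, slope t = {t_slope:.3f}")
```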

Solution Preview

1. In this question, the null hypothesis is p1 = p2, so the alternative hypothesis is that p1 is not equal to p2. Only d is correct.

2. The test statistic is -0.96/0.22 = -4.3636, and this is a t test. Therefore, c is correct.

3. To use a z statistic rather than a t statistic, the standard deviations of the two populations must be known.

4. Pareto analysis is a statistical technique in decision making that is used …

Solution provided by:

Lei Shi, Ph.D.


## Hypothesis Testing and Assessing Levels of Significance

Perform the following hypothesis tests on the data in spreadsheet HW #1. State the conclusion of the hypothesis tests. The level of significance is 0.05 in each test.

1. The claim that the average Median SAT is over 1200.

2. The claim that the proportion of schools with Expenditures/Student over $50,000 is 0.2.

3. The claim that there is no difference in mean Acceptance Rate between University and Liberal Arts College.

4. The claim that the variance of Median SAT is less than 4000.

## Data analyses

Attached are data from two populations. The first population consists of subjects numbered Combined-POS-1 through 156, who have been tested for CD4 % LYMPHOCYTES, WBC (x1000/ul), HB (g/dl), and PLATELETS (x1000/ul).

The second population consists of 642 subjects, numbered from Combined-NEG-1, who have also been tested for the same measures.

The study objective is to determine whether there is a statistically significant difference between the two populations on each of these tests (e.g., is the CD4 result different between population 1 and population 2?).

The t test, regressions, charts and correlations are the only tools allowed to make this determination; one is limited to using Excel (Data Analysis) for the charts and analysis.