## A Fictitious Statistical Study

For this activity, you will undertake an analysis based on a self-designed fictitious study that uses statistical methods. First, develop a fictitious problem to examine; it can be anything. For example, you might look at whether scores on a standardized college placement test (like the SAT) are related to the income a person earns 10 years after college; whether those who participate in a leadership training program are rated as better managers than those who do not; or whether one's political affiliation is related to gender. These are just a few examples; be creative and think about what piques your interest. You might also address a problem that you want to examine in future dissertation research. You will use either Excel or SPSS to conduct the analysis.

Your analysis report should include the following components:

1. Describe your research study.

2. State a hypothesis.

3. List and explain the variables you would collect in this study. There must be a minimum of three variables and two must meet the assumptions for a correlational analysis.

4. Create a fictitious data set that you will analyze. The data should have a minimum of 30 cases, but not more than 50 cases.

5. Conduct a descriptive data analysis that includes the following:

a) a measure of central tendency

b) a measure of dispersion

c) at least one graph

6. Briefly interpret the descriptive data analysis.

7. Conduct the appropriate statistical test to evaluate your hypothesis. It must be a statistical test covered in this course, such as regression analysis, a single-sample t-test, an independent-samples t-test, cross-tabulations, chi-square, or one-way ANOVA. Justify your choice of test based on the type of data and the level of measurement of your variables.

8. Report and interpret your findings. Use APA style and include a statement about whether you reject or fail to reject the null hypothesis.

9. Copy and paste your Excel or SPSS data output and place it in an appendix.

Remember, the goal of this project is to show what you have learned in the course. Therefore, this project becomes a cumulative learning project where you can demonstrate what you have learned through all the previous assignments, readings and video presentations that you have watched.

## Identify null hypothesis and alternative hypothesis

A skeptical paranormal researcher claims that the proportion of Americans who have seen a UFO is less than 1 in 1,000. State the null hypothesis and the alternative hypothesis for a test of significance.

## Test Hypothesis for Population Proportions

Please help with the following problem.

I need to test the hypothesis (α = 0.05) that the population proportions of red and brown are equal (p_red = p_brown). You are testing whether the two proportions are equal to one another, NOT whether they are equal to one another AND equal to 13%. NOTE: These are NOT independent samples, but we will use this approach anyway to practice the method. This also means that n1 and n2 will both be the total number of candies in all the bags. The “x” values for red and brown are the counts of each that we found on the Data page. You will need to calculate the weighted p:
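The formula itself seems to be missing from the post. As a sketch in standard textbook notation (the symbols $\bar{p}$, $\hat{p}$, and $z$ are assumptions, not taken from the original assignment), the weighted (pooled) proportion and the resulting test statistic are usually written as:

$$
\bar{p} = \frac{x_{red} + x_{brown}}{n_1 + n_2},
\qquad
z = \frac{\hat{p}_{red} - \hat{p}_{brown}}{\sqrt{\bar{p}\,(1 - \bar{p})\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}
$$

Here $\hat{p}_{red} = x_{red}/n_1$ and $\hat{p}_{brown} = x_{brown}/n_2$, with $n_1 = n_2$ equal to the total number of candies as the note above specifies.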

Be sure to state clear hypotheses, the test statistic, the critical value or p-value, the decision (reject/fail to reject), and the conclusion in plain English. Submit your answer in Word, Excel, .rtf, or .pdf format through the M&M® project link in the weekly course content.

## a collection of statistics problems

1. A taxi company manager is trying to decide whether the use of radial tires instead of regular belted tires improves fuel economy. Twelve cars were equipped with radial tires and driven over a prescribed test course. Without changing drivers, the same cars were then equipped with regular belted tires and driven once again over the test course. The gasoline consumption, in kilometers per liter, is shown in the table below.

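Kilometers per Liter

| Car | Radial Tires | Belted Tires |
|---|---|---|
| 1 | 4.2 | 4.1 |
| 2 | 4.7 | 4.9 |
| 3 | 6.6 | 6.2 |
| 4 | 7.0 | 6.9 |
| 5 | 6.7 | 6.8 |
| 6 | 4.5 | 4.4 |
| 7 | 5.7 | 5.7 |
| 8 | 6.0 | 5.8 |
| 9 | 7.4 | 6.9 |
| 10 | 4.9 | 4.7 |
| 11 | 6.1 | 6.0 |
| 12 | 5.2 | 4.9 |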

a. Assuming that the distribution of differences in kilometers per liter is approximately normal, can we conclude that cars equipped with radial tires give better fuel economy than those equipped with belted tires? Use a .05 level of significance.

Complete the following:

1. State H0.

2. State H1.

3. State the value of α.

4. State the value of the test statistic.

5. State the p-value.

6. State the decision in terms of H0 and why.

7. State the decision in terms of the problem.

b. Construct a 95% confidence interval estimate of the difference in kilometers per liter. Do the results of parts (a) and (b) agree? Explain why or why not.

c. Suppose you use a .01 level of significance instead of a .05 level. Without doing the problem again, would the result be different from that in part (a)? Explain your answer.

d. Is it important that the same cars were driven with both radial and belted tires, that drivers didn’t change when the belted tires replaced the radial tires, and that the same course was used in all tests? Explain why or why not.
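Because the same twelve cars were driven with both tire types, part (a) is a paired (dependent-samples) design. The following is a minimal sketch of the paired t statistic using only the Python standard library; the function name `paired_t` is hypothetical, and the critical value is an assumption read from a t table rather than computed.

```python
from math import sqrt
from statistics import mean, stdev

# Kilometers per liter for the same 12 cars, in car order
radial = [4.2, 4.7, 6.6, 7.0, 6.7, 4.5, 5.7, 6.0, 7.4, 4.9, 6.1, 5.2]
belted = [4.1, 4.9, 6.2, 6.9, 6.8, 4.4, 5.7, 5.8, 6.9, 4.7, 6.0, 4.9]

def paired_t(x, y):
    """Return (mean difference, t statistic, degrees of freedom) for paired data."""
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    d_bar = mean(diffs)
    s_d = stdev(diffs)  # sample standard deviation (n - 1 in the denominator)
    t = d_bar / (s_d / sqrt(n))
    return d_bar, t, n - 1

d_bar, t_stat, df = paired_t(radial, belted)

# Assumption: one-tailed critical value for df = 11 at alpha = .05, taken from a t table
T_CRIT = 1.796

print(f"mean difference = {d_bar:.4f}, t = {t_stat:.3f}, df = {df}")
print("reject H0" if t_stat > T_CRIT else "fail to reject H0")
```

The differences are taken as radial minus belted, so a positive t supports the claim that radial tires give better fuel economy.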

2. Most air travelers now use e-tickets. Electronic ticketing allows passengers to not worry about a paper ticket, and it costs the airline companies less to handle than paper ticketing. However, in recent times, the airlines have received complaints from passengers regarding their e-tickets, particularly when connecting flights and a change of airlines were involved. To investigate the problem an independent watchdog agency contacted a random sample of 20 airports and collected information on the number of complaints the airport had with e-tickets during the month of March. The information is shown in the table below.

14 14 16 12 12 14 13 16 15 14

12 15 15 14 13 13 12 13 10 13

a. Assuming that the data are approximately normally distributed, is there sufficient evidence for the watchdog agency to conclude that the mean number of complaints per airport is less than 15 per month? Use a .05 level of significance.

Complete the following:

1. State H0.

2. State H1.

3. State the value of α.

4. State the value of the test statistic.

5. State the p-value.

6. State the decision in terms of H0 and why.

7. State the decision in terms of the problem.

b. Is the normality assumption in part (a) necessary? Explain your answer.

c. Using a graphical approach discussed in the course, determine whether or not the assumption of normality appears to be valid. Show your graph and explain your answer.
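For part (a), the population standard deviation is unknown and n = 20, so a one-sample t-test is the usual choice. This sketch computes the test statistic with the standard library only; the critical value is an assumption taken from a t table, and the variable names are illustrative.

```python
from math import sqrt
from statistics import mean, stdev

# Number of e-ticket complaints at each of the 20 sampled airports
complaints = [14, 14, 16, 12, 12, 14, 13, 16, 15, 14,
              12, 15, 15, 14, 13, 13, 12, 13, 10, 13]
MU_0 = 15  # hypothesized mean under H0

n = len(complaints)
x_bar = mean(complaints)
s = stdev(complaints)  # sample standard deviation
t_stat = (x_bar - MU_0) / (s / sqrt(n))

# Assumption: left-tailed critical value for df = 19 at alpha = .05, from a t table
T_CRIT = -1.729

print(f"n = {n}, mean = {x_bar}, s = {s:.4f}, t = {t_stat:.3f}")
print("reject H0" if t_stat < T_CRIT else "fail to reject H0")
```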

3. A recent insurance industry report claimed that 40 percent of those persons involved in minor traffic accidents this year have been involved in at least one other traffic accident in the last five years. An advisory group decided to investigate this claim, believing it was too large. A sample of 200 traffic accidents this year showed 74 persons were also involved in another accident within the last five years.

a. At the .01 level of significance is there evidence that the advisory group is correct?

Complete the following:

1. State H0.

2. State H1.

3. State the value of α.

4. State the value of the test statistic.

5. State the p-value.

6. State the decision in terms of H0 and why.

7. State the decision in terms of the problem.

b. Is it appropriate to use the z-statistic for this test? Explain your answer.

c. Explain the meaning of the p-value in this problem.
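A one-proportion z-test fits this problem. The sketch below computes the statistic and the left-tailed p-value with `statistics.NormalDist`, and also checks the common rule of thumb for part (b); the helper name `one_proportion_z` and the threshold of 5 are assumptions, since the course may use a different rule.

```python
from math import sqrt
from statistics import NormalDist

def one_proportion_z(x, n, p0):
    """Return (sample proportion, z statistic) for H0: p = p0."""
    p_hat = x / n
    se = sqrt(p0 * (1 - p0) / n)  # standard error computed under H0
    return p_hat, (p_hat - p0) / se

x, n, p0, alpha = 74, 200, 0.40, 0.01
p_hat, z = one_proportion_z(x, n, p0)
p_value = NormalDist().cdf(z)  # left-tailed test, H1: p < 0.40

# Rule of thumb for the normal approximation (assumed threshold of 5)
normal_ok = n * p0 >= 5 and n * (1 - p0) >= 5

print(f"p_hat = {p_hat:.3f}, z = {z:.3f}, p-value = {p_value:.4f}")
print("normal approximation reasonable:", normal_ok)
print("reject H0" if p_value < alpha else "fail to reject H0")
```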

4. The Damon family owns a large grape vineyard in western New York along Lake Erie. The grapevines must be sprayed at the beginning of the growing season to protect against various insects and diseases. Two new insecticides have been marketed, Pernod 5 and Action. To test their effectiveness, three long rows were selected and sprayed with Pernod 5, and three others were sprayed with Action. When the grapes ripened, 400 of the vines treated with Pernod 5 were checked for infestation. Likewise, a sample of 400 vines sprayed with Action was checked. The results are as follows:

| Insecticide | Number of Vines Checked (Sample Size) | Percent of Infected Vines |
|---|---|---|
| Pernod 5 | 400 | 6% |
| Action | 400 | 9% |

a. At the .05 level of significance, can we conclude that there is a difference in the proportion of vines infested using Pernod 5 as opposed to Action?

Complete the following:

1. State H0.

2. State H1.

3. State the value of α.

4. State the value of the test statistic.

5. State the p-value.

6. State the decision in terms of H0 and why.

7. State the decision in terms of the problem.

b. Suppose the sample sizes were both 600 instead of 400 and the percentages of infected plants in the samples remained the same for each type of spray. Does that change the conclusion you reached in part (a)? How?

c. Discuss the effect that sample size had on the outcome of this analysis and, in general, the role sample size plays in hypothesis testing.
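Parts (a) through (c) can be explored with a pooled two-proportion z-test run at both sample sizes. This is a sketch under the assumption that the reported percentages correspond to exact counts (24 and 36 of 400; 36 and 54 of 600); the function name `two_proportion_z` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_z(x1, n1, x2, n2):
    """Pooled two-proportion z-test; returns (z, two-tailed p-value)."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_value = 2 * NormalDist().cdf(-abs(z))
    return z, p_value

ALPHA = 0.05

# Infected counts implied by 6% and 9% (assumed to be exact counts)
z_400, p_400 = two_proportion_z(24, 400, 36, 400)
z_600, p_600 = two_proportion_z(36, 600, 54, 600)

for n, z, p in ((400, z_400, p_400), (600, z_600, p_600)):
    decision = "reject H0" if p < ALPHA else "fail to reject H0"
    print(f"n = {n}: z = {z:.3f}, p-value = {p:.4f} -> {decision}")
```

The same observed difference of three percentage points produces a larger |z| at n = 600 because the standard error shrinks as the sample size grows.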

5. The manufacturer of an MP3 player wanted to know whether a 10 percent reduction in price is enough to increase the sales of the product. To investigate, the owner randomly selected eight outlets and sold the MP3 player at the reduced price. At seven randomly selected outlets, the MP3 player was sold at the regular price. Reported below is the number of units sold last month at the sampled outlets.

| Regular Price | Reduced Price |
|---|---|
| 138 | 128 |
| 121 | 134 |
| 88 | 152 |
| 115 | 135 |
| 141 | 114 |
| 125 | 106 |
| 96 | 112 |
| | 120 |

a. At the .01 level of significance, can the manufacturer conclude that the price reduction resulted in an increase in sales? Assume that the populations are approximately normally distributed.

1. State H0.

2. State H1.

3. State the value of α.

4. State the value of the test statistic.

5. State the p-value.

6. State the decision in terms of H0 and why.

7. State the decision in terms of the problem.

b. Construct separate 99% confidence interval estimates of the mean number of sales at regular price and at the reduced price.

c. Do the results of part (b) agree with the results of part (a)? Explain why or why not.
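For part (a), one common approach is the pooled-variance two-sample t-test. The sketch below assumes equal population variances, which the problem does not state, so treat that as an assumption; the critical value is read from a t table, and the function name `pooled_t` is hypothetical.

```python
from math import sqrt
from statistics import mean, variance

# Units sold last month: 7 regular-price outlets, 8 reduced-price outlets
regular = [138, 121, 88, 115, 141, 125, 96]
reduced = [128, 134, 152, 135, 114, 106, 112, 120]

def pooled_t(a, b):
    """Pooled-variance t statistic for mean(a) - mean(b); returns (t, df)."""
    n1, n2 = len(a), len(b)
    df = n1 + n2 - 2
    sp2 = ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / df
    t = (mean(a) - mean(b)) / sqrt(sp2 * (1 / n1 + 1 / n2))
    return t, df

# H1: mean sales at the reduced price exceed mean sales at the regular price
t_stat, df = pooled_t(reduced, regular)

# Assumption: one-tailed critical value for df = 13 at alpha = .01, from a t table
T_CRIT = 2.650

print(f"t = {t_stat:.3f}, df = {df}")
print("reject H0" if t_stat > T_CRIT else "fail to reject H0")
```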

6. A student team in a business statistics course conducted an experiment to test the download times of the three different types of computers (Mac, iMac, and Dell) available at the university library. The students randomly selected one computer of each type. The students went to the Microsoft game Web site and clicked on the download link for the NBA game. The time (in seconds) between clicking on the link and the completion of the download was recorded. After each download, the file was deleted, and the trash folder was emptied. A total of 30 downloads were completed in random order. The results are shown below. NOTE: You can copy and paste the data into Excel/PHStat.

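| Mac | iMac | Dell |
|---|---|---|
| 156 | 160 | 236 |
| 166 | 165 | 238 |
| 148 | 184 | 257 |
| 160 | 192 | 242 |
| 139 | 197 | 282 |
| 151 | 172 | 253 |
| 158 | 189 | 270 |
| 167 | 179 | 256 |
| 142 | 200 | 267 |
| 219 | 193 | 259 |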

a. One assumption of ANOVA is that the data are approximately normally distributed. Do the download times for each type of computer appear to be approximately normally distributed? Support your answer with appropriate calculations or graphs.

b. Another assumption of ANOVA is that the variances of the populations are equal. At the .05 level of significance, is there evidence of a difference in the variations in the download times for the three types of computers?

1. State H0.

2. State H1.

3. State the value of α.

4. State the value of the test statistic.

5. State the p-value.

6. State the decision in terms of H0 and why.

7. State the decision in terms of the problem.

c. At the .05 level of significance, is there evidence of a difference in the mean download times for the three computers?

1. State H0.

2. State H1.

3. State the value of α.

4. State the value of the test statistic.

5. State the p-value.

6. State the decision in terms of H0 and why.

7. State the decision in terms of the problem.

d. If appropriate, use the Tukey-Kramer procedure to determine which download times differ significantly.

e. Based on the above, which computer should be chosen if you are interested in the shortest download time?
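The F statistic for part (c) can be computed by hand from the between-group and within-group sums of squares. This is a minimal standard-library sketch; the function name `one_way_anova` is hypothetical, and the critical value is an assumption taken from an F table. It does not cover the variance test in part (b) or the Tukey-Kramer comparisons in part (d).

```python
from statistics import mean

# Download times in seconds, 10 downloads per computer type
groups = {
    "Mac":  [156, 166, 148, 160, 139, 151, 158, 167, 142, 219],
    "iMac": [160, 165, 184, 192, 197, 172, 189, 179, 200, 193],
    "Dell": [236, 238, 257, 242, 282, 253, 270, 256, 267, 259],
}

def one_way_anova(samples):
    """Return (F statistic, df between, df within) for a list of samples."""
    all_values = [v for s in samples for v in s]
    grand_mean = mean(all_values)
    ss_between = sum(len(s) * (mean(s) - grand_mean) ** 2 for s in samples)
    ss_within = sum((v - mean(s)) ** 2 for s in samples for v in s)
    df_between = len(samples) - 1
    df_within = len(all_values) - len(samples)
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within

f_stat, df_b, df_w = one_way_anova(list(groups.values()))

# Assumption: critical value for F(2, 27) at alpha = .05, from an F table
F_CRIT = 3.35

for name, values in groups.items():
    print(f"{name}: mean = {mean(values):.1f}")
print(f"F = {f_stat:.2f} with df = ({df_b}, {df_w})")
print("reject H0" if f_stat > F_CRIT else "fail to reject H0")
```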

## Solving Hypothesis Testing Problems

1. Examine the given statement:

The proportion of people aged 18-25 who currently use illicit drugs is equal to 0.20 (or 20%).

Now, express the null hypothesis H0 and alternative hypothesis H1 in symbolic form. Be sure to use the correct symbol (µ, p, or σ) for the indicated parameter.

2. Refer to the following data:

Two-tailed test; α = 0.10

Assume that the normal distribution applies and find the critical z values.

3. Use the given information below to find the P-value.

The test statistic in a right-tailed test is z = 2.50.

Also, use a 0.05 significance level and state the conclusion about the null hypothesis (reject the null hypothesis or fail to reject the null hypothesis).
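Problems 2 and 3 both reduce to lookups on the standard normal distribution. As a sketch, Python's `statistics.NormalDist` can stand in for a z table; the variable names are illustrative only.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

# Problem 2: two-tailed test, so alpha is split between the two tails
alpha_two_tailed = 0.10
z_crit = std_normal.inv_cdf(1 - alpha_two_tailed / 2)
print(f"critical values: z = -{z_crit:.3f} and z = +{z_crit:.3f}")

# Problem 3: right-tailed test, so the p-value is the area to the right of z
z_observed = 2.50
p_value = 1 - std_normal.cdf(z_observed)
print(f"p-value = {p_value:.4f}")
print("reject H0" if p_value < 0.05 else "fail to reject H0")
```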

4. Test the given claim below:

A simple random sample of 50 adults is obtained, and each person’s red blood cell count (in cells per microliter) is measured. The sample mean is 5.23. The population standard deviation for red blood cell counts is 0.54. Use a 0.01 significance level to test the claim that the sample is from a population with a mean less than 5.4, which is a value often used for the upper limit of the range of normal values. What do the results suggest about the sample group?

Identify the null hypothesis, alternative hypothesis, test statistic, P-value or critical value(s), a conclusion about the null hypothesis, and the final conclusion that addresses the original claim. Use the P-value method unless your instructor specifies otherwise.
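Since the population standard deviation is given, problem 4 calls for a one-sample z-test. The following sketch applies the P-value method with the standard library; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

n, x_bar, sigma = 50, 5.23, 0.54
mu_0, alpha = 5.4, 0.01

# Test statistic for H0: mu = 5.4 against H1: mu < 5.4
z = (x_bar - mu_0) / (sigma / sqrt(n))
p_value = NormalDist().cdf(z)  # left-tailed area

print(f"z = {z:.3f}, p-value = {p_value:.4f}")
print("reject H0" if p_value < alpha else "fail to reject H0")
```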

## comparison for three methods

A student of the author surveyed her friends and found that among 20 males, 4 smoke and among 30 female friends, 6 smoke. Give two reasons why these results should not be used for a hypothesis test of the claim that the proportions of male smokers and female smokers are equal.

Given a simple random sample of men and a simple random sample of women, we want to use a 0.05 significance level to test the claim that the percentage of men who smoke is equal to the percentage of women who smoke. One approach is to use the P-value method of hypothesis testing; a second approach is to use the traditional method of hypothesis testing; and a third approach is to base the conclusion on the 95% confidence interval estimate of p1 − p2. Will all three approaches always result in the same conclusion? Explain.

## hypothesis test with dependent t test

1. Refer to the sample data given below:

The mean tar content of a simple random sample of 25 unfiltered king-size cigarettes is 21.1 mg, with a standard deviation of 3.2 mg. The mean tar content of a simple random sample of 25 filtered 100 mm cigarettes is 13.2 mg with a standard deviation of 3.7 mg.

Assume that the two samples are independent simple random samples, selected from normally distributed populations. Do not assume that the population standard deviations are equal, unless your instructor stipulates otherwise.

a. Use a 0.05 significance level to test the claim that unfiltered king-size cigarettes have a mean tar content greater than that of filtered 100 mm cigarettes.

b. What does the result suggest about the effectiveness of cigarette filters?
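Problem 1 gives only summary statistics for two independent samples and says not to assume equal standard deviations, which points to an unpooled (Welch) t statistic. This sketch reports both the conservative degrees of freedom, min(n1, n2) − 1, and the Welch-Satterthwaite approximation; which one to use depends on the course convention, so both are shown. The function name `welch_t` is hypothetical, and the critical value is an assumption from a t table.

```python
from math import sqrt

def welch_t(m1, s1, n1, m2, s2, n2):
    """Unpooled t statistic from summary statistics.

    Returns (t, conservative df, Welch-Satterthwaite df).
    """
    v1, v2 = s1 ** 2 / n1, s2 ** 2 / n2
    t = (m1 - m2) / sqrt(v1 + v2)
    df_conservative = min(n1, n2) - 1
    df_welch = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df_conservative, df_welch

# Unfiltered king-size vs. filtered 100 mm cigarettes (tar in mg)
t_stat, df_cons, df_welch = welch_t(21.1, 3.2, 25, 13.2, 3.7, 25)

# Assumption: one-tailed critical value for df = 24 at alpha = 0.05, from a t table
T_CRIT = 1.711

print(f"t = {t_stat:.3f}, conservative df = {df_cons}, Welch df = {df_welch:.1f}")
print("reject H0" if t_stat > T_CRIT else "fail to reject H0")
```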

2. Listed below are systolic blood pressure measurements (mm Hg) taken from the right and left arms of the same woman. Use a 0.05 significance level to test for a difference between the measurements from the two arms. What do you conclude? Assume that the paired sample data are simple random samples and that the differences have a distribution that is approximately normal.

Submission Requirements: Submit the assignment in a Microsoft Word document. Save the Word document as Exercise 9.2_Your initials.doc. The answer to each question should be supported with appropriate rationale or steps.

| Right arm | Left arm |
|---|---|
| 102 | 175 |
| 101 | 169 |
| 94 | 182 |
| 79 | 146 |
| 79 | 144 |

## Testing Hypothesis Claim

A personnel director in a particular state claims that the mean annual income is greater in one of the state’s counties (county A) than it is in another county (county B). In county A, a random sample of 17 residents has a mean annual income of $42,000 and a standard deviation of $8500.

In county B, a random sample of 8 residents has a mean annual income of $38,300 and a standard deviation of $5100. Test the claim at a significance level of 0.05.

Assume the population variances are not equal.

Identify the claim and state the Ho and Ha

Which is the correct claim:

(a) The mean annual incomes in counties A and B are equal.

(b) The mean annual income in county A is less than in county B.

(c) The mean annual income in county A is greater than in county B.

(d) The mean annual incomes in counties A and B are not equal.

I have to choose the Ho and Ha from drop down

What are the Ho and Ha?

The null hypothesis Ho is ___. The alternative hypothesis Ha is ____.

Which hypothesis is the claim?

The Alternative Ha or Null

Find the critical value(s) and identify the rejection region(s)

Critical Value =

Round to three decimal places as needed.

Rejection region(s) =

Find the standardized test statistic

t=

Round to three decimal places as needed.

Decide whether to reject or fail to reject the null hypothesis.

______________ the null hypothesis

Interpret the decision in the context of the original claim

At the 5% significance level, ______ enough evidence to support the personnel director's claim.

## This post addresses the rejection region, z-score & p-value.

I need some help with the following questions:

How is the rejection region defined, and how is it related to the z-score and the p-value? When do you reject or fail to reject the null hypothesis? Why do you think statisticians are asked to complete hypothesis testing?

## Use NORMSINV(RAND()) to generate spreadsheet

Financial analysts often use the following model to characterize changes in stock prices:

$$P_t = P_0\, e^{(u - 0.5 s^2)\,t + s Z \sqrt{t}}$$

where

$P_0$ = current stock price

$P_t$ = price at time $t$

$u$ = mean (logarithmic) change of the stock price per unit time

$s$ = (logarithmic) standard deviation of price change

$Z$ = standard normal random variable

This model assumes that the logarithm of a stock’s price is a normally distributed random variable. Using historical data, one can estimate the values for u and s. Suppose that the average daily change for a stock is $0.003227, and the standard deviation is 0.026154. Develop a spreadsheet to simulate the price of the stock over the next 30 days if the current price is $53. Use the Excel function NORMSINV(RAND()) to generate values for Z. Construct a chart showing the movement in the stock price.
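The assignment asks for a spreadsheet, but the same logic can be sketched in Python to check the formulas before building it. This sketch assumes the model is applied one day at a time (t = 1 per step, each day starting from the previous day's price), and it uses `random.gauss(0, 1)` in place of NORMSINV(RAND()); the function name `simulate_prices` and the seed are hypothetical.

```python
import random
from math import exp, sqrt

def simulate_prices(p0, u, s, days, seed=None):
    """Simulate a daily price path from the lognormal model, one step per day."""
    rng = random.Random(seed)
    dt = 1.0  # one day per step
    prices = [p0]
    for _ in range(days):
        z = rng.gauss(0.0, 1.0)  # standard normal draw, like NORMSINV(RAND())
        growth = exp((u - 0.5 * s ** 2) * dt + s * z * sqrt(dt))
        prices.append(prices[-1] * growth)
    return prices

# Values from the problem statement; the seed only makes the run repeatable
path = simulate_prices(p0=53.0, u=0.003227, s=0.026154, days=30, seed=42)

for day, price in enumerate(path):
    print(f"day {day:2d}: {price:.2f}")
```

In a spreadsheet, each row would hold the same calculation: the previous day's price multiplied by EXP of the drift term plus s times the NORMSINV(RAND()) draw.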

## Null hypothesis, alternative hypothesis

(a) Write the Claim mathematically and identify Ho and Ha

(b) Find the critical value(s) and identify the rejection region(s)

(c) Find the standardized test statistic.

(d) Decide whether to reject or fail to reject the null hypothesis

In a sample of 1788 home buyers, you find that 637 found their real estate agents through a friend. At α = 0.08, can you reject the claim that 40% of home buyers find their real estate agents through a friend?


## Statistical Analysis of Performance of Basketball Players

Using the Internet, students will access information and statistics about professional basketball teams. They will use these data to explore relationships between player characteristics and their performance on the basketball court.

## Business Statistics

Explain how your outlook on statistics has changed after studying it, for whatever period of time you have done so, as opposed to before you took up the study. (At least 1 paragraph)

Give at least three examples and explain: How can statistics be used in everyday life?

(At least 2 paragraphs)

In your opinion, what is the most easily understandable and applicable concept in the study of statistics? Why?

(At least 1 paragraph)

## Perform a one-tailed hypothesis test

A telemarketing company wants to find out if people are more likely to answer the phone between 8pm and 9pm than between 7pm and 8pm. Out of 96 calls between 7pm and 8pm, 72 were answered. Out of 105 calls between 8pm and 9pm, 90 were answered.

Using a one-sided hypothesis test with a 90% confidence level, which of the following statements do these data support?

* There is not sufficient evidence that the proportion of people who answer the phone between 8pm and 9pm is greater than the proportion who answer the phone between 7pm and 8pm.
* People are more likely to answer the phone between 8pm and 9pm.
* Telemarketers should not call at all during the evenings.
* People are more likely to answer the phone between 7pm and 8pm.

## Hypothesis Testing: SPSS Tables

You are interested in understanding whether the average Lifetime Value (LTV, the net present value of all the transactions expected from a customer) for customers in Florida is significantly different from that for customers in California. You suspect that the California LTV is higher than the Florida LTV. To test this hypothesis, you pick 10 customers each in California and Florida and record their LTV (in dollars). The data are in LTV.sav. State the appropriate hypotheses, run the appropriate test using SPSS, and provide an interpretation.