Introduction
Hypothesis testing is the application of inferential statistical procedures to evaluate whether an assumption about a population parameter is supported by reliable sample data collected from that population. Hypothesis testing has three core phases, each with several steps that depend on the test statistic being applied. The first phase involves collecting credible and reliable data through a scientific research approach. The second phase involves selecting the appropriate statistical test based on the nature of the data and the research objectives. The last phase involves conducting the inferential test on the collected data and drawing a conclusion from the findings according to a predefined decision rule ('Chapter 8', nd). The article focuses on the use of hypothesis tests, mainly parametric ones. Specifically, the four tests addressed are the z-test, the t-test, correlation and regression, and chi-square, chosen according to the data provided and the hypothesis being tested. Chi-square is applied when the data are categorical ('Chapter 10', nd).
Literature Review
An assumption made about a population parameter is known as a hypothesis. Two types of hypothesis must be stated before the statistical test can be conducted. The hypotheses also guide the researcher on the data required and the tests to apply. The null hypothesis is the claim presumed to be true, while the alternative hypothesis is the statement that complements the null. The null hypothesis can be true or false. The statistical test yields a significance value that the researcher refers to when concluding whether the sample data collected support the null hypothesis ('Chapter 7', nd).
The decision in a statistical test can be based on the p-value, a confidence interval, or a critical value. The calculated significance value may or may not support the null hypothesis. A test value that does not support the null hypothesis favors the alternative hypothesis, and in such situations we reject the null hypothesis. The calculated significance values do not prove either statement true; they only indicate how consistent the available data are with the claim about the population parameter being investigated ('Chapter 7', nd).
Project Questions
Q-1. A random sample data
Q1a. How many data items? (n)
The items in a sample are the observations made and recorded on the datasheet. The count of sample items is denoted in statistics by 'n'. The count is obtained with the built-in Excel formula =COUNT(A1:A31) or from the descriptive statistics summary output.
There are 31 items in the data provided, showing that 31 observations were made and their values recorded as continuous, ordinal data.
Q1b. What is the sample Mean?
The mean is the calculated average of the sample data. The sample mean is calculated using the formula
Mean: $\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$
In Excel, the mean can be computed with a built-in formula or obtained through descriptive statistics analysis using the Excel Data Analysis ToolPak.
For the data provided, the computed mean of the 31 observations is 43.45, given to two decimal places.
Q1c. What is the sample variance?
Variance is a measure of dispersion that shows how far the observations lie from the sample or population mean. The sample variance is obtained by first calculating the mean, subtracting the mean from each observed item, squaring the differences, and dividing their sum by n - 1.
The sample variance for the data provided, computed through the built-in Excel formula and the descriptive statistics output, is 977.59 (2dp).
Q1d. What is the sample standard deviation?
The standard deviation is the square root of the variance. The computed standard deviation is 31.27, given to 2 decimal places.
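The descriptive statistics in Q1a to Q1d can be reproduced outside Excel. The sketch below is only an illustration: the list `sample` is hypothetical data (the assignment's 31 observations are not reproduced in this write-up), and the Excel function names in the comments other than =COUNT are my own mapping, not something stated in the text.

```python
import math
import statistics

# Hypothetical observations, NOT the assignment's data set.
sample = [12, 47, 33, 85, 21, 60, 9, 74]

n = len(sample)                         # Excel: =COUNT(range)
mean = statistics.mean(sample)          # Excel: =AVERAGE(range)
variance = statistics.variance(sample)  # sample variance, divides by n - 1 (Excel: =VAR.S)
std_dev = statistics.stdev(sample)      # sample standard deviation (Excel: =STDEV.S)

# The same sample variance written out from its definition.
manual_variance = sum((x - mean) ** 2 for x in sample) / (n - 1)

print(n, round(mean, 2), round(variance, 2), round(std_dev, 2))
```

Writing the variance out by hand makes the n - 1 divisor explicit, which is the detail that separates the sample variance from the population variance.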
Q1e. Based on the number of sample items, will you be using a z or a t for the mean testing?
Although there are more than 30 items, I will use the t-test because the population mean and standard deviation are not given, which makes the t-test the statistically appropriate approach for carrying out a hypothesis test on my data.
Q1f. 90% confidence interval for the mean.
A 90% confidence level for the sample mean means that only 10% of the distribution curve is treated as the critical region. Because the curve has a left and a right tail, each tail accounts for 5% of the critical region.
The critical boundary for the t-test is computed using the built-in Excel formula =T.INV(α/2,df) for a two-tailed test.
n | 31 |
df | 30 |
α= | 0.1 |
α/2 | 0.05 |
t-value | -1.697 |
The computed critical value of the t-distribution at α/2 = 0.05 and 30 degrees of freedom is ±1.697 (3 decimal places).
The lower critical boundary on the left side of the distribution curve at a confidence level of 90% is -1.697, and the upper is +1.697. These are critical t-values rather than limits on the mean itself; scaling them by the standard error, 31.27/√31 ≈ 5.62, gives a margin of about 9.5, so the 90% confidence interval for the mean is roughly 43.45 ± 9.5, that is, from about 33.9 up to about 53.0.
t-critical | ±1.697 |
p-critical | 0.05 |
2-tailed p-critical= | 0.1 |
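The step from a critical t-value to an interval for the mean can be sketched in Python. This is a minimal illustration rather than the Excel workflow used here; the function name `mean_confidence_interval` is hypothetical, and the inputs are the summary figures reported in this question (n = 31, mean 43.45, standard deviation 31.27, critical value 1.697).

```python
import math

def mean_confidence_interval(mean, std_dev, n, t_critical):
    """Return (lower, upper) for mean ± t_critical * std_dev / sqrt(n)."""
    standard_error = std_dev / math.sqrt(n)
    margin = t_critical * standard_error
    return mean - margin, mean + margin

# Summary figures from Q1: n = 31, mean = 43.45, s = 31.27, t(0.05, df = 30) = 1.697
lower, upper = mean_confidence_interval(43.45, 31.27, 31, 1.697)
print(round(lower, 2), round(upper, 2))
```

The critical value is passed in instead of being computed because the Python standard library has no inverse t-distribution; here it is simply the ±1.697 obtained from =T.INV.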
Q1g. Use Hypothesis Testing (P-Value Method) for the claim that the mean is different from 40 at a 94% confidence level.
Research problem - does the population mean differ from 40?
Data
Claim: population mean ≠ 40
Hypothesized mean: 40
Hypothesis
Null hypothesis: The population mean is 40. $H_0: \mu = 40$
Alternative Hypothesis: The population mean is greater or less than 40. $H_1: \mu \neq 40$
Statistical Test - one-sample t-test, two-tailed
Confidence level - 94%
Significance level – α=0.06
Decision Rule
Calculated p-value < α – reject the null hypothesis
Data analysis
confidence level | 94% |
n | 31 |
df | 30 |
α= | 0.06 |
α/2 | 0.03 |
Standard Error | 5.62 |
mean1 | 43.45 |
mean2 | 40 |
mean difference | -3.45 |
test-statistic | -0.61 |
critical p-value | 0.06 |
p-value | 0.27 |
2-tailed p-value | 0.54 |
Findings
Calculated p-value (0.54) > α (0.06)
We fail to reject the null hypothesis
Conclusion
Based on the analysis, the data provided do not give sufficient evidence that the mean differs from 40 at a confidence level of 94%, so the claim that the mean is different from 40 is not supported.
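The p-value method used in Q1g can be sketched from the summary statistics alone. The code below is an illustration under stated assumptions: the helper names (`t_pdf`, `t_upper_tail`, `one_sample_t_test`) are hypothetical, and the tail probability is obtained by numerically integrating the t density with Simpson's rule because the Python standard library has no t-distribution. The statistic is computed as sample mean minus hypothesized mean, so its sign is the opposite of the one in the table; the two-tailed p-value is unaffected.

```python
import math

def t_pdf(t, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + t * t / df) ** (-(df + 1) / 2)

def t_upper_tail(t, df, steps=2000):
    """P(T > |t|) via Simpson's rule on [0, |t|] and symmetry about zero."""
    t = abs(t)
    h = t / steps  # steps must be even for Simpson's rule
    total = t_pdf(0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3

def one_sample_t_test(sample_mean, hypothesized_mean, std_dev, n):
    """Return (t statistic, two-tailed p-value) from summary statistics."""
    standard_error = std_dev / math.sqrt(n)
    t_stat = (sample_mean - hypothesized_mean) / standard_error
    return t_stat, 2 * t_upper_tail(t_stat, n - 1)

# Q1 summary figures: mean 43.45, hypothesized mean 40, s = 31.27, n = 31
t_stat, p_value = one_sample_t_test(43.45, 40, 31.27, 31)
alpha = 0.06
decision = "reject H0" if p_value < alpha else "fail to reject H0"
print(round(t_stat, 2), round(p_value, 2), decision)
```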
Q2- Random Sample
Q2a) How many data items? n = ?
The computed count of observations recorded in the data provided is ten, so my sample contains 10 items.
Q2b) What is the sample mean?
The average is calculated as $\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$. The mean of the ten observations is 33.5.
Q2c) sample variance?
Computed sample variance is equal to 766.056
Q2d) sample standard deviation?
Standard deviation is calculated as the square root of variance. The computed sample standard deviation is equal to 27.678.
Q2e) z or a t for the mean testing?
I will use the t-test in hypothesis testing because the population mean and standard deviation are unknown.
Q2f) Form a 95% confidence interval for the mean.
confidence level | 0.95 |
n | 10 |
df | 9 |
α= | 0.05 |
α/2 | 0.025 |
t-value | -2.2622 |
t-critical | ±2.26 |
2-tailed p-critical= | 0.05 |
The lower critical boundary on the left side of the distribution curve at a confidence level of 95% and df 9 is -2.26, and the upper is +2.26. These are critical t-values rather than limits on the mean itself; scaling them by the standard error, 27.678/√10 ≈ 8.75, gives a margin of about 19.8, so the 95% confidence interval for the mean is roughly 33.5 ± 19.8, that is, from about 13.7 up to about 53.3.
Q2g) Form a 97% confidence interval for the variance.
confidence level | 0.97 |
n | 10 |
df | 9 |
α= | 0.03 |
α/2 | 0.015 |
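t-value | -2.5738 |
t-critical | ±2.57 |
2-tailed p-critical= | 0.03 |
The lower critical boundary on the left side of the t-distribution curve at a confidence level of 97% and df 9 is -2.57, and the upper is +2.57. These t-values bound the critical region for a mean; a confidence interval for the variance is ordinarily built from the chi-square distribution with n - 1 degrees of freedom instead, running from (n - 1)s²/χ²(α/2) to (n - 1)s²/χ²(1 - α/2), where χ²(α/2) denotes the upper-tail critical value.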
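An interval for a variance is usually based on the chi-square distribution with n - 1 degrees of freedom. The sketch below illustrates that approach for the Q2 figures (s² = 766.056, n = 10, 97% confidence). It is an assumption-laden illustration, not the method used in the table: the helper names are hypothetical, and the chi-square quantiles are found by numerically integrating the density and bisecting, because the Python standard library has no chi-square functions.

```python
import math

def chi2_pdf(x, df):
    """Density of the chi-square distribution with df degrees of freedom."""
    if x <= 0:
        return 0.0
    return x ** (df / 2 - 1) * math.exp(-x / 2) / (2 ** (df / 2) * math.gamma(df / 2))

def chi2_cdf(x, df, steps=2000):
    """P(X <= x) by Simpson's rule; intended for moderate df such as the 9 used here."""
    h = x / steps  # steps must be even for Simpson's rule
    total = chi2_pdf(0, df) + chi2_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * chi2_pdf(i * h, df)
    return total * h / 3

def chi2_quantile(p, df):
    """Invert the CDF by bisection on a bracket wide enough for moderate df."""
    lo, hi = 0.0, 10.0 * df + 50.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if chi2_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def variance_confidence_interval(sample_variance, n, confidence):
    """Return (lower, upper) = ((n-1)s^2 / chi2_upper, (n-1)s^2 / chi2_lower)."""
    df = n - 1
    alpha = 1 - confidence
    upper_q = chi2_quantile(1 - alpha / 2, df)
    lower_q = chi2_quantile(alpha / 2, df)
    return df * sample_variance / upper_q, df * sample_variance / lower_q

lower, upper = variance_confidence_interval(766.056, 10, 0.97)
print(round(lower, 1), round(upper, 1))
```

The interval is not symmetric around the sample variance because the chi-square distribution is skewed to the right.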
Q2h) Use Hypothesis Testing (P-Value Method) for the claim that the mean is less than 30 at a 96% confidence level.
Research problem - is the population mean less than 30?
Data
Claim: population mean < 30
Hypothesized mean: 30
Hypothesis
Null hypothesis: The population mean is greater than or equal to 30. $H_0: \mu \geq 30$
Alternative Hypothesis: The population mean is less than 30. $H_1: \mu < 30$
Statistical Test - one-sample t-test, left-tailed
Confidence level - 96%
Significance level – α=0.04
Decision Rule
Calculated p-value < α – reject the null hypothesis
Statistic test
confidence level | 0.96 |
n | 10 |
df | 9 |
α= | 0.04 |
Standard Error | 8.752 |
mean1 | 33.5 |
mean2 | 30 |
critical p-value | 0.04 |
mean difference | 3.5 |
test-statistic | 0.400 |
p-value | 0.651 |
Findings
Calculated p-value (0.651) > α (0.04)
Fail to reject the null hypothesis
Q2i) What is your conclusion for the Hypothesis Test, state it very clearly?
Based on the analysis of the data provided at α = 0.04 and df 9, there is not sufficient evidence to support the claim that the mean is less than 30; the sample mean of 33.5 is in fact above 30.
Q3- A survey
How many people said YES, and how many people said NO?
The Yes and No counts can be computed using the =COUNTIF formula
count | 40 |
0 (No) | 18 |
1 (Yes) | 22 |
22 people said Yes, and 18 said No
What is the sample size?
The sample size of the data is 40 responses.
What is the proportion of people in the sample that said YES?
22/40 = 0.55
55% of participants favored Soap XX
Form the 93% confidence interval for the proportion of people that FAVOR Soap XX (Said YES).
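confidence level | 0.93 |
α | 0.07 |
α/2 | 0.035 |
1 - α/2 | 0.965 |
chi-square critical value (1 df, upper tail 0.035) | 4.4451751 |
chi-square critical value (1 df, upper tail 0.965) | 0.0019255 |
The values 4.4451751 and 0.0019255 are chi-square critical values at 1 df for a 7% significance level; they are cut-offs on the chi-square scale, not limits on the proportion itself. The interval for the proportion is p̂ ± z·√(p̂(1 - p̂)/n); with p̂ = 0.55, n = 40, and z ≈ 1.81 for 93% confidence, this is about 0.55 ± 0.14, that is, from roughly 0.41 up to roughly 0.69.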
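A confidence interval for a proportion is commonly formed with the normal approximation, p̂ ± z·√(p̂(1 - p̂)/n). The sketch below applies it to the survey counts (22 Yes out of 40) at 93% confidence. The function name `proportion_confidence_interval` is hypothetical, and the normal-approximation (Wald) interval is my choice of method for illustration, not something prescribed by the assignment.

```python
import math
from statistics import NormalDist

def proportion_confidence_interval(successes, n, confidence):
    """Normal-approximation (Wald) interval for a population proportion."""
    p_hat = successes / n
    # Two-sided interval: put (1 - confidence) / 2 in each tail.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

# Survey counts: 22 Yes responses out of 40
lower, upper = proportion_confidence_interval(22, 40, 0.93)
print(round(lower, 2), round(upper, 2))
```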
Use Hypothesis Testing (P-Value Method) for the claim that the proportion of people that FAVOR Soap XX is greater than 50% at a 90% confidence level.
Statistical test - chi-square goodness of fit (1 df)
Null Hypothesis: the proportion of people who favor Soap XX is not greater than the expected 50%
$H_0: p \leq 0.50$
Alternative Hypothesis: the proportion of people who favor Soap XX is greater than the expected 50%
$H_1: p > 0.50$
If the confidence level is 90%
Significance level = 0.10
Decision rule: if the calculated chi-value is less than the critical value at α, fail to reject the null hypothesis
Test Analysis
category | hypothesized proportion | observed | expected | chi-value |
yes | 0.50 | 22 | 20 | 0.2 |
no | 0.50 | 18 | 20 | 0.2 |
total | 1.00 | 40 | 40 | 0.4 |
p-value | 0.53 |
critical value (α = 0.10, 1 df) | 2.706 |
What is your conclusion for the Hypothesis Test, state it very clearly?
Findings
Calculated chi-value = 0.4
Critical chi-value = 2.706
Hence, calculated chi-value < critical chi-value, and I fail to reject the null hypothesis. Because the claim is one-sided, the chi-square p-value of 0.53 can be halved to about 0.26, which is still greater than 0.10.
Based on our sample, we cannot reject the null statement: with 22 of 40 respondents (55%) saying Yes, the data do not provide sufficient evidence at the 90% confidence level that the proportion of people who favor Soap XX is greater than 50%.
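As a cross-check on the survey counts (22 Yes and 18 No out of 40), the claim that more than 50% favor Soap XX can also be tested with a one-proportion z-test. The sketch below is an illustration, not the chi-square worksheet itself: the function name `one_proportion_z_test` is hypothetical, and the z-test is swapped in because, with two categories, its squared statistic equals the 1-df chi-square goodness-of-fit statistic.

```python
import math
from statistics import NormalDist

def one_proportion_z_test(successes, n, p0):
    """Right-tailed z-test of H0: p <= p0 against H1: p > p0."""
    p_hat = successes / n
    z = (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)
    p_value = 1 - NormalDist().cdf(z)  # upper-tail probability
    return z, p_value

# Survey counts: 22 Yes out of 40, hypothesized proportion 0.50
z, p_value = one_proportion_z_test(22, 40, 0.50)
chi_square = z ** 2  # 1-df goodness-of-fit statistic for two categories
alpha = 0.10         # 90% confidence level
decision = "reject H0" if p_value < alpha else "fail to reject H0"
print(round(z, 3), round(chi_square, 3), round(p_value, 3), decision)
```

The one-tailed form matches the direction of the claim, which a chi-square statistic on its own cannot express.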
a) Calculate the correlation coefficient “r”.
Regression Statistics | |
Multiple R | 0.77 |
R Square | 0.59 |
Adjusted R Square | 0.53 |
Standard Error | 12.06 |
Observations | 8 |
b) Find the equation of the regression line.
 | Coefficients | Standard Error | t Stat | P-value |
Intercept | -31.46 | 22.02 | -1.43 | 0.20 |
FIRES (X) | 1.04 | 0.35 | 2.96 | 0.03 |
Y = 1.04X-31.46
c) State in words: the correlation coefficient between fires and acres is 0.77, the slope of the best-fit line is 1.04, and the y-intercept is negative 31.46.
d) Use your best fit line with x = 60 fires to predict the number of acres y burned.
x | 60 |
slope | 1.04 |
y-intercept | -31.46 |
Y | 30.94 |
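The prediction in d) is a direct evaluation of the fitted line. A minimal sketch, using the slope and intercept from the regression output; the function name `predict_acres` is hypothetical.

```python
def predict_acres(fires, slope=1.04, intercept=-31.46):
    """Evaluate the fitted line Y = slope * X + intercept."""
    return slope * fires + intercept

predicted = predict_acres(60)
r_squared = 0.77 ** 2  # squaring r gives R Square for a single predictor

print(round(predicted, 2), round(r_squared, 2))
```

Squaring the correlation coefficient is a quick consistency check against the R Square value in the regression statistics.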
Reference
Chapter 7: Testing Hypotheses. (nd). Retrieved from: https://www.sagepub.com/sites/default/files/upm-binaries/43443_7.pdf
Chapter 8: Hypothesis Testing. (nd). Retrieved from: http://math.ucdenver.edu/~ssantori/MATH2830SP13/Math2830-Chapter-08.pdf
Chapter 10: Chi-Square Tests. (nd). Retrieved from: http://uregina.ca/~gingrich/ch10.pdfs