28 Nov 2022

Descriptive Statistics and Hypothesis Testing: Everything You Need to Know


High school graduation is among the first and most important milestones in the life of young scholars. Unfortunately, some students fail to graduate from high school. There are several reasons why a student may fail to graduate, such as being homeless, coming from an economically disadvantaged family, or having a less-than-ideal home life. For these and many other reasons, the high school graduation rate in the United States is not 100%. In addition, the rate varies across states as well as across student demographic groups. In this paper, sample data on high school graduation rates by state in 2020 are collected and analyzed. The analysis includes the calculation of descriptive statistics, such as the mean and the standard deviation, and confidence intervals (CIs).

The sample data for high school graduation rates by state were obtained from the World Population Review website, which collects demographic data on the populations of countries and cities. The data are available at https://worldpopulationreview.com/state-rankings/high-school-graduation-rates-by-state . Table 1 shows the data retrieved from this website.


Table 1: High School Graduation Rates by State: 2020 

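| Rate (%) | Rate (%) | Rate (%) | Rate (%) | Rate (%) |
| --- | --- | --- | --- | --- |
| 93.20 | 91.90 | 90.40 | 88.60 | 86.50 |
| 93.00 | 91.80 | 90.40 | 88.00 | 86.30 |
| 92.90 | 91.70 | 90.20 | 88.00 | 86.20 |
| 92.90 | 91.40 | 90.10 | 87.80 | 85.80 |
| 92.70 | 91.10 | 90.00 | 87.40 | 85.70 |
| 92.60 | 91.10 | 89.80 | 87.10 | 85.30 |
| 92.50 | 90.70 | 89.60 | 87.00 | 84.80 |
| 92.30 | 90.60 | 89.50 | 86.80 | 83.90 |
| 92.00 | 90.50 | 89.30 | 86.70 | 83.20 |
| 92.00 | 90.50 | 88.90 | 86.50 | 82.90 |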

Source: World Population Review (2020). 

Using the sample data collected, the mean and standard deviation were calculated using Excel functions. Table 2 shows the results obtained.

Table 2: Mean and Standard Deviation 

| Statistic | Value |
| --- | --- |
| Mean | 89.20 |
| Standard Deviation | 2.7844 |
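The Table 2 figures can be checked against the Table 1 data with a short script. The sketch below is an illustration in Python rather than the Excel workflow used in the paper; the variable names are hypothetical. It computes both the population standard deviation (divide by n) and the sample standard deviation (divide by n − 1), because the paper does not say which Excel function produced the reported value.

```python
import statistics

# Graduation rates (%) from Table 1, read row by row.
rates = [
    93.20, 91.90, 90.40, 88.60, 86.50,
    93.00, 91.80, 90.40, 88.00, 86.30,
    92.90, 91.70, 90.20, 88.00, 86.20,
    92.90, 91.40, 90.10, 87.80, 85.80,
    92.70, 91.10, 90.00, 87.40, 85.70,
    92.60, 91.10, 89.80, 87.10, 85.30,
    92.50, 90.70, 89.60, 87.00, 84.80,
    92.30, 90.60, 89.50, 86.80, 83.90,
    92.00, 90.50, 89.30, 86.70, 83.20,
    92.00, 90.50, 88.90, 86.50, 82.90,
]

n = len(rates)
mean = statistics.mean(rates)

# Population formula (divides by n), comparable to Excel's STDEV.P.
sd_population = statistics.pstdev(rates)

# Sample formula (divides by n - 1), comparable to Excel's STDEV.S.
sd_sample = statistics.stdev(rates)

print(f"n = {n}")
print(f"mean = {mean:.2f}")
print(f"population SD = {sd_population:.4f}")
print(f"sample SD = {sd_sample:.4f}")
```

For these 50 values the population formula reproduces the 2.7844 shown in Table 2, while the sample formula gives a slightly larger value of about 2.8127.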

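The mean is the average of the sample data. In Excel, the "AVERAGE" function is used to calculate the mean. Standard deviation is a measure of the amount of variation between values in a given dataset. The mean and standard deviation of the collected sample data are 89.20% and 2.7844 percentage points, respectively. (For these data, 2.7844 is the value given by the population formula, which divides by n; the sample formula, which divides by n − 1, gives about 2.8127. The calculations that follow use 2.7844.) In addition to the descriptive statistics, 80%, 95%, and 99% CIs were calculated using the sample data. The formula for calculating a CI is shown below:

$$\text{CI} = \bar{x} \pm z \cdot \frac{s}{\sqrt{n}}$$

Where,

$\bar{x} = 89.202$ is the sample mean (89.20 to two decimals), $s = 2.7844$ is the standard deviation, $n = 50$ is the sample size, and $z$ is the critical value for the chosen confidence level.

80% CI

For 80% CI, $z = 1.28$

Upper Level $= 89.202 + 1.28 \times \dfrac{2.7844}{\sqrt{50}} = 89.7060$

Lower Level $= 89.202 - 1.28 \times \dfrac{2.7844}{\sqrt{50}} = 88.6980$

Thus, $88.6980 \le \mu \le 89.7060$

Margin of Error $= 1.28 \times \dfrac{2.7844}{\sqrt{50}} = 0.5040$

95% CI

For 95% CI, $z = 1.96$

Upper Level $= 89.202 + 1.96 \times \dfrac{2.7844}{\sqrt{50}} = 89.9738$

Lower Level $= 89.202 - 1.96 \times \dfrac{2.7844}{\sqrt{50}} = 88.4302$

Thus, $88.4302 \le \mu \le 89.9738$

Margin of Error $= 1.96 \times \dfrac{2.7844}{\sqrt{50}} = 0.7718$

99% CI

For 99% CI, $z = 2.58$

Upper Level $= 89.202 + 2.58 \times \dfrac{2.7844}{\sqrt{50}} = 90.2179$

Lower Level $= 89.202 - 2.58 \times \dfrac{2.7844}{\sqrt{50}} = 88.1861$

Thus, $88.1861 \le \mu \le 90.2179$

Margin of Error $= 2.58 \times \dfrac{2.7844}{\sqrt{50}} = 1.0159$

My Own CI

98% CI

For 98% CI, $z = 2.33$

Upper Level $= 89.202 + 2.33 \times \dfrac{2.7844}{\sqrt{50}} = 90.1195$

Lower Level $= 89.202 - 2.33 \times \dfrac{2.7844}{\sqrt{50}} = 88.2845$

Thus, $88.2845 \le \mu \le 90.1195$

Margin of Error $= 2.33 \times \dfrac{2.7844}{\sqrt{50}} = 0.9175$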
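The four intervals can be reproduced from the summary values alone. The sketch below is a minimal illustration, not the paper's Excel workflow. It assumes the usual large-sample form (mean ± z × s/√n), two-decimal critical values from a standard normal table (1.28, 1.96, 2.33, 2.58), and an unrounded mean of 89.202, which is the midpoint of the limits reported in the paper. The function name `confidence_interval` is hypothetical.

```python
import math

# Summary values; 89.202 is the unrounded mean implied by the reported limits
# (an assumption; the paper prints it as 89.20).
mean = 89.202
sd = 2.7844
n = 50

# Two-decimal z critical values (assumed, as read from a standard normal table).
z_values = {80: 1.28, 95: 1.96, 98: 2.33, 99: 2.58}


def confidence_interval(mean, sd, n, z):
    """Return (lower, upper, margin) for mean +/- z * sd / sqrt(n)."""
    margin = z * sd / math.sqrt(n)
    return mean - margin, mean + margin, margin


intervals = {
    level: confidence_interval(mean, sd, n, z) for level, z in z_values.items()
}

for level, (lower, upper, margin) in intervals.items():
    print(f"{level}% CI: ({lower:.4f}, {upper:.4f}), margin of error = {margin:.4f}")
```

Under these assumptions the printed limits agree, to four decimals, with the upper and lower levels discussed in the analysis section.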

Analysis and Reflection 

As the confidence level rises, the margin of error increases as well. The margin of error is determined by three quantities: the sample size, the standard deviation, and the confidence level. In our case, the sample size and the standard deviation remained constant, whereas the confidence level was changed. A higher confidence level requires a larger critical value, and because the margin of error is the critical value multiplied by the standard error, the margin of error grows with it. Lowering the confidence level has the opposite effect.
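The link between confidence level and margin of error can be illustrated by deriving the critical value directly from the confidence level instead of reading it from a table. This is a sketch under the assumption that a standard normal (z) critical value is appropriate; it uses `statistics.NormalDist` from the Python standard library, and the helper name `critical_value` is hypothetical.

```python
import math
from statistics import NormalDist

sd, n = 2.7844, 50
standard_error = sd / math.sqrt(n)


def critical_value(confidence):
    """Two-sided z critical value for a confidence level given as a fraction."""
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)


levels = [0.80, 0.95, 0.98, 0.99]
margins = [critical_value(c) * standard_error for c in levels]

for c, m in zip(levels, margins):
    print(f"{c:.0%}: z = {critical_value(c):.4f}, margin of error = {m:.4f}")
```

With the sample size and standard deviation held fixed, each step up in confidence level yields a larger critical value and therefore a wider interval. The margins differ slightly from hand calculations because the critical values here are not rounded to two decimals.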

For the 80% CI, the lower and upper limits were found to be 88.6980 and 89.7060, respectively. This means that we are 80% confident that the mean state high school graduation rate in 2020 lies between 88.6980% and 89.7060%. For the 95% CI, the lower and upper limits were found to be 88.4302 and 89.9738, respectively. This means that we are 95% confident that the mean rate lies between 88.4302% and 89.9738%. For the 99% CI, the lower and upper limits were found to be 88.1861 and 90.2179, respectively. This means that we are 99% confident that the mean rate lies between 88.1861% and 90.2179%. Lastly, for the 98% CI, the lower and upper limits were found to be 88.2845 and 90.1195, respectively. This means that we are 98% confident that the mean rate lies between 88.2845% and 90.1195%.

Part I of the statistics project has helped me learn how to calculate and interpret CIs. I can calculate the CI for any confidence level as long as the sample data or sample statistics are provided. The project has also helped me learn how to determine the margin of error and how it relates to the confidence level. Overall, this project has helped me understand the concept of confidence intervals better.

Statistics Project #2: Hypothesis Testing 

Hypothesis testing is a method in statistics that involves testing assumptions about a given population parameter. Using sample data, one can assess the plausibility of a hypothesis through a hypothesis test. In this paper, hypothesis testing will be used to determine whether the sample data support a given claim. Sample data on births, deaths, marriages, and divorces will be retrieved from the Centers for Disease Control and Prevention (CDC), a public health institute whose main aim is to protect public health and safety.

The data that was retrieved from the CDC was collected by Rate N and published in 2009 in the National Vital Statistics Reports. The report is composed of a wide range of data sets. However, only the data sets for 2009 will be used in this paper for analysis. Table 1 shows this data. 

Table 1: Births, Deaths, Marriages, and Divorce by State, 2009 

| State | Live Births | Deaths | Marriages | Divorces |
| --- | --- | --- | --- | --- |
| Alabama | 5,352 | 4,330 | 2,684 | 1,651 |
| Alaska | 861 | 286 | 361 | 381 |
| Arizona | 7,775 | 4,026 | 3,236 | 1,916 |
| Arkansas | 3,400 | 2,590 | 2,489 | 1,355 |
| California | 45,831 | 21,135 | 15,208 | --- |
| Colorado | 5,572 | 2,824 | 1,316 | 1,901 |
| Connecticut | 3,060 | 2,477 | 2,046 | 1,016 |
| Delaware | 891 | 658 | 321 | 231 |
| District of Columbia | 639 | 338 | 104 | 115 |
| Florida | 18,622 | 14,624 | 10,002 | 6,055 |
| Georgia | 11,884 | 6,039 | 4,250 | --- |
| Hawaii | 1,573 | 774 | 4,250 | --- |
| Idaho | 1,849 | 999 | 820 | 635 |
| Illinois | 14,166 | 10,031 | 5,129 | 2,740 |
| Indiana | 7,112 | 4,993 | 4,923 | --- |
| Iowa | 3,247 | 2,513 | 796 | 530 |
| Kansas | 3,469 | 2,104 | 1,683 | 1,027 |
| Kentucky | 4,742 | 2,996 | 1,714 | 1,634 |
| Louisiana | 5,659 | 3,078 | 624 | --- |
| Maine | 1,052 | 1,256 | 612 | 279 |
| Maryland | 6,411 | 3,804 | 2,170 | 1,288 |
| Massachusetts | 6,010 | 4,576 | 2,075 | 1,274 |
| Michigan | 9,206 | 7,585 | 2,895 | 3,006 |
| Minnesota | 5,765 | 3,335 | 1,333 | --- |
| Mississippi | 3,703 | 2,406 | 865 | 821 |
| Missouri | 6,472 | 4,820 | 2,306 | 1,814 |
| Montana | 936 | 761 | 398 | 399 |
| Nebraska | 2,190 | 1,304 | 306 | 153 |
| Nevada | 3,292 | 1,761 | 7,416 | 1,531 |
| New Hampshire | 1,031 | 846 | 451 | 279 |
| New Jersey | 8,946 | 6,126 | 2,349 | 1,957 |
| New Mexico | 2,292 | 1,259 | 952 | 626 |
| New York | 21,072 | 13,221 | 8,442 | 4,171 |
| North Carolina | 10,492 | 6,916 | 4,448 | 2,895 |
| North Dakota | 741 | 505 | 220 | 71 |
| Ohio | 11,691 | 9,577 | 4,067 | 2,312 |
| Oklahoma | 4,802 | 3,165 | 1,987 | 1,475 |
| Oregon | 3,663 | 2,779 | 1,433 | 1,067 |
| Pennsylvania | 11,991 | 11,397 | 3,721 | 3,138 |
| Rhode Island | 933 | 802 | 337 | 222 |
| South Carolina | 5,086 | 3,615 | 1,835 | 939 |
| South Dakota | 990 | 617 | 314 | 242 |
| Tennessee | 6,909 | 5,312 | 5,729 | 2,148 |
| Texas | 34,363 | 14,469 | 11,867 | 2,014 |
| Utah | 4,223 | 1,243 | 1,743 | 1,849 |
| Vermont | 458 | 450 | 277 | 279 |
| Virginia | 8,697 | 5,245 | 3,839 | 2,793 |
| Washington | 6,937 | 4,054 | 2,711 | 2,187 |
| West Virginia | 1,768 | 1,907 | 747 | 736 |
| Wisconsin | 5,415 | 3,784 | 1,184 | 1,525 |
| Wyoming | 605 | 352 | 81 | 79 |
| Puerto Rico | 3,226 | 1,680 | 2,450 | 1,661 |

Source: Rate N (2009). 

Preliminary Calculations 

The preliminary calculations include the mean, median, sample standard deviation, and minimum and maximum values for each of the data sets. The results are summarized in Tables 2, 3, 4, and 5.

Table 2: Summary Table for Live Births

| Statistic | Live Births |
| --- | --- |
| Mean | 6,674 |
| Median | 4,772 |
| Standard Deviation | 8,106 |
| Minimum | 458 |
| Maximum | 45,831 |

Table 3: Summary Table for Deaths

| Statistic | Deaths |
| --- | --- |
| Mean | 4,187 |
| Median | 2,910 |
| Standard Deviation | 4,293 |
| Minimum | 286 |
| Maximum | 21,135 |

Table 4: Summary Table for Marriages

| Statistic | Marriages |
| --- | --- |
| Mean | 2,760 |
| Median | 1,911 |
| Standard Deviation | 3,053 |
| Minimum | 81 |
| Maximum | 15,208 |

Table 5: Summary Table for Divorces

| Statistic | Divorces |
| --- | --- |
| Mean | 1,444 |
| Median | 1,322 |
| Standard Deviation | 1,185 |
| Minimum | 71 |
| Maximum | 6,055 |
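The summary statistics can be recomputed from the raw columns. The sketch below does this for the divorces column only, as an illustration in Python rather than the paper's own workflow. It assumes that the six entries shown as "---" in Table 1 are unreported and should be excluded, which leaves 46 observations. The variable names are hypothetical.

```python
import statistics

# Divorce counts in the row order of Table 1 (Alabama ... Wyoming, Puerto Rico).
# None marks the six states shown as "---" (assumed to mean not reported).
divorces = [
    1651, 381, 1916, 1355, None, 1901, 1016, 231, 115, 6055,
    None, None, 635, 2740, None, 530, 1027, 1634, None, 279,
    1288, 1274, 3006, None, 821, 1814, 399, 153, 1531, 279,
    1957, 626, 4171, 2895, 71, 2312, 1475, 1067, 3138, 222,
    939, 242, 2148, 2014, 1849, 279, 2793, 2187, 736, 1525,
    79, 1661,
]

# Drop the missing entries before computing any statistic.
reported = [value for value in divorces if value is not None]

summary = {
    "n": len(reported),
    "mean": statistics.mean(reported),
    "median": statistics.median(reported),
    "sample_sd": statistics.stdev(reported),  # divides by n - 1
    "minimum": min(reported),
    "maximum": max(reported),
}

for name, value in summary.items():
    print(f"{name}: {value:,.1f}")
```

The same pattern applies to the births, deaths, and marriages columns, which have no missing entries and therefore use all 52 rows.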

Hypothesis Testing 

Using the sample data and preliminary calculations, a number of hypothesis tests were performed, which are as follows: 

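Determine if there is sufficient evidence to conclude the average number of births is over 5000 in the United States and territories at the 0.05 level of significance.

Since,

$$H_0: \mu \le 5000 \qquad H_a: \mu > 5000$$

And

$$z = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} = \frac{6674 - 5000}{8106/\sqrt{52}} \approx 1.49 < 1.645$$

Here the sample values come from Table 2, n = 52 (the 50 states, the District of Columbia, and Puerto Rico), and 1.645 is the right-tailed critical value at the 0.05 level. The result is not significant at p<0.05. So, we fail to reject the null hypothesis and conclude that there is insufficient evidence that the average number of births is over 5000 in the United States and territories at the 0.05 level of significance.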

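Determine if there is sufficient evidence to conclude the average number of deaths is equal to 6000 in the United States and territories at the 0.10 level of significance.

Since,

$$H_0: \mu = 6000 \qquad H_a: \mu \ne 6000$$

And

$$z = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} = \frac{4187 - 6000}{4293/\sqrt{52}} \approx -3.05, \qquad |z| > 1.645$$

Here the sample values come from Table 3, n = 52, and ±1.645 are the two-tailed critical values at the 0.10 level. Because the test statistic falls beyond the critical value, the result is significant at p<0.10. So, we reject the null hypothesis and conclude that there is sufficient evidence that the average number of deaths differs from 6000 in the United States and territories at the 0.10 level of significance.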

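Determine if there is sufficient evidence to conclude the average number of marriages is greater than or equal to 2500 in the United States and territories at the 0.05 level of significance.

Since,

$$H_0: \mu \ge 2500 \qquad H_a: \mu < 2500$$

And

$$z = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} = \frac{2760 - 2500}{3053/\sqrt{52}} \approx 0.61 > -1.645$$

Here the sample values come from Table 4, n = 52, and −1.645 is the left-tailed critical value at the 0.05 level. The result is not significant at p<0.05. So, we fail to reject the null hypothesis and conclude that there is not sufficient evidence to reject the claim that the average number of marriages is greater than or equal to 2500 in the United States and territories at the 0.05 level of significance.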

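Determine if there is sufficient evidence to conclude the average number of divorces is less than or equal to 4000 in the United States and territories at the 0.10 level of significance.

Since,

$$H_0: \mu \le 4000 \qquad H_a: \mu > 4000$$

And

$$z = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} = \frac{1444 - 4000}{1185/\sqrt{46}} \approx -14.63 < 1.28$$

Here the sample values come from Table 5, n = 46 (six states did not report divorces), and 1.28 is the right-tailed critical value at the 0.10 level. The result is not significant at p<0.10. So, we fail to reject the null hypothesis and conclude that there is insufficient evidence to reject the claim that the average number of divorces is less than or equal to 4000 in the United States and territories at the 0.10 level of significance.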
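The four tests can be recomputed from the summary tables. The sketch below is a minimal illustration under stated assumptions, not a record of the calculations behind the paper. It assumes a one-sample z test with the sample standard deviation standing in for the population value, n = 52 for births, deaths, and marriages, and n = 46 for divorces (six states have no reported figure). Each alternative hypothesis is taken as the complement of the null implied by the stated claim. The function names are hypothetical.

```python
import math
from statistics import NormalDist


def z_statistic(sample_mean, hypothesized_mean, sd, n):
    """One-sample z statistic: (mean - mu0) / (sd / sqrt(n))."""
    return (sample_mean - hypothesized_mean) / (sd / math.sqrt(n))


def p_value(z, tail):
    """p-value for a right-, left-, or two-tailed z test."""
    cdf = NormalDist().cdf(z)
    if tail == "right":
        return 1 - cdf
    if tail == "left":
        return cdf
    return 2 * min(cdf, 1 - cdf)  # two-tailed


# (name, sample mean, hypothesized mean, sd, n, tail of the alternative, alpha)
tests = [
    ("births", 6674, 5000, 8106, 52, "right", 0.05),     # Ha: mu > 5000
    ("deaths", 4187, 6000, 4293, 52, "two", 0.10),       # Ha: mu != 6000
    ("marriages", 2760, 2500, 3053, 52, "left", 0.05),   # Ha: mu < 2500
    ("divorces", 1444, 4000, 1185, 46, "right", 0.10),   # Ha: mu > 4000
]

results = {}
for name, mean, mu0, sd, n, tail, alpha in tests:
    z = z_statistic(mean, mu0, sd, n)
    p = p_value(z, tail)
    results[name] = (z, p, p < alpha)
    decision = "reject H0" if p < alpha else "fail to reject H0"
    print(f"{name}: z = {z:.2f}, p = {p:.4f}, alpha = {alpha} -> {decision}")
```

Under these assumptions the births, marriages, and divorces tests fail to reject the null hypothesis, while the deaths statistic of about −3.05 lies well beyond the two-tailed critical value and leads to rejection. With samples of this size, a t test with n − 1 degrees of freedom leads to the same four decisions.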

References 

Rate, N. R. N. R. N. (2009). National Vital Statistics Reports.  National Vital Statistics Reports 57 (13). 

World Population Review. (2020). High school graduation rates by state 2020. https://worldpopulationreview.com/state-rankings/high-school-graduation-rates-by-state 


Reference

StudyBounty. (2023, September 16). Descriptive Statistics and Hypothesis Testing: Everything You Need to Know.
https://studybounty.com/descriptive-statistics-and-hypothesis-testing-everything-you-need-to-know-essay
