12 Jun 2022

Does a pitcher’s ERA predict the number of wins the team has?

Format: APA

Academic level: University

Paper type: Statistics Report

Introduction 

Sport is an important part of social life, and baseball in particular is a widely recognized sport that serves as a source of community pride and brings socio-economic benefits. In baseball, winning is pivotal: it shows that many hours of practice have paid off, boosts team and individual confidence, and benefits owners economically. Because winning is such a vital measure of validation, this study examines how earned run average (ERA) relates to the wins (W) recorded by the respective teams, with the aim of drawing data-supported conclusions.

Research Question 

Does a pitcher’s ERA predict the number of wins the team has? 

By definition, “Earned run average represents the number of earned runs a pitcher allows per nine innings” (Conor, 2019). To address this research question, the report tests the hypotheses below using inferential statistics:

H0: A pitcher’s ERA does not predict the number of wins the team has (null hypothesis).

Ha: A pitcher’s ERA predicts the number of wins the team has (alternative hypothesis).

Study Design 

The sample comprised 463 pitchers, and the variables of interest were the wins recorded and the earned run average (ERA) for each pitcher. The study relied on secondary data: the statistics were obtained online from Fan Graphs and cover the period from 1-1-2010 to 12-18-2020. Because the research required specific outcomes, purposeful sampling was used, an approach that deepens the researcher’s understanding by relying on samples that offer the best opportunity for extensive learning (Merriam, 2015). The Fan Graphs data, covering a comprehensive sample (N = 463), was sufficient to meet the research aims.

Exploring the Data 

The main variables of interest in this study are wins (W) and earned run average (ERA), extracted from a Fan Graphs dataset that also contained other variables. Table 1 summarizes the data, produced with Excel’s “Descriptive Statistics” tool.

Table 1 Descriptive Statistics for Wins (W) and Earned Run Average (ERA)

| Statistic | Wins (W) | ERA |
| --- | --- | --- |
| Mean | 41.24190065 | 3.944406048 |
| Standard Error | 1.298628738 | 0.029282767 |
| Median | 32 | 3.98 |
| Mode | 32 | 3.98 |
| Standard Deviation | 27.94315919 | 0.630090029 |
| Sample Variance | 780.8201453 | 0.397013445 |
| Kurtosis | 3.342755718 | -0.127024955 |
| Skewness | 1.698960955 | -0.085807499 |
| Range | 158 | 3.43 |
| Minimum | 8 | 2.17 |
| Maximum | 166 | 5.6 |
| Sum | 19095 | 1826.26 |
| Count | 463 | 463 |
| Confidence Level (95.0%) | 2.551950943 | 0.057543917 |
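The figures in Table 1 can be cross-checked against one another. The sketch below is an illustration only: it uses no raw data, just the reported summary values, and recomputes the mean, variance, standard error, and range from them. The names `table1` and `check_summary` are hypothetical, and the minimum of 8 for wins is taken from the discussion of the table (it also follows from the maximum of 166 minus the range of 158).

```python
import math

# Summary values copied from Table 1 (Excel "Descriptive Statistics" output).
# The W minimum (8) is an assumption taken from the accompanying text.
table1 = {
    "W": {
        "sum": 19095.0, "count": 463, "sd": 27.94315919,
        "min": 8.0, "max": 166.0, "se": 1.298628738,
        "ci_half_width": 2.551950943,
    },
    "ERA": {
        "sum": 1826.26, "count": 463, "sd": 0.630090029,
        "min": 2.17, "max": 5.6, "se": 0.029282767,
        "ci_half_width": 0.057543917,
    },
}


def check_summary(s):
    """Recompute derived statistics from the reported summary values."""
    n = s["count"]
    return {
        "mean": s["sum"] / n,                  # mean = sum / count
        "variance": s["sd"] ** 2,              # sample variance = SD squared
        "se": s["sd"] / math.sqrt(n),          # standard error = SD / sqrt(n)
        "range": s["max"] - s["min"],          # range = max - min
        # Excel's "Confidence Level" is (t critical value) * SE, so this
        # ratio recovers the critical value that was used.
        "t_critical": s["ci_half_width"] / s["se"],
    }


checks = {name: check_summary(s) for name, s in table1.items()}
for name, result in checks.items():
    print(name, {key: round(value, 4) for key, value in result.items()})
```

If the recomputed values match the table, the summary is internally consistent. The implied critical value close to 1.965 is what one would expect for a 95% interval with 462 degrees of freedom.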

Table 1 summarizes the sampled players (N = 463) from Fan Graphs for the period 1-1-2010 to 12-18-2020. For wins (W), the median is 32 (M = 41.24, SD = 27.94), and the range is 158, from a minimum of 8 wins to a maximum of 166 wins. The skewness statistic of 1.698960955 indicates that the distribution of wins is right-skewed rather than bell-shaped, i.e., wins are not normally distributed. The box plot below (Figure 1) shows this: the X marker, which Excel uses for the mean, lies toward the upper whisker. The Figure 1 box plot also shows some outliers in the dataset.

Figure 1 Box plot for Wins (W) from 2010-2020 

For ERA, the median is 3.98 (M = 3.94, SD = 0.63), with a range of 3.43, from a minimum of 2.17 to a maximum of 5.6. The skewness statistic of -0.085807499 is close to 0, which indicates that ERA is much closer to a normal distribution than wins (W). The box plot below (Figure 2) shows this: the X marker lies near the middle of the box, and there are no noticeable outliers in the dataset.

Figure 2 Box plot for ERA from 2010-2020 

Results 

To answer the research question (Does a pitcher’s ERA predict the number of wins the team has?) and test the stated hypotheses, simple linear regression was applied, with ERA as the predictor and wins (W) as the outcome. Drawing conclusions about significance requires a reference alpha level (Salkind, 2016); this study adopted α = 0.05.

Table 2 Summary Excel Output 

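SUMMARY OUTPUT

Regression Statistics

| Statistic | Value |
| --- | --- |
| Multiple R | 0.133128 |
| R Square | 0.017723 |
| Adjusted R Square | 0.015592 |
| Standard Error | 27.72445 |
| Observations | 463 |

ANOVA

| Source | df | SS | MS | F | Significance F |
| --- | --- | --- | --- | --- | --- |
| Regression | 1 | 6393.406 | 6393.406 | 8.317759 | 0.00411 |
| Residual | 461 | 354345.5 | 768.6453 | | |
| Total | 462 | 360738.9 | | | |

Coefficients

| Term | Coefficients | Standard Error | t Stat | P-value | Lower 95% | Upper 95% | Lower 95.0% | Upper 95.0% |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Intercept | 64.52947 | 8.176754 | 7.89182 | 2.19E-14 | 48.46114 | 80.5978 | 48.46114 | 80.5978 |
| ERA | -5.90395 | 2.047102 | -2.88405 | 0.00411 | -9.92676 | -1.88114 | -9.92676 | -1.88114 |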
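The entries in the regression output are linked by simple identities, which gives a way to sanity-check Table 2 without the raw data. The sketch below is illustrative, with hypothetical variable names. It takes the sums of squares, the residual degrees of freedom, and the ERA coefficient from the table, and assumes 1 regression degree of freedom, as implied by the reported F(1, 461).

```python
import math

# Values copied from Table 2 (Excel regression output).
ss_regression = 6393.406
ss_residual = 354345.5
ss_total = 360738.9
df_regression = 1      # assumption: one predictor (ERA), as in F(1, 461)
df_residual = 461
n_obs = 463
slope = -5.90395       # ERA coefficient
slope_se = 2.047102    # standard error of the ERA coefficient

# Mean squares and the F statistic.
ms_regression = ss_regression / df_regression
ms_residual = ss_residual / df_residual
f_stat = ms_regression / ms_residual

# Goodness-of-fit measures.
r_squared = ss_regression / ss_total
adj_r_squared = 1 - (1 - r_squared) * (n_obs - 1) / df_residual
multiple_r = math.sqrt(r_squared)
residual_se = math.sqrt(ms_residual)   # "Standard Error" in the regression statistics

# t statistic for the slope; with one predictor, t squared equals F.
t_stat = slope / slope_se

print(f"F = {f_stat:.4f}, t^2 = {t_stat ** 2:.4f}")
print(f"R^2 = {r_squared:.6f}, adjusted R^2 = {adj_r_squared:.6f}")
print(f"Multiple R = {multiple_r:.6f}, residual SE = {residual_se:.5f}")
```

In a regression with a single predictor, the squared t statistic of the slope equals the overall F statistic, which is why the slope's P-value and Significance F are the same number in Table 2.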

The predefined alpha is the threshold for deciding whether to reject or fail to reject H0: the null hypothesis is rejected when the p-value is smaller than alpha (Salkind, 2016). Here the result is “reject H0”, as the Significance F value (0.00411) is well below 0.05. From the summary in Table 2, the effect of ERA on wins is statistically significant, F(1, 461) = 8.317759, p = 0.00411, R² = 0.017723, although the R² value means that ERA explains only about 1.8% of the variance in wins. Likewise, ERA as the predictor variable is significant in predicting wins (W), t = -2.88405, p = 0.00411. A scatter plot is a useful tool for depicting association (Salkind, 2016), and, as shown in Figure 3, the association is negative.
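The decision rule can be reproduced from the reported t statistic alone. The sketch below is not how Excel computes the value; it approximates the two-sided p-value by numerically integrating the Student t density with Simpson's rule, using only the Python standard library. The function names `t_pdf` and `two_sided_p` are hypothetical, and the inputs (t = -2.88405, 461 residual degrees of freedom, α = 0.05) come from the regression output and the adopted alpha.

```python
import math


def t_pdf(x, df):
    """Density of the Student t distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))


def two_sided_p(t_stat, df, steps=2000):
    """Two-sided p-value: 1 minus the area between -|t| and +|t|.

    The area from 0 to |t| is approximated with Simpson's rule
    (steps must be even) and doubled by symmetry.
    """
    upper = abs(t_stat)
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        weight = 4 if i % 2 else 2
        total += weight * t_pdf(i * h, df)
    half_area = total * h / 3
    return 1.0 - 2.0 * half_area


ALPHA = 0.05
p_value = two_sided_p(-2.88405, 461)
decision = "reject H0" if p_value < ALPHA else "fail to reject H0"
print(f"p = {p_value:.5f} -> {decision}")
```

Comparing `p_value` with `ALPHA` mirrors the rule applied in the text: the null hypothesis is rejected only when the p-value falls below the chosen alpha.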

Figure 3 Scatterplot of Wins (W) and ERA for data from 1-1-2010 to 12-18-2020 

As shown in Figure 3, wins (W) and the corresponding ERA values are negatively correlated, and the fitted regression line is:

y = -5.9039x + 64.529

where y is the number of wins (W) and x is the ERA.

Conclusion

The regression analysis answers the research question: the results are statistically significant, so ERA significantly predicts the wins (W) recorded, although it accounts for only a small share of their variation (R² = 0.017723). These results agree with those of Conor (2019). Because the association is negative, teams seeking more wins should work toward lowering their ERA.
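The practical size of the negative association can be illustrated with the fitted line y = -5.9039x + 64.529 reported in the Results. The sketch below is a hypothetical helper, not part of the original analysis. It predicts wins over the 2010-2020 window for a few ERA values and, as an added precaution, refuses to extrapolate outside the observed ERA range of 2.17 to 5.6 from Table 1.

```python
# Coefficients of the fitted line reported in the Results section.
INTERCEPT = 64.529
SLOPE = -5.9039

# Observed ERA range from Table 1; predictions outside it would be extrapolation.
ERA_MIN, ERA_MAX = 2.17, 5.6


def predicted_wins(era):
    """Predicted wins (W) over the study window for a given ERA."""
    if not ERA_MIN <= era <= ERA_MAX:
        raise ValueError(f"ERA {era} is outside the observed range "
                         f"[{ERA_MIN}, {ERA_MAX}]")
    return INTERCEPT + SLOPE * era


for era in (3.0, 4.0, 5.0):
    print(f"ERA {era:.1f} -> predicted wins {predicted_wins(era):.2f}")

# Each one-point drop in ERA raises the prediction by the absolute slope.
gain_per_point = predicted_wins(3.0) - predicted_wins(4.0)
print(f"Gain per one-point ERA reduction: {gain_per_point:.4f}")
```

Because R² is small, these predictions describe the average trend only; individual pitchers can sit far from the line.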

Further Study 

First, this study used simple regression with a single independent variable, ERA, and examined only its relationship with wins (W). Other essential variables exist, and multiple regression would have been a better fit.

Second, ERA is not the only factor that affects winning in baseball, and the collected data includes other variables. Further studies can therefore examine additional variables, e.g., fielding and batting measures, wins above replacement (WAR), and exit velocity (EV), among other baseball statistics. Doing so would yield more complete findings and support better decisions.

References 

Conor, W. (2019). Batting, Pitching, or Fielding: What’s Most Important in Today’s MLB? Samford University. https://www.samford.edu/sports-analytics/fans/2019/Batting-Pitching-or-Fielding-Whats-Most-Important-in-Todays-MLB

Fan Graphs (2020). Data: Leaderboards. https://www.fangraphs.com/leaders.aspx?pos=all&stats=pit&lg=all&qual=y&type=8&season=2020&month=0&season1=2010&ind=0&team=0&rost=0&age=0&filter=&players=0&startdate=2010-01-01&enddate=2020-12-31&sort=2,d 

Merriam, S. B. (2015). Qualitative research: A guide to design and implementation (4th ed.). San Francisco, CA: Jossey-Bass Publishers. 

Salkind, N. J. (2016). Statistics for people who (think they) hate statistics. Thousand Oaks: SAGE Publications.

StudyBounty. (2023, September 14). Does a pitcher’s ERA predict the number of wins the team has?.
https://studybounty.com/does-a-pitchers-era-predict-the-number-of-wins-the-team-has-statistics-report
