13 Dec 2022

86

Linear Regression in Python

Format: APA

Academic level: Master’s

Paper type: Case Study

Words: 457

Pages: 2

Downloads: 0

Limitation 

Linear regression is a good method of forecasting future values. However, it has shortcomings just like any other prediction model. One of the main limitations of the model is multicollinearity. Multicollinearity occurs when two or more independent (predictor) variables have a strong correlation (Kamer-Ainur & Mariorara, 2017). In such cases, it becomes difficult to include such variables into the regression model. Therefore, in cases where two independent variables have a strong correlation, one of them should be removed from the regression analysis. The section below describes two cases in which multicollinearity was present. 

Executive Summary of Multicollinearity Cases 

On the first case, a researcher was studying the effect of working capital management on the capital structure of non-financial companies. Specifically, the research study sought to find out the factors that affect capital structure of companies. In this case, the independent variables were: the average payment period, inventory conversion period, average payment period, fixed to total asset ratio and the cash ratio. The dependent variable was debt ratio (Mwangi, 2017). After reviewing the research report, it was evident that the cash ratio and the fixed to total asset ratio had multicollinearity. Both the variables were however included in the research model. Even though multicollinearity does not affect the overall significance of a regression model, the model might give duplicated results with regards to the strongly correlated variables. That is, since the variables have multicollinearity, their coefficients are essentially duplicated in the regression model (Gujarati, 2015). 

It’s time to jumpstart your paper!

Delegate your assignment to our experts and they will do the rest.

Get custom essay

The two independent variables had a correlation of approximately 93%, which is very strong for two independent variables since it is close to 100%. As stated above, one of the two variables should be dropped. A scatter plot of these variables is as shown in the figure below. 

Source: Mwangi (2017) 

As seen in the figure above, fixed to total asset ratio and cash ratio are highly correlated, displaying multicollinearity. 

The second case involves a study conducted to check some of the factors that affect uptake of loans from group savings (Kilele et al., 2015). The research study had several independent variables and one dependent variables. The independent variables were: group rules, interest rates, fear of repayment, collateral and lack of enough group savings to satisfy members’ requests. The dependent variable was uptake of loans (Kilele et al., 2015). After a critical review of the independent variables, it was evident that fear of repayment and group rules had a very strong correlation, which translates to multicollinearity. These two variables therefore had the same effects on the regression model. The variables had a significant correlation of 96%, which is very strong for two independent variables in the same regression model. Using the two variables meant that there was duplication. 

Source: Kilele et al., (2015) 

The figure above shows a scatter plot of the relationship between fear of repayment and group rules. From the graph, it is evident that the two variables have a strong linear relationship, which is an evidence of multicollinearity. 

References 

Gujarati, D. (2015). Regression Diagnostic I: Multicollinearity.  Econometrics , 80-95. doi:10.1007/978-1-137-37502-5_4 

Kamer-Ainur, A., & Marioara, M. (2007). Errors and limitations associated with regression and correlation analysis.  Statistics and Economic Informatics , 709. 

Kilele, A. K., Nduruhu, D., & Kimani, M. E. (2015). Determinants of Group Loans Uptake at The Youth Enterprise Development Fund, 25. 

Mwangi, J. M. (2017). Effect of working capital management on the capital structure of non-financial firms.  International Journal of Economics, Finance and Management Sciences 2 (3), 271. 

Illustration
Cite this page

Select style:

Reference

StudyBounty. (2023, September 16). Linear Regression in Python.
https://studybounty.com/linear-regression-in-python-case-study

illustration

Related essays

We post free essay examples for college on a regular basis. Stay in the know!

17 Sep 2023
Statistics

Scatter Diagram: How to Create a Scatter Plot in Excel

Trends in statistical data are interpreted using scatter diagrams. A scatter diagram presents each data point in two coordinates. The first point of data representation is done in correlation to the x-axis while the...

Words: 317

Pages: 2

Views: 187

17 Sep 2023
Statistics

Calculating and Reporting Healthcare Statistics

10\. The denominator is usually calculated using the formula: No. of available beds x No. of days 50 bed x 1 day =50 11\. Percentage Occupancy is calculated as: = =86.0% 12\. Percentage Occupancy is calculated...

Words: 133

Pages: 1

Views: 151

17 Sep 2023
Statistics

Survival Rate for COVID-19 Patients: A Comparative Analysis

Null: There is no difference in the survival rate of COVID-19 patients in tropical countries compared to temperate countries. Alternative: There is a difference in the survival rate of COVID-19 patients in tropical...

Words: 255

Pages: 1

Views: 251

17 Sep 2023
Statistics

5 Types of Regression Models You Should Know

Theobald et al. (2019) explore the appropriateness of various types of regression models. Despite the importance of regression in testing hypotheses, the authors were concerned that linear regression is used without...

Words: 543

Pages: 2

Views: 175

17 Sep 2023
Statistics

The Motion Picture Industry - A Comprehensive Overview

The motion picture industry is among some of the best performing industries in the country. Having over fifty major films produced each year with different performances, it is necessary to determine the success of a...

Words: 464

Pages: 2

Views: 86

17 Sep 2023
Statistics

Spearman's Rank Correlation Coefficient (Spearman's Rho)

The Spearman’s rank coefficient, sometimes called Spearman’s rho is widely used in statistics. It is a nonparametric concept used to measure statistical dependence between two variables. It employs the use of a...

Words: 590

Pages: 2

Views: 309

illustration

Running out of time?

Entrust your assignment to proficient writers and receive TOP-quality paper before the deadline is over.

Illustration