The firm size data contains the employment size of the firm, number of firms in each category of employment size, total employment in each category, wages per employee, revenues per firm, revenue per employee and the total annual wages spent by firms on employees. The employment size category is used to determine the midpoint of the firm size range that is used for analysis. This analysis compares the employer firms regarding the wages, revenue and the number of employees of the firms in each category. The analysis involves relevant visual representation, correlation analysis, regression analysis, and hypothesis testing. The regression analysis will be used to predict the desired dependent variables based on the appropriate explanatory variable(s).
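The midpoint step described above can be sketched in a few lines of Python. This is only an illustration, assuming each closed category is labelled "low-high" as in the 0-4 group mentioned later; the function name `range_midpoint` and the labels other than "0-4" are hypothetical, and open-ended categories are not handled because the source does not say how they were treated.

```python
def range_midpoint(label: str) -> float:
    """Return the midpoint of a closed employment size range such as '0-4'."""
    low_text, high_text = label.split("-")
    low, high = int(low_text), int(high_text)
    return (low + high) / 2


# "0-4" appears in the report; "5-9" and "10-19" are hypothetical labels.
for label in ["0-4", "5-9", "10-19"]:
    print(label, range_midpoint(label))
```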
Visual Representation of the Data
Visual representation provides a visual comparison of different components of data (Andersson & Olofsson, 2013). In this case, a column chart, a pie chart, a histogram, and a scatter plot are used. Using the data on employment size range and the number of firms in each category, a column chart is plotted to compare the number of firms in each category. The column chart is as follows:
From the column chart, it can be seen that the largest group of small businesses in the US consists of those with 0-4 employees. Notably, the number of small business firms falls as the number of employees in those businesses increases. This visual representation provides a real picture of small business firms in the US. Small business firms are those with a small number of employees, hence the name small business. It can be seen that the number of firms with over 50000 employees is negligible and makes up a very small portion of the total small business firms in the US.
The other plot is the pie chart that is used to compare the total annual wages that firms in each category spend on their employees. The following is the pie chart:
The pie chart clearly shows that firms with more than 5000 employees account for 65%, more than half, of the total wages that all the small business firms spend on their employees. In other words, firms with few employees spend less on total annual employee wages than firms with a large number of employees. Notably, although firms with a large number of employees are fewer than small firms, they still spend more in total than the small firms do. The likely reason is that large firms employ far more people in total than small firms do.
To determine the distribution of the number of firms in each category of firm size, a histogram is used. The histogram for this data is shown below:
The histogram has a longer tail on the right side than on the left side, so the distribution of the number of firms in each category of firm size appears to be positively skewed. Because the shape of the histogram is not symmetric, the data is not normally distributed. The implication of this distribution is that the mean number of firms is greater than the median, which is greater than the mode (Hayslett & Murphy, 2014). The mode, or most frequently occurring value, therefore lies towards the left of the graph, implying that there are more small firms than large firms.
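The relationship between skewness and the mean and median can be illustrated with a short Python sketch. The counts in `firm_counts` are hypothetical values invented for illustration and are not the report's data; the point is only that a long right tail pulls the mean above the median.

```python
import statistics

# Hypothetical firm counts, invented only to illustrate a right-skewed sample.
firm_counts = [5, 8, 12, 20, 40, 150, 900]

mean_count = statistics.mean(firm_counts)
median_count = statistics.median(firm_counts)

# In a positively skewed sample the few large values pull the mean upward.
is_right_skewed = mean_count > median_count

print(f"mean = {mean_count:.2f}, median = {median_count}")
print("mean exceeds median:", is_right_skewed)
```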
To compare the revenue and wages per employee in all the categories of firm size, a scatter plot is used. This plot provides a visual representation of the dependent and independent variables. In this case, the dependent variable is the revenue per employee and the independent variable is wages per employee. The common belief is that employees are likely to be more productive when their wages are increased. Therefore, revenue generated by each employee would depend on the wage that the employee receives. The following is a scatter plot of revenue versus wages per employee:
It can be seen from the plot that there is a strong positive relationship between revenue and wages per employee. The points fall approximately along a single line with a positive slope. Therefore, the plot shows that revenue per employee increases as wages per employee go up and decreases as wages go down.
Correlation Analysis
A correlation analysis is important in determining the nature and the strength of the relationship between pairs of variables. In this case, the pairs of variables that have a significant relationship are shown in the table below:
| | Midpoint of Firm Size Range | Wages per Employee (in dollars) |
| Total Employment in Category | 0.961242037 | |
| Revenues per Firm (in $1000s) | 0.998912417 | 0.520532634 |
| Revenue per Employee (in dollars) | 0.709085756 | 0.942687226 |
| Total Annual Wages | 0.999817738 | 0.540621284 |
The Pearson's correlation coefficients for each pair of variables are taken from the correlation matrix output in Excel, which calculates the correlation coefficient for every pair of variables in the data. From the section of the correlation matrix above, there are four pairs of variables having significant relationships. The correlation coefficients of these variables are shown in the blue cells.
In particular, the correlation between total employment in the category and firm size is 0.961, between revenue per firm and firm size is 0.999, between revenue per employee and wages per employee is 0.943, and between total annual wages and firm size is 0.9998. All of these correlation coefficients are greater than zero, implying that the relationships between each of the pairs of variables are positive. In addition, these coefficients are well above 0.5, which implies that the relationships are strong. Overall, the relationships that the variable pairs exhibit are strong and positive. Because the relationships are strong, they are treated as significant here and used for the conclusions that follow.
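For readers who want to see what the Excel correlation output computes, the following Python sketch implements Pearson's coefficient directly from its definition. The function names `pearson_r` and `describe` and the sample values are hypothetical; the 0.5 cut-off for "strong" simply mirrors the rule of thumb used in this analysis.

```python
import math


def pearson_r(xs, ys):
    """Pearson's correlation coefficient for two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    return s_xy / math.sqrt(s_xx * s_yy)


def describe(r, threshold=0.5):
    """Label a coefficient using the 0.5 rule of thumb from the text."""
    direction = "positive" if r > 0 else "negative"
    strength = "strong" if abs(r) > threshold else "weak"
    return f"{strength} {direction}"


# Hypothetical sample values, not taken from the firm size data.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
r = pearson_r(x, y)
print(round(r, 4), describe(r))
```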
Regression Analysis
Regression analysis is also useful in measuring the relationship between a dependent variable and one or more independent variables. The independent variable(s) are used to predict the value of the dependent variable using the regression equation (Hayslett & Murphy, 2014). One of the significant correlations obtained in the correlation analysis is the one between revenue per employee and wages per employee. A regression line and equation for this correlation can be obtained from the corresponding scatter plot that was plotted earlier. Here is the plot of the regression line with the regression equation as well as the R-Square value:
The equation for the regression line is y = 9.4981x - 163423, where y is the revenue per employee and x is the wages per employee. As stated earlier, the dependent variable is revenue per employee and the independent variable is wages per employee. Higher wages are expected to motivate employees to work harder and be more productive, which in turn increases firm revenue. The coefficient of x in the equation is 9.4981, which means that a one-dollar increase in wages per employee is associated with a 9.4981-dollar increase in the revenue that a firm gets per employee. The intercept is negative and is not meaningful in this case, since it would correspond to negative revenue per employee at a wage of zero. The R Square value of the regression is 0.8887, which means that 88.87% of the variation in revenue per employee is explained by wages per employee.
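As a quick consistency check, in a regression with a single explanatory variable the R Square value equals the square of Pearson's correlation coefficient. The short Python sketch below applies this to the coefficient reported in the correlation table; the variable names are illustrative only.

```python
# Correlation between revenue per employee and wages per employee,
# as reported in the correlation table.
r = 0.942687226

# For simple linear regression, R Square is the square of r.
r_squared = r ** 2

print(round(r_squared, 4))  # 0.8887, matching the reported R Square
```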
The regression equation can be used to predict the dependent variable for other values of the independent variable. In this case, the revenue per employee can be predicted from a given wage per employee. The following are the predictions for two given values of wages per employee:
If the wages per employee are $50,000, the revenue that the firm will obtain per employee will be:
y = 9.4981 * 50,000 - 163,423 = $311,482
If the wages per employee are $30,000, the revenue that the firm will obtain per employee will be:
y = 9.4981 * 30,000 - 163,423 = $121,520
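The same predictions can be reproduced with a small Python sketch that encodes the regression equation reported above. The function and constant names are hypothetical, and the coefficients are the rounded values from the equation, so results may differ slightly from output based on unrounded coefficients.

```python
SLOPE = 9.4981        # coefficient of wages per employee, from the equation
INTERCEPT = -163423   # intercept, from the equation


def predict_revenue_per_employee(wage_per_employee: float) -> float:
    """Predicted revenue per employee (dollars) for a given wage (dollars)."""
    return SLOPE * wage_per_employee + INTERCEPT


for wage in (50_000, 30_000):
    prediction = predict_revenue_per_employee(wage)
    print(f"wage ${wage:,} -> predicted revenue ${prediction:,.0f}")
```

Predictions for wages far outside the range of the observed data should be treated with caution, since the fitted line is only supported by the observed categories.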
Hypothesis Testing
In the regression analysis above, it is found that there is a correlation between the dependent and the independent variable. Hypothesis testing is used to test if the relationship is significant (Haspelmath, 2014). This test can be done on the coefficient, β₁, of the independent variable. A linear regression is written as:
Y = β₀ + β₁x
If there is not enough evidence that β₁ differs from zero, the correlation is insignificant; otherwise, it is significant. The following is the hypothesis statement for the test:
Null hypothesis, H₀: There is no significant correlation between wages and revenue per employee (β₁ = 0)
Alternative hypothesis, H₁: There is a significant correlation between wages and revenue per employee (β₁ ≠ 0)
Using the data analysis tool in Excel, the relevant part of the regression analysis output is as follows:
From the output, it can be seen that the significance F is very small, implying that the linear regression model is statistically significant. The p-value for the coefficient of wages per employee is also very small (less than 0.05). Therefore, we reject the null hypothesis at the 5% level of significance and conclude that there is sufficient evidence of a significant correlation between wages and revenue per employee.
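The logic of this test can be sketched in Python. In a regression with one explanatory variable, the t statistic for the slope can be computed from the correlation coefficient r and the number of observations n. The value of n below is a placeholder assumption, since the number of firm size categories is not stated here, and the function name `slope_t_statistic` is hypothetical. The critical value 2.306 is the two-sided 5% value of the t distribution with 8 degrees of freedom, which matches the assumed n of 10.

```python
import math


def slope_t_statistic(r: float, n: int) -> float:
    """t statistic for the slope in simple linear regression, from r and n."""
    if n < 3:
        raise ValueError("at least three observations are needed")
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)


r = 0.942687226  # correlation reported in the correlation table
n = 10           # ASSUMPTION: placeholder sample size, not from the report

t_stat = slope_t_statistic(r, n)
critical_value = 2.306  # two-sided 5% critical value, t with n - 2 = 8 df

reject_null = abs(t_stat) > critical_value
print(f"t = {t_stat:.2f}, reject the null hypothesis: {reject_null}")
```

With a different number of observations, both the t statistic and the critical value would change, so the sketch shows the decision rule rather than the exact figures in the Excel output.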
References
Andersson, M., & Olofsson, P. (2013). Probability, statistics, and stochastic processes. Hoboken, N.J.: Wiley.
Haspelmath, M. (2014). Descriptive hypothesis testing is distinct from comparative hypothesis testing: Commentary on Davis, Gillon, and Matthewson. Language, 90(4), e250-e257. doi: 10.1353/lan.2014.0071
Hayslett, H., & Murphy, P. (2014). Statistics. London: Elsevier Science.