12 Dec 2022

Estimating Models Using Dummy Variables

A dummy variable in regression analysis takes the value 1 or 0 to indicate the presence or absence of a categorical effect that may shift the outcome of the dependent variable (Vogt, 2006). Most variables used in dummy regression have mutually exclusive categories, meaning that if one category applies, the other does not (Fox, 2015). For example, smoking can be coded as smoker (1) and non-smoker (0); a respondent who is a smoker cannot also fall into the non-smoker category. One potential study that can be carried out with the GSS14_student _8210 data set is an assessment of the effect of respondents’ sex, citizenship, and age on their income. Based on these variables, the study question can be stated as follows:

Do sex, citizenship, and age significantly predict personal income?

Description of the study variables

Age, sex, and citizenship are the independent variables in this case and are hypothesized to influence personal income. Income is therefore the dependent variable and is measured on a ratio scale. Age is also measured on a ratio scale, while sex and citizenship are dummy variables measured on a categorical scale (Warner, 2008). Sex is coded as 1 (male) and 0 (otherwise), and citizenship is coded as 1 (US citizenship) and 0 (otherwise).
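The coding scheme described above can be sketched in a few lines of Python. The records and field names below are hypothetical and are not taken from the GSS data set; only the 1/0 coding rules come from the text.

```python
# Hypothetical raw survey records; field names and values are illustrative only.
respondents = [
    {"sex": "male", "citizenship": "US", "age": 34},
    {"sex": "female", "citizenship": "other", "age": 51},
    {"sex": "female", "citizenship": "US", "age": 25},
]


def encode(record):
    """Return the predictor row (sex, citizenship, age) using the essay's 1/0 coding."""
    sex = 1 if record["sex"] == "male" else 0             # 1 = male, 0 = otherwise
    citizen = 1 if record["citizenship"] == "US" else 0   # 1 = US citizen, 0 = otherwise
    return (sex, citizen, record["age"])                  # age stays on its ratio scale


encoded = [encode(r) for r in respondents]
print(encoded)  # [(1, 1, 34), (0, 0, 51), (0, 1, 25)]
```

Because each dummy has exactly two mutually exclusive categories, a single 1/0 column per variable is enough; the category coded 0 acts as the reference group.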

Findings

The model was statistically significant at the 5% significance level, F(3, 207) = 11.197, p < 0.05. This indicates that the model’s multiple R is statistically significant: taken together, age, sex, and citizenship are associated with differences in personal income. The R-squared of 0.140 indicates that 14% of the variation in personal income can be explained by the included predictors (Warner, 2008).
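As a rough consistency check, the overall F statistic can be recovered from R-squared and the degrees of freedom using the standard identity F = (R²/k) / ((1 − R²)/df_residual). The sketch below assumes k = 3 predictors and 207 residual degrees of freedom, as reported; the function name is illustrative.

```python
def f_from_r_squared(r_squared, n_predictors, df_residual):
    """Overall F statistic implied by R-squared and the model's degrees of freedom."""
    explained = r_squared / n_predictors            # explained share per predictor df
    unexplained = (1.0 - r_squared) / df_residual   # unexplained share per residual df
    return explained / unexplained


# Values reported in the text: R-squared = 0.140, F(3, 207).
f_stat = f_from_r_squared(0.140, 3, 207)
print(round(f_stat, 2))  # 11.23
```

The result is close to, but not exactly, the reported 11.197 because R-squared is only given to three decimal places.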

Table 1

Regression Model 

Table 2

ANOVA Summary for the Model 

The model constant was 44,841.428. This is the predicted income when all three predictors equal zero, that is, for a respondent in the reference categories (sex coded 0, citizenship coded 0) at age zero; it is a baseline for the model rather than the average income of all respondents. The coefficients for citizenship and sex were negative, while the coefficient for age was positive. Other things being equal, a US citizen earns $11,275.29 less than a non-US citizen in the study, as indicated by the negative sign on the coefficient (t = -3.275, p < 0.05). Similarly, males earn $12,723.43 less than females (coded as otherwise in the study) (t = -2.998, p < 0.05). Lastly, age had a coefficient of 536.930, which indicates that each additional year of age increased predicted income by $536.93, ceteris paribus (t = 3.191, p < 0.05). The positive sign on the age coefficient indicates a positive association with income (Warner, 2008).

Table 3

Model Coefficients 

With this in mind, the model can be written as follows:

Income = 44,841.428 – 11,275.29*Citizenship – 12,723.43*Sex + 536.93*Age 

Based on the above model, a female US citizen aged 25 years would have an average predicted income of

Income = 44,841.428 – 11,275.29*(1) – 12,723.43*(0) + 536.93*(25)

= 44,841.428 – 11,275.29 + 13,423.25

= $46,989.39
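The same calculation can be written as a small function, which makes it easy to repeat for other respondent profiles. This is a minimal sketch: the coefficients are the ones reported in the text, while the function and constant names are illustrative.

```python
# Coefficients as reported in the fitted model.
INTERCEPT = 44841.428
B_CITIZENSHIP = -11275.29   # applies when citizenship = 1 (US citizen)
B_SEX = -12723.43           # applies when sex = 1 (male)
B_AGE = 536.93              # per additional year of age


def predicted_income(citizenship, sex, age):
    """Predicted personal income from the fitted linear model."""
    return INTERCEPT + B_CITIZENSHIP * citizenship + B_SEX * sex + B_AGE * age


# Female (sex = 0) US citizen (citizenship = 1) aged 25.
income = predicted_income(citizenship=1, sex=0, age=25)
print(f"${income:,.2f}")  # $46,989.39
```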

Model Diagnostics 

The multiple regression model makes four basic assumptions, three of which are examined here. First, it assumes that the predictors and the dependent variable have a linear association, which can be assessed using the correlation coefficients (Lewis-Beck & Lewis-Beck, 2015). The correlation coefficients between income and the three predictors were significant, which is consistent with the linear association assumption being met. Second, the model assumes that the predictor variables are not highly correlated with one another (no multicollinearity). This can be checked in the correlation matrix, where the basic rule of thumb is that no pair of independent variables should have a correlation above 0.80 (Lewis-Beck & Lewis-Beck, 2015). This assumption was met since all the independent variable pairs had weak correlations of less than 0.50. Lastly, the assumption of constant error variance (homoscedasticity) was also met, as the error variances were statistically the same. No remedy is therefore required, as all the assumptions examined were met.
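The multicollinearity rule of thumb can be illustrated with a short pairwise correlation check. The predictor values below are hypothetical and are not drawn from the GSS data set; only the 0.80 cutoff comes from the text.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical predictor columns (illustrative values only).
predictors = {
    "sex": [1, 0, 0, 1, 1, 0, 1, 0],
    "citizenship": [1, 1, 0, 1, 0, 1, 1, 0],
    "age": [34, 51, 25, 42, 63, 29, 47, 38],
}

THRESHOLD = 0.80  # rule-of-thumb cutoff cited in the text

# Collect every predictor pair whose absolute correlation exceeds the cutoff.
flagged = [
    (a, b)
    for a, b in combinations(predictors, 2)
    if abs(pearson(predictors[a], predictors[b])) > THRESHOLD
]
print(flagged)  # [] means no pair exceeds the cutoff
```

An empty list corresponds to the situation described above, in which no pair of independent variables is strongly correlated.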

References

Fox, J. (2015). Applied regression analysis and generalized linear models. Sage Publications.

Lewis-Beck, C., & Lewis-Beck, M. (2015). Applied regression: An introduction (Vol. 22). Sage Publications.

Vogt, W. P. (2006). Quantitative research methods for professionals in education and other fields. Columbus, OH: Allyn & Bacon.

Warner, R. M. (2008). Applied statistics: From bivariate through multivariate techniques. Sage.

StudyBounty. (2023, September 14). Estimating Models Using Dummy Variables.
https://studybounty.com/estimating-models-using-dummy-variables-essay