Running head: DESCRIPTIVE STATISTICS
Descriptive Statistics
Variables can be classified as quantitative or qualitative. Qualitative variables are also referred to as categorical variables. Quantitative variables can be further classified as discrete or continuous. Continuous variables can take any value within an interval, including fractional (decimal) values, whereas discrete variables take only countable values, typically whole numbers. The values of qualitative variables cannot meaningfully be added, subtracted, multiplied, or divided, whereas these operations can be performed on quantitative variables. Variables can further be classified into the following categories:
Binary Variables
Binary variables take one of two possible responses, such as "yes" or "no" (Baker, 2012). In the sheet, the variables for diabetes, allergies, family hx diabetes, and family hx allergies are binary variables.
Categorical Variables
Categorical variables identify qualitatively distinct categories, such as sex, and may be assigned numerical codes such as male=0 and female=1 (Hoffmann, 2016). In the sheet, education is a categorical variable with distinct levels such as high school, college, masters, and professional.
Nominal Variables
Nominal variables are variables whose categories carry no quantitative meaning and cannot be ordered in any consistent way (Leech, Barrett & Morgan, 2013; Hoffmann, 2016). For example, in the datasheet, the variables for diabetes, allergies, family hx diabetes, and family hx allergies are nominal variables with only two levels each. For these variable types, arithmetic summaries such as the mean and median are not meaningful; only the mode (the most frequent category) can be reported.
Ordinal Variables
Ordinal variables are variables whose values are assigned an ordered rank (O'Connell, 2006). In the datasheet, the ordinal variables are feeling depressed during the winter, exercising during the summer, and overeating when stressed out. Responses are assigned numerical values from 1 to 5 that indicate the degree of the response, as in a Likert scale used to measure attitudes (Hoffmann, 2016).
Continuous Variables
Body mass index (BMI) is a continuous variable. Its values can be subdivided infinitely (Lee, Lee & Lee, 2000). The recorded values are numerical, with the highest being 30.2 and the lowest being 21.3. Some values happen to be whole numbers, for example 25.
Results Analysis
Table 1 results
| Statistic | Age | Salary | Height | Weight | BMI |
|---|---|---|---|---|---|
| Mean | 50 | $54,498 | 66.98333 | 159.1133 | 24.63867 |
| Standard Deviation | 20 | $28,923.78 | 3.750102 | 31.66517 | 2.231375 |
| Median | 50 | $50,012 | 67 | 161 | 25 |
| Mode | 18 | $15,000 | 65 | 145 | 25.4 |
| Max | 91 | $117,878 | 74 | 235 | 30.2 |
| Min | 18 | $10,123 | 60 | 110 | 21.3 |
| Range | 73 | $107,755 | 14 | 125 | 8.9 |
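The seven statistics reported in Table 1 can be computed with a few lines of code. The following is a minimal Python sketch, assuming the sample standard deviation (n - 1 denominator), since the paper does not state which formula or software was used. The `describe` helper and the `ages` list are hypothetical illustrations, not the study data.

```python
import statistics


def describe(values):
    """Return the seven summary statistics reported for each variable in Table 1."""
    return {
        "mean": statistics.mean(values),
        # Assumption: sample standard deviation (n - 1 denominator).
        "stdev": statistics.stdev(values),
        "median": statistics.median(values),
        # Most frequent value; the first one encountered wins if there is a tie.
        "mode": statistics.mode(values),
        "max": max(values),
        "min": min(values),
        "range": max(values) - min(values),
    }


# Hypothetical ages used only to exercise the function (not the study data).
ages = [18, 18, 25, 40, 50, 62, 91]
summary = describe(ages)

for name, value in summary.items():
    print(f"{name}: {value}")
```

Applying the same function to each quantitative column of the data sheet would yield one column of Table 1 per call.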
Analysis
Age
The descriptive statistics for the variable age are presented in Table 2 below.
Table 2 age
| Statistic | Age |
|---|---|
| Mean | 50 |
| Standard Deviation | 20 |
| Median | 50 |
| Mode | 18 |
| Max | 91 |
| Min | 18 |
| Range | 73 |
The mean age of the sample is 50, the standard deviation is 20, the median is 50, the mode is 18, and the range is 73. The mean, median, and mode are measures of central tendency that indicate where the data are concentrated. The mean and median are both 50, which indicates that the distribution is centered on 50. The mode of 18 shows that 18 is the most frequently occurring age in the sample. The range between the maximum value of 91 and the minimum value of 18 is 73, which is very wide and indicates that the ages in the sample were widely distributed. The standard deviation summarizes numerically how closely the sample data cluster around the mean: a large standard deviation means that the values are widely spread, while a small standard deviation means that they are closely concentrated around the mean. For the age variable, the standard deviation is 20. This is relatively large, equal to 40% of the mean, and the maximum age (91) lies about two standard deviations above the mean.
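How far the extreme ages sit from the mean can be checked directly from the Table 2 values by expressing each extreme in standard-deviation units (a z-score). The sketch below is illustrative only; the `z_score` helper is a hypothetical name, and the inputs are the reported mean, standard deviation, maximum, and minimum for age.

```python
def z_score(value, mean, sd):
    """Distance of a value from the mean, measured in standard deviations."""
    return (value - mean) / sd


# Reported values for age: mean 50, standard deviation 20, max 91, min 18.
z_max = z_score(91, 50, 20)  # (91 - 50) / 20
z_min = z_score(18, 50, 20)  # (18 - 50) / 20

print(f"max age is {z_max:.2f} SD from the mean")
print(f"min age is {z_min:.2f} SD from the mean")
```

The maximum lies roughly two standard deviations above the mean and the minimum roughly 1.6 below it.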
Salary
Table 3 salary
| Statistic | Salary |
|---|---|
| Mean | $54,498 |
| Standard Deviation | $28,923.78 |
| Median | $50,012 |
| Mode | $15,000 |
| Max | $117,878 |
| Min | $10,123 |
| Range | $107,755 |
The lowest salary is $10,123, while the highest salary is $117,878. The range is therefore very large ($107,755), nearly as large as the highest salary itself. The range indicates the economic disparity between the highest and lowest earners. This is reinforced by the mode of $15,000, which shows that the most frequently occurring salary is close to the minimum value. The mean of $54,498 and the median of $50,012 place the center of the distribution at roughly $50,000 to $54,500, far above the mode, which also reflects the disparity. The standard deviation of $28,923.78 is very large, indicating that the salaries are widely distributed.
Height
Table 4 height
| Statistic | Height |
|---|---|
| Mean | 66.98333 |
| Standard Deviation | 3.750102 |
| Median | 67 |
| Mode | 65 |
| Max | 74 |
| Min | 60 |
| Range | 14 |
The measures of central tendency for height are close to one another: the mode, mean, and median are 65, 66.98333, and 67 respectively, so they all fall between 65 and 67. The close concentration is also reflected in the range, which is only 14. Compared with age, salary, and weight, the range is very low, and the maximum and minimum values (74 and 60 respectively) are close together. Moreover, the standard deviation is very low (3.750102), which further supports the conclusion that the values are compact.
Weight
Table 5 weight
| Statistic | Weight |
|---|---|
| Mean | 159.1133 |
| Standard Deviation | 31.66517 |
| Median | 161 |
| Mode | 145 |
| Max | 235 |
| Min | 110 |
| Range | 125 |
The mean, median, and mode for weight are 159.1133, 161, and 145 respectively. These values indicate that the weights are concentrated between about 145 and 161. However, there are extremes as high as 235 and as low as 110. The range between the minimum and maximum values for weight is correspondingly high (125). The standard deviation (31.66517) is small relative to the mean (about 20% of it), indicating that most values lie fairly close to the mean.
BMI
Table 6 BMI
| Statistic | BMI |
|---|---|
| Mean | 24.63867 |
| Standard Deviation | 2.231375 |
| Median | 25 |
| Mode | 25.4 |
| Max | 30.2 |
| Min | 21.3 |
| Range | 8.9 |
The standard deviation for BMI is very low (2.231375), which indicates that the values are highly concentrated around the mean (24.63867). The measures of central tendency (mean, mode, and median) are 24.63867, 25.4, and 25 respectively, all close to 25, which emphasizes the uniformity of the data. The range between the minimum and maximum values (21.3 and 30.2) is also low (8.9), which is further evidence of the uniform nature of the BMI data.
Conclusion
The data sheet presents both qualitative and quantitative data. Age, salary, height, weight, and BMI are quantitative variables. The standard deviations for age (20) and salary ($28,923.78) are relatively large, which indicates that the values are widely distributed. The standard deviations for height (3.750102) and BMI (2.231375) are low, which indicates that the values for these variables are compact. These findings are further supported by the maximum, minimum, and range values for the respective variables.
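Because the five quantitative variables are measured in different units, one way to compare their spread on a common scale is the coefficient of variation (standard deviation divided by mean). This measure is not used in the analysis above; the sketch below is offered only as an illustration, using the means and standard deviations reported in Table 1. The dictionary name `table1` is a hypothetical label.

```python
# (mean, standard deviation) pairs as reported in Table 1.
table1 = {
    "age": (50, 20),
    "salary": (54498, 28923.78),
    "height": (66.98333, 3.750102),
    "weight": (159.1133, 31.66517),
    "bmi": (24.63867, 2.231375),
}

# Coefficient of variation: spread expressed as a fraction of the mean.
cv = {name: sd / mean for name, (mean, sd) in table1.items()}

# Variables ordered from most to least dispersed relative to their mean.
ranked = sorted(cv, key=cv.get, reverse=True)

for name in ranked:
    print(f"{name}: {cv[name]:.3f}")
```

On this measure, salary and age are the most dispersed and height and BMI the most compact, which agrees with the conclusions drawn from the standard deviations and ranges.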
On the other hand, the remaining variables are qualitative: education, diabetes, allergies, family hx diabetes, family hx allergies, feeling depressed during the winter, exercising during the summer, and overeating when stressed out. The binary variables among them are represented by a "yes" or a "no." The data set therefore covers both quantitative and qualitative results, and a complete analysis of the data can generate useful information for a more constructive conclusion.
References
Baker, K. R. (2012). Optimization modeling with spreadsheets. John Wiley & Sons.
Hoffmann, J. P. (2016). Regression models for categorical, count, and related variables: An applied approach. University of California Press.
Lee, C. F., Lee, J. C., & Lee, A. C. (2000). Statistics for business and financial economics (Vol. 1, p. 712). Singapore: World Scientific.
Leech, N., Barrett, K., & Morgan, G. A. (2013). SPSS for intermediate statistics: Use and interpretation. Routledge.
O'Connell, A. A. (2006). Logistic regression models for ordinal response variables (No. 146). Sage.