Descriptive Statistics |
|
Mean | 1,059,381.3145 |
Standard deviation | 280,423.4483 |
Skewness | 0.3614 |
Minimum | 499,968.0000 |
First quartile, Q1 | 877,477.5750 |
Median | 1,035,749.2050 |
Third quartile, Q3 | 1,228,866.9600 |
Maximum | 1,746,600.0000 |
Interquartile range, IQR (Q3 - Q1) | 351,389.3850 |
The mean and median are the most common measures of central tendency. The median is the middle value when the data values are arranged in order of size, while the mean is the average of all the data values. Consequently, the mean is strongly influenced by outliers, which makes it less accurate when estimating the center of skewed data (Camm et al., 2018).
The standard deviation measures how far data values typically deviate from the mean. The standard deviation for annual sales is 280,423.4483, roughly 26% of the mean, which shows that the data values are fairly spread out around the mean. The interquartile range also estimates spread, but it does so by calculating the distance between the first and third quartiles, that is, the range covered by the middle 50% of the data (Camm et al., 2018). The interquartile range for annual sales is 351,389.3850, implying that the central half of the data values is relatively spread out. Although the interquartile range is larger than the standard deviation, the two statistics measure spread on different scales, so this difference alone does not show that the data are more spread around the median than around the mean.
Skewness evaluates whether data is symmetrically distributed. A skewness of zero indicates a symmetric distribution, such as the normal distribution, while a skewness greater than zero suggests that the data is positively skewed, or skewed to the right (Camm et al., 2018). Similarly, a skewness of less than zero means that the data is negatively skewed, or skewed to the left. Since the skewness is 0.3614, the annual sales data is slightly skewed to the right.
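The measures discussed above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the sales figures are hypothetical (the original data set is not reproduced here), the function names `sample_skewness` and `describe` are invented for this example, the quartiles use the "inclusive" convention of Python's `statistics.quantiles`, and the skewness uses the adjusted sample formula found in common spreadsheet SKEW functions. Whether the reported figures were produced with these conventions is not known.

```python
import statistics


def sample_skewness(values):
    """Adjusted Fisher-Pearson sample skewness (assumed formula, see lead-in)."""
    n = len(values)
    if n < 3:
        raise ValueError("skewness needs at least three values")
    mean = statistics.mean(values)
    s = statistics.stdev(values)  # sample standard deviation (n - 1 denominator)
    return n / ((n - 1) * (n - 2)) * sum(((x - mean) / s) ** 3 for x in values)


def describe(values):
    """Return the summary measures discussed above for a list of numbers."""
    # "inclusive" is only one of several quartile conventions, so the
    # quartiles can differ slightly from those produced by other tools.
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),
        "q1": q1,
        "q3": q3,
        "iqr": q3 - q1,
        "skewness": sample_skewness(values),
    }


# Hypothetical annual sales figures, NOT the data set analyzed in this report.
hypothetical_sales = [
    500_000, 800_000, 900_000, 1_000_000, 1_100_000, 1_300_000, 1_750_000,
]
summary = describe(hypothetical_sales)
for name, value in summary.items():
    print(f"{name}: {value:,.4f}")
```

In this hypothetical sample the single large value pulls the mean above the median and yields a positive skewness, which is the same pattern the annual sales data shows.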
[Figure: Box plot of annual sales showing the minimum, first quartile, median, third quartile, and maximum]
The box plot is a visual representation of the five-number summary. The median divides the data into two halves of approximately equal size. However, the distance between the median and the maximum value is longer than the distance between the minimum value and the median. The longer span means that the values above the median are more spread out than the values below it, meaning that the data is slightly skewed to the right (Zikmund et al., 2013). Since there are no data values located far away from the rest of the data, outliers are absent.
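The comparison of the two sides of the box plot can be checked directly from the reported five-number summary. The sketch below uses only the values from the descriptive statistics table; the variable names are chosen for this illustration.

```python
# Five-number summary as reported in the descriptive statistics table.
minimum = 499_968.00
q1 = 877_477.575
median = 1_035_749.205
q3 = 1_228_866.96
maximum = 1_746_600.00

# Distance from the median to each end of the plot.
lower_span = median - minimum
upper_span = maximum - median

# Distance from the median to each edge of the box.
lower_box = median - q1
upper_box = q3 - median

print(f"median to minimum: {lower_span:,.4f}")
print(f"median to maximum: {upper_span:,.4f}")
print(f"median to Q1:      {lower_box:,.4f}")
print(f"median to Q3:      {upper_box:,.4f}")
print("upper side longer:", upper_span > lower_span and upper_box > lower_box)
```

Both the whisker-to-whisker span and the box itself are longer above the median than below it, which is consistent with a slight right skew.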
The histogram consists of bars that represent the frequency of each class and assumes a bell-shaped curve when the data is normally distributed. In this case, the histogram assumes an approximately bell-shaped curve, but it extends slightly further to the right, meaning that the data is slightly skewed to the right (Zikmund et al., 2013). There are no data values located unreasonably far from the rest, meaning that there are no outliers.
Outliers are values that lie unreasonably far from the other values in a sample. They lie below the lower fence or above the upper fence. The fences in a box plot represent the cut-offs within which reasonable data values are contained (Engineering Statistics Handbook, n.d.). An outlier is present on the lower side if the minimum value is less than the lower fence, while an outlier on the upper side occurs if the maximum value is greater than the upper fence. According to the Engineering Statistics Handbook, the lower and upper fences are calculated using the formulas:
Lower fence = Q1 - (1.5 * IQR)
Upper fence = Q3 + (1.5 * IQR)
Lower fence = 877,477.5750 - (1.5 * 351,389.3850) = 350,393.4975
Upper fence = 1,228,866.9600 + (1.5 * 351,389.3850) = 1,755,951.038
The lower fence, 350,393.4975, is less than the minimum, 499,968.00, meaning that there are no outliers on the lower side. Likewise, the upper fence, 1,755,951.038, is greater than the maximum, 1,746,600.00, so there are no outliers on the upper side.
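The fence calculation and the outlier check can be reproduced with a short Python sketch. The inputs are the quartiles, minimum, and maximum reported in the table; the function names `tukey_fences` and `outlier_sides` are hypothetical and introduced only for this illustration.

```python
def tukey_fences(q1, q3, k=1.5):
    """Return (lower fence, upper fence) as Q1 - k*IQR and Q3 + k*IQR."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def outlier_sides(minimum, maximum, q1, q3, k=1.5):
    """Return (has_low_outlier, has_high_outlier) based on the extremes."""
    lower, upper = tukey_fences(q1, q3, k)
    # If even the extreme values lie inside the fences, no value lies outside.
    return minimum < lower, maximum > upper


# Values reported in the descriptive statistics table.
q1, q3 = 877_477.575, 1_228_866.96
minimum, maximum = 499_968.00, 1_746_600.00

lower_fence, upper_fence = tukey_fences(q1, q3)
low_outlier, high_outlier = outlier_sides(minimum, maximum, q1, q3)

print(f"lower fence: {lower_fence:,.4f}")
print(f"upper fence: {upper_fence:,.4f}")
print("outlier below lower fence:", low_outlier)
print("outlier above upper fence:", high_outlier)
```

Checking only the minimum and maximum is sufficient here because every other data value lies between them.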
Since the data is slightly skewed to the right, the median is a better estimate of the center, while the interquartile range is a better estimate of the spread. The mean involves averaging and is pulled upward by the larger values on the right side (Zikmund et al., 2013). Consequently, the mean (1,059,381.3145) is higher than the median (1,035,749.2050) and is, therefore, a less accurate estimate of the center.
References
Camm, J. D., Cochran, J. J., Fry, M. J., Ohlmann, J. W., & Anderson, D. R. (2018). Essentials of business analytics. Cengage Learning.
Engineering Statistics Handbook. (n.d.). 7.1.6. What are outliers in the data? https://www.itl.nist.gov/div898/handbook/prc/section1/prc16.htm
Zikmund, W. G., Carr, J. C., & Griffin, M. (2013). Business research methods. Cengage Learning.