Normality of Data and Z-Scores

In statistics, normality tests are used to determine whether a data set is well modeled by a normal distribution and to estimate how likely it is that the underlying random variable is normally distributed. In other words, a normality test assesses whether the sample under investigation could have been drawn from a normally distributed population. Ghasemi & Zahediasl (2012) similarly describe the assessment of normality as the researcher's means of ascertaining whether a sample is drawn from a non-normal distribution. Normality of data is therefore an essential concept for a statistician. Although there are several tests for normality, they all serve the same purpose of detecting deviation from normality. Many common statistical tests depend on the assumption of normality, that is, the assumption that the sample data follow a normal (Gaussian) distribution or are drawn from a normal population.

Furthermore, according to Lumley et al. (2002), normality of data is fundamental because it is pivotal in making inferences. If the population from which the sample was drawn is non-normal, the assumption is violated, and the results of the test may be biased or misleading. Non-normality, in other words, can render such statistical tests inaccurate. This underlines the importance of knowing whether the data are normal or non-normal before choosing a test. Tests that depend on the assumption of normality are referred to as parametric tests.
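
A normality check of the kind described above can be sketched in a few lines of Python. The example below uses the Jarque-Bera statistic, chosen here only because it can be written with the standard library; it is an illustrative assumption, not the procedure used by the cited authors. The function name `jarque_bera`, the simulated data, and the 0.05 threshold are all hypothetical choices.

```python
import math
import random


def jarque_bera(sample):
    """Return (JB statistic, approximate p-value) for a list of numbers."""
    n = len(sample)
    mean = sum(sample) / n
    # Central moments, dividing by n (population form).
    m2 = sum((x - mean) ** 2 for x in sample) / n
    m3 = sum((x - mean) ** 3 for x in sample) / n
    m4 = sum((x - mean) ** 4 for x in sample) / n
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2  # equals 3 for a normal distribution
    jb = n / 6 * (skewness ** 2 + (kurtosis - 3) ** 2 / 4)
    # Under normality, JB is asymptotically chi-square with 2 degrees of
    # freedom, whose survival function is exp(-x / 2).
    p_value = math.exp(-jb / 2)
    return jb, p_value


rng = random.Random(0)  # fixed seed; simulated, illustrative data only
normal_like = [rng.gauss(50, 10) for _ in range(500)]
skewed = [rng.expovariate(1.0) for _ in range(500)]

for name, data in [("normal-like", normal_like), ("skewed", skewed)]:
    jb, p = jarque_bera(data)
    verdict = "normality rejected" if p < 0.05 else "no evidence against normality"
    print(f"{name}: JB = {jb:.2f}, p = {p:.4g} -> {verdict}")
```

A small p-value suggests the sample departs from normality; a large one only means the test found no evidence against it. The chi-square approximation is asymptotic, so the sketch is best suited to reasonably large samples.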

On the other hand, if the sample data are non-normal, non-parametric tests (tests that do not rely on the assumption of normality) should be used. Because non-parametric tests typically work with ranks rather than the raw values, they generally have less statistical power than parametric tests when the data are in fact normal. However, if the sample is large enough, violating the normality assumption tends to cause few problems, because the central limit theorem implies that the sampling distribution of the mean is approximately normal regardless of the shape of the data (Ghasemi & Zahediasl, 2012).
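
The central limit theorem argument can be illustrated with a small simulation. The sketch below is a hypothetical setup, not taken from the cited sources: it draws from a right-skewed exponential population and compares the skewness of the raw observations with the skewness of many sample means. The sample size of 50 and the helper name `skewness` are assumptions made for illustration.

```python
import random
import statistics


def skewness(values):
    """Third central moment divided by the cubed standard deviation."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum((v - mean) ** 3 for v in values) / (len(values) * sd ** 3)


rng = random.Random(42)  # fixed seed so the run is repeatable
SAMPLE_SIZE = 50         # hypothetical "large enough" sample size
N_SAMPLES = 2000         # number of repeated samples

# Raw observations from a clearly non-normal (right-skewed) population.
raw = [rng.expovariate(1.0) for _ in range(N_SAMPLES)]

# Means of many independent samples drawn from the same population.
sample_means = [
    statistics.fmean([rng.expovariate(1.0) for _ in range(SAMPLE_SIZE)])
    for _ in range(N_SAMPLES)
]

print(f"skewness of raw data:     {skewness(raw):.2f}")
print(f"skewness of sample means: {skewness(sample_means):.2f}")
```

The raw data stay strongly skewed, while the sample means are much closer to symmetric, which is why tests on means tolerate non-normal data when samples are large.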

Z-Scores

The z-score, also referred to as the standard score, is the number of standard deviations by which a data point lies above or below the mean. DeVries (2007) elaborates that a z-score measures how many standard deviations below or above the population mean a raw score is. According to Calculating Z-scores (n.d.), z-scores range from -3 standard deviations at the far left of the normal distribution curve to +3 standard deviations at the far right of the curve; values beyond this range are possible but rare, since about 99.7% of a normal distribution lies within three standard deviations of the mean. Z-scores are useful because they make it possible to compare a result from a test with a normal population. They can also support an assessment of the normality of data, for example by flagging unusually extreme values. Z-scores further aid researchers and statisticians in calculating the probability of a score occurring within the normal distribution. In addition, z-scores enable analysts to compare two scores from different normal distributions by standardizing them.
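
As a minimal sketch of comparing two scores from different distributions, the example below standardizes each score by subtracting its population mean and dividing by its population standard deviation. The exam names, means, and standard deviations are invented purely for illustration.

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / sd


# Hypothetical results for one student on two differently scaled exams.
math_z = z_score(82, mean=70, sd=8)      # (82 - 70) / 8
reading_z = z_score(88, mean=80, sd=10)  # (88 - 80) / 10

print(f"math z-score:    {math_z:.2f}")
print(f"reading z-score: {reading_z:.2f}")

if math_z > reading_z:
    print("Relative to each population, the math score is the stronger result.")
else:
    print("Relative to each population, the reading score is at least as strong.")
```

Although the raw reading score is higher, the math score lies further above its own mean in standard-deviation units, which is the comparison that standardizing makes possible.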

Mathematically, the z-score expresses how many standard deviations a score lies from the population mean. The z-score therefore provides information not only about the score itself but also about its position relative to the normal distribution. The z-score also supports statistical inference, especially for quantitative variables. The z-score formula is as follows:

$$z = \frac{x - \mu}{\sigma}$$

where $x$ is the raw score, $\mu$ is the population mean, and $\sigma$ is the population standard deviation.

Therefore, if a specific attribute under investigation is three standard deviations above the mean (a z-score of 3), its distance above the mean is three times the standard deviation, and it probably represents one of the highest scores in the sample. In contrast, if the z-score is -2, the attribute lies two standard deviations below the mean and represents one of the lower values in the sample.
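
To make this interpretation concrete, the sketch below uses `NormalDist` from Python's standard `statistics` module to look up the share of a normal population that falls below a given z-score. It assumes the scores are exactly normally distributed, which real data only approximate.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
std_normal = NormalDist(mu=0.0, sigma=1.0)

below_plus3 = std_normal.cdf(3)    # share of scores below z = +3
below_minus2 = std_normal.cdf(-2)  # share of scores below z = -2
within_3 = std_normal.cdf(3) - std_normal.cdf(-3)

print(f"Share of scores below z = +3: {below_plus3:.4f}")   # about 0.9987
print(f"Share of scores below z = -2: {below_minus2:.4f}")  # about 0.0228
print(f"Share within 3 SDs of mean:   {within_3:.4f}")      # about 0.9973
```

Under this assumption, a score at z = 3 exceeds nearly all other scores, while a score at z = -2 sits in roughly the lowest 2% of the distribution.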

In essence, while normality of data and z-scores are distinct concepts, both serve the fundamental purpose of helping to ensure that the results of a test are accurate. As such, both concepts are important and should be considered when conducting statistical analyses.

References

Calculating Z-scores. (n.d.). Retrieved from https://www.wsfcs.k12.nc.us/cms/lib/NC01001395/Centricity/Domain/3165/calculating_z_scores.pdf

DeVries, J. (2007). About z-scores. The University of Guelph. Retrieved from https://atrium.lib.uoguelph.ca/xmlui/bitstream/handle/10214/1842/A_About_Z-Scores.pdf?sequence=7

Ghasemi, A., & Zahediasl, S. (2012). Normality tests for statistical analysis: A guide for non-statisticians. International Journal of Endocrinology and Metabolism, 10(2), 486-489. DOI: 10.5812/ijem.3505

Lumley, T., Diehr, P., Emerson, S., & Chen, L. (2002). The importance of the normality assumption in large public health data sets. Annual Review of Public Health, 23(1), 151-169. DOI: 10.1146/annurev.publhealth.23.100901.140546
