Normality Tests

Finally! Data collection is done. Now, data analysis, correct? Yes, but do not be so eager to run that ANOVA or t-test just yet. It is wise to first inspect the data for violations of normality. In a way, we are “assessing whether a random sample of independent observations of size n come from a population with a normal N distribution” (Razali & Wah, 2011, p. 21). Many statistical procedures, including t-tests, linear regression analysis, discriminant analysis, and Analysis of Variance (ANOVA), assume that the data have a normal distribution. As stated by Razali and Wah (2011), “when the normality assumption is violated, interpretation and inference may not be reliable or valid” (p. 21). Ghasemi and Zahediasl (2012) go even further, stating that “statistical errors are common in scientific literature and about 50% of the published articles have at least one error. The assumption of normality needs to be checked for many statistical procedures, namely parametric tests, because their validity depends on it” (p. 486). There are many procedures to assess normality, including graphical methods (histograms, boxplots, Q-Q plots), numerical methods (skewness and kurtosis indices), and formal normality tests.

We will begin by running the “Normality Test” available as part of SPSS. The feature is available (somewhat hidden) in the “Explore” dialog box. If requested, the procedure will generate a frequency distribution (histogram), a stem-and-leaf plot, normal and detrended normal Q-Q plots, and a boxplot. In addition, a table containing the descriptive statistics for the variable(s) is generated. One should exercise caution when relying on the graphical methods alone, as a visual inspection does not guarantee that the distribution is normal (Ghasemi & Zahediasl, 2012).

As part of the descriptive table, we will find the values of skewness and kurtosis, which will aid in testing for normality. Skewness and kurtosis are known as numerical approaches to testing for normality. Finally, SPSS will generate a table (the Tests of Normality table) which contains the results of two popular formal normality tests: the Kolmogorov-Smirnov test and the Shapiro-Wilk test.

In the following paragraphs, we will describe some of these methods. Let’s begin by describing how to run the normality test in SPSS.

If you want to follow along, simply open “main1.sav” in SPSS. Select cases so that “sex = 1”. This will limit the analysis to “boys” only. Refer to Chapter 2 (Section 1) to learn how to select cases in SPSS. Next, click Analyze > Descriptive Statistics > Explore.


Next, click “Plots”. This opens the Explore: Plots dialog box. Select the options as they appear in the figure below and click “Continue”. Finally, click OK to run the analysis.


Histograms (Frequency Distributions)


One of the graphs generated by SPSS is the histogram. Histograms can aid in testing for normality. With this method, observed scores (horizontal axis) are plotted against their frequency (vertical axis). One can learn about the shape of the distribution, potential gaps in the data, and outliers (Ghasemi & Zahediasl, 2012). The graph below depicts a histogram for height values among 5-8 boys. Histograms can also be created elsewhere in SPSS. Please refer to Chapter 3 (Section 3) to learn how to create histograms using the “Chart Builder” procedure. The histogram below seems to indicate that the data approximate normality, as the distribution resembles a bell-shaped curve. Still, we need to rely on other graphs and on mathematical procedures to confirm our suspicion.


Stem-and-Leaf Plot

The stem-and-leaf plot is similar to the histogram. The difference is that it shows the actual data values, which can sometimes be helpful. Note that the plot has three columns (Frequency, Stem, and Leaf). At the bottom, we find the Stem Width. In our case, the value is 10. This means that the numbers in the “Stem” column are in units of ten, while the leaves are in the ones place. Thus, if we refer to the first number under the Stem column, we learn that there are two observations under Stem 10 (108 and 109). Similarly, there is one observation under Stem 13 (133). Notice that if one draws a line around the ends of the rows in the Leaf column and rotates the plot 90 degrees counterclockwise, the result is a figure similar to the histogram above. Both the histogram and the stem-and-leaf plot can be used to make inferences about normality; however, the former is easier to interpret, as it provides a normal curve that can be used as a reference.
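To make the stem and leaf columns concrete, here is a minimal Python sketch that builds the same kind of table by hand. It is not how SPSS produces its plot, and the function name `stem_and_leaf` is ours. The heights are hypothetical; only 108, 109, and 133 are taken from the text above.

```python
from collections import defaultdict

def stem_and_leaf(values, stem_width=10):
    """Return (frequency, stem, leaves) rows for whole-number values."""
    groups = defaultdict(list)
    for v in sorted(values):
        # With a stem width of 10, 108 splits into stem 10 and leaf 8.
        stem, leaf = divmod(int(v), stem_width)
        groups[stem].append(leaf)
    rows = []
    for stem in range(min(groups), max(groups) + 1):
        leaves = groups.get(stem, [])  # an empty stem still gets a row
        rows.append((len(leaves), stem, "".join(str(leaf) for leaf in leaves)))
    return rows

# Hypothetical heights in cm; only 108, 109 and 133 appear in the text.
heights = [108, 109, 112, 115, 117, 118, 120, 121, 121, 123, 124, 126, 127, 129, 133]

print("Frequency  Stem . Leaf")
for freq, stem, leaves in stem_and_leaf(heights):
    print(f"{freq:>9}  {stem:>4} . {leaves}")
print("Stem width: 10")
```

Reading the rows from top to bottom gives the same picture as a histogram turned on its side: the longest rows of leaves sit in the middle stems.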

Q-Q Plots

SPSS generates two Q-Q plots, the Normal Q-Q Plot and the Detrended Normal Q-Q Plot. We will focus on the former since it is commonly used and the easiest to interpret. The figure below shows the Q-Q Plot for height among boys, the same data used to generate the histogram on the previous page. Note that the circles are fairly close to the straight line, which suggests that the data are approximately normally distributed. It is true that some of the circles depart from the line, but this is not enough reason to disqualify these data from being normal.

Now let’s take a look at the second Q-Q Plot, the Normal Q-Q Plot of LS_1 (borrowed from David Brown’s website; see the reference at the end of this section). Unlike the first plot, the circles in this plot are not close to the line and form an “S” shape. This indicates some degree of skewness, which in turn prevents us from concluding that these data came from a normal distribution.
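For readers who want to see what sits behind a Q-Q plot, the sketch below pairs each sorted observation with the value expected under a normal distribution, using only the Python standard library. The plotting positions (Blom's formula), the function names, and both data sets are our assumptions for illustration; they are not taken from the SPSS output. The correlation between observed and expected values is used here only as a rough summary of how closely the points follow a straight line.

```python
from statistics import NormalDist, mean, stdev

def normal_qq_points(values):
    """Pair each sorted observation with its expected normal value."""
    xs = sorted(values)
    n = len(xs)
    fitted = NormalDist(mean(xs), stdev(xs))
    # Blom's plotting positions, (i - 3/8) / (n + 1/4), are an assumption here.
    expected = [fitted.inv_cdf((i - 0.375) / (n + 0.25)) for i in range(1, n + 1)]
    return list(zip(xs, expected))

def qq_correlation(points):
    """Pearson correlation between observed and expected values."""
    obs = [p[0] for p in points]
    exp = [p[1] for p in points]
    mo, me = mean(obs), mean(exp)
    num = sum((o - mo) * (e - me) for o, e in zip(obs, exp))
    den = (sum((o - mo) ** 2 for o in obs) * sum((e - me) ** 2 for e in exp)) ** 0.5
    return num / den

# Hypothetical data: roughly bell-shaped heights and a strongly skewed variable.
heights = [108, 109, 112, 115, 117, 118, 120, 121, 121, 123, 124, 126, 127, 129, 133]
skewed = [1, 1, 1, 2, 2, 2, 3, 3, 4, 5, 7, 10, 15, 25, 60]

r_heights = qq_correlation(normal_qq_points(heights))
r_skewed = qq_correlation(normal_qq_points(skewed))
print(f"heights: r = {r_heights:.3f}")
print(f"skewed:  r = {r_skewed:.3f}")
```

Points that hug the line produce a correlation close to 1, while a curved pattern pulls the correlation down. This is only a numerical companion to the plot, not a replacement for looking at it.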



Boxplot

The boxplot is another way to make inferences about normality. The horizontal line inside the box represents the median, while the length of the box represents the interquartile range. The whiskers (the lines extending from the top and bottom of the box) represent the minimum and maximum values when they are within 1.5 times the interquartile range from either end of the box (Ghasemi & Zahediasl, 2012, p. 487). A score is considered an outlier, and marked with a circle, when it lies more than 1.5 times the interquartile range beyond either end of the box. Scores that lie more than 3 times the interquartile range beyond the box are considered “extreme” scores and are marked with an asterisk. But how does one make inferences about normality from a boxplot? According to Ghasemi and Zahediasl (2012), “a boxplot that is symmetric with the median line at approximately the center of the box and with symmetric whiskers that are slightly longer than the subsections of the center box suggests that the data may have come from a normal distribution” (p. 487).
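The 1.5 and 3 times rules described above can be sketched in a few lines of Python. The data are hypothetical, the function name `classify_scores` is ours, and the quartile method is an assumption: SPSS may place the ends of the box slightly differently, so borderline scores could be labeled differently there.

```python
from statistics import quantiles

def classify_scores(values):
    """Label each score as 'inside', 'outlier', or 'extreme'."""
    # The quartile method is an assumption; SPSS may compute the box edges differently.
    q1, _median, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    labels = []
    for v in sorted(values):
        # Distance beyond the nearest end of the box (0 if the score is inside the box).
        distance = max(q1 - v, v - q3, 0)
        if distance > 3 * iqr:
            labels.append((v, "extreme"))
        elif distance > 1.5 * iqr:
            labels.append((v, "outlier"))
        else:
            labels.append((v, "inside"))
    return labels

# Hypothetical heights in cm, with two unusually tall boys added for illustration.
heights = [110, 112, 115, 117, 118, 120, 121, 123, 124, 126, 140, 165]

for value, label in classify_scores(heights):
    if label != "inside":
        print(value, label)
```

With these made-up values the box runs from 116.5 to 124.5, so the interquartile range is 8. A height of 140 is 15.5 above the box (more than 12, less than 24) and would get a circle, while 165 is 40.5 above the box and would get an asterisk.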

Skewness and Kurtosis

The methods for assessing normality discussed up to this point are graphical in nature, which means they require a great deal of judgment from the researcher. Often, numerical methods are employed in conjunction with the graphical methods. Calculating the skewness and kurtosis is one such numerical method.

The values of skewness and kurtosis are readily available from the Descriptives table, which is part of the “Normality Test” output. Because the raw values of skewness and kurtosis are difficult to interpret, it is common practice to convert these values into z scores (Zskew and Zkurt). This can be done by dividing the values of skewness and kurtosis by their respective Std. Error values. In our case, Zskew = .45 (.224/.501) and Zkurt = -.64 (-.620/.972).

To be considered within acceptable limits, the values of Zskew and Zkurt should not exceed ±2.0. This further supports our suspicion that the “height” data come from a normal distribution, since the values of Zskew (.45) and Zkurt (-.64) fall within the acceptable range.
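The arithmetic above is simple enough to check by hand, but a short Python sketch makes the rule explicit. The skewness, kurtosis, and standard error values are the ones reported in the text; the function names are ours.

```python
def z_score(statistic, std_error):
    """Divide a skewness or kurtosis value by its standard error."""
    return statistic / std_error

def within_limits(z, limit=2.0):
    """True when the z score does not exceed the +/- limit."""
    return abs(z) <= limit

# Values from the SPSS Descriptives table discussed in the text.
z_skew = z_score(0.224, 0.501)
z_kurt = z_score(-0.620, 0.972)

print(f"Zskew = {z_skew:.2f}, within limits: {within_limits(z_skew)}")
print(f"Zkurt = {z_kurt:.2f}, within limits: {within_limits(z_kurt)}")
```

Both values fall well inside the ±2.0 range, which matches the conclusion reached above.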

Formal Normality Tests

We will conclude our discussion by addressing the Shapiro-Wilk test. Along with a number of other tests (e.g., the Kolmogorov-Smirnov (KS) test, the Lilliefors (LF) test, and the Anderson-Darling (AD) test), the Shapiro-Wilk test can add to the discussion of normality. These are known as formal normality tests. We will limit our discussion to the Shapiro-Wilk test, as it is considered the most powerful normality test (see Razali & Wah, 2011, for a full explanation).

The table below indicates that the distribution of scores for “height” is not significantly different, statistically speaking, from a normal distribution. The calculated p-value (.627) is greater than .05. This supports the evidence gathered with the previous two methods (graphical and numerical). In other words, at this point we do not have evidence to disqualify this sample distribution as normal. The graphical, numerical, and formal normality methods all point in the same direction: the sample of height scores seems to come from a normal distribution.
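Outside SPSS, the same kind of test can be run in Python with `scipy.stats.shapiro`, which returns the W statistic and a p-value. The sketch below assumes SciPy is installed and uses two synthetic samples of 21 values each (an idealized bell-shaped sample and a strongly skewed one). Neither sample is the “height” data from main1.sav, so the numbers will not match the SPSS table.

```python
from statistics import NormalDist
from scipy import stats

n = 21  # sample size chosen for illustration only

# Synthetic, idealized bell-shaped sample: evenly spaced normal quantiles.
bell = NormalDist(mu=120, sigma=7)
bell_shaped = [bell.inv_cdf((i - 0.375) / (n + 0.25)) for i in range(1, n + 1)]

# Synthetic, strongly right-skewed sample.
skewed = [1.5 ** i for i in range(n)]

results = {}
for label, sample in (("bell-shaped", bell_shaped), ("skewed", skewed)):
    w, p = stats.shapiro(sample)
    results[label] = (w, p)
    verdict = "no evidence against normality" if p > 0.05 else "departs from normality"
    print(f"{label}: W = {w:.3f}, p = {p:.3f} ({verdict})")
```

As in the SPSS output, a p-value greater than .05 means we do not have evidence that the sample departs from a normal distribution; it does not prove that the population is normal.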

When does normality become an issue?

Testing for normality becomes less important as the sample size increases. As stated by Ghasemi and Zahediasl (2012), “with large sample sizes (> 30 or 40), the violation of the normality assumption should not cause major problems. (…) in large sample sizes, the sampling distribution tends to be normal, regardless of the shape of the data” (p. 486).
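The idea in the quote above can be illustrated with a small simulation. The sketch below draws repeated samples from a strongly skewed (exponential) population and measures how skewed the resulting sample means are. The population, the number of draws, the seed, and the function names are all our assumptions; this is an illustration of the principle, not part of the SPSS analysis.

```python
import random

def skewness(values):
    """Population skewness: the mean cubed z score."""
    n = len(values)
    m = sum(values) / n
    sd = (sum((v - m) ** 2 for v in values) / n) ** 0.5
    return sum(((v - m) / sd) ** 3 for v in values) / n

def skewness_of_sample_means(sample_size, draws=5000, seed=1):
    """Skewness of the means of many samples from a skewed population."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    means = []
    for _ in range(draws):
        sample = [rng.expovariate(1.0) for _ in range(sample_size)]
        means.append(sum(sample) / sample_size)
    return skewness(means)

for size in (2, 10, 40):
    print(f"n = {size:>2}: skewness of sample means = {skewness_of_sample_means(size):.2f}")
```

The skewness of the sample means shrinks toward zero as the sample size grows, even though the individual scores remain heavily skewed. This is why violations of normality tend to matter less for procedures based on means when samples are large.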

Works Cited

Ghasemi, A., & Zahediasl, S. (2012). Normality tests for statistical analysis: A guide for non-statisticians. International Journal of Endocrinology and Metabolism, 10(2), 486–489. doi:10.5812/ijem.3505

Razali, N. M., & Wah, Y. B. (2011). Power comparisons of Shapiro-Wilk, Kolmogorov-Smirnov, Lilliefors and Anderson-Darling tests. Journal of Statistical Modeling and Analytics, 2(1), 21–33.

David E. Brown – BYU–Idaho. (n.d.). Retrieved September 7, 2014, from