I had a spirited discussion with a statistician earlier today on the subject of the normality assumption for one-way ANOVA. In the end, we both agreed that normality of the response variable doesn’t usually matter much, even though it is an assumption that is routinely tested. The truth is, it takes a pretty radical departure from normality to make much of a difference in the p-value. And it has to be a particular kind of departure, namely skewness. Tails that are heavier or lighter than normal don’t make much difference as long as symmetry is maintained.
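A quick simulation makes the heavy-tails half of that claim concrete. This is a sketch, not anything from our discussion: it runs a balanced one-way ANOVA many times under a true null, with errors drawn from a t distribution with 3 degrees of freedom (much heavier tails than normal, but symmetric), and checks how often the F-test rejects at the 5% level. The group sizes and replication count are arbitrary choices of mine.

```python
# Sketch: does a symmetric heavy-tailed error distribution disturb the
# F-test's type I error rate? Setup is hypothetical: 3 groups of n=20,
# null hypothesis true, errors ~ t with 3 df.
import numpy as np
from scipy.stats import f_oneway

rng = np.random.default_rng(42)
reps, n_per_group, alpha = 2000, 20, 0.05

rejections = 0
for _ in range(reps):
    # All three groups share the same (heavy-tailed) distribution,
    # so any rejection is a false positive.
    groups = [rng.standard_t(df=3, size=n_per_group) for _ in range(3)]
    _, p = f_oneway(*groups)
    rejections += p < alpha

rate = rejections / reps
print(f"observed type I error rate: {rate:.3f} (nominal {alpha})")
```

With symmetric heavy tails the observed rejection rate stays close to the nominal 5%, which is the robustness the discussion was about.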
But I teach my students to test normality anyway. For one thing, many of my students go on to take certification exams or are enrolled in college courses. Since normality is one of the underlying assumptions of the F-test, it is technically a requirement and therefore one must teach it. However, after teaching that normality should be tested, I tell my students that departures from normality can often be handled by ignoring them, if they are not too great.
One thing I hadn’t considered was that I have been testing normality on all of the responses grouped together. My friend pointed out that if the treatment groups had different means, this approach would fail the normality test even though each within-treatment distribution might be perfectly normal. Fair enough, but we are proceeding on a null hypothesis that the treatment means are equal. Still, I’ll concede her the point, and from here forward I’ll teach that when testing the normality of responses, one should look at the treatments separately.
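Her point is easy to demonstrate. In this sketch (the group means 0, 5, and 10 are hypothetical values I picked to exaggerate the effect), each treatment group is an exactly normal sample, yet the Shapiro-Wilk test on the pooled responses rejects decisively because the pooled data is a mixture of three shifted normals.

```python
# Sketch: pooled normality testing vs. testing each treatment group.
# Each group is drawn from a genuine normal distribution, but with
# different means (0, 5, 10 -- hypothetical, chosen for illustration).
import numpy as np
from scipy.stats import shapiro

rng = np.random.default_rng(1)
groups = [rng.normal(loc=m, scale=1.0, size=30) for m in (0, 5, 10)]

# Pooled test: the mixture of three shifted normals is far from normal,
# so the p-value is essentially zero.
pooled_stat, pooled_p = shapiro(np.concatenate(groups))
print(f"pooled Shapiro-Wilk p = {pooled_p:.2e}")

# Per-group tests: each group really is a normal sample.
for i, g in enumerate(groups):
    _, p = shapiro(g)
    print(f"group {i} Shapiro-Wilk p = {p:.3f}")
```

The pooled test "fails normality" for a reason that has nothing to do with the error distribution, which is exactly why the per-treatment check is the right one.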
Of course, normality of the residuals is another matter entirely. Here we do require normality, and we treat departures from normality as a signal that our model needs to be improved. But when it comes to the normality of the responses, you can usually ignore it (in the real Six Sigma world), or just remember it until you pass that exam.
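For completeness, here is a sketch of what that residual check looks like for one-way ANOVA. The data are hypothetical; the one thing worth noticing is that in this model the fitted value for each observation is just its group mean, so the residuals are the group-centered responses — which is why the per-treatment check above and the residual check end up examining the same quantities.

```python
# Sketch: normality check on one-way ANOVA residuals.
# Hypothetical data: 3 treatment groups with different means.
import numpy as np
from scipy.stats import shapiro

rng = np.random.default_rng(7)
groups = [rng.normal(loc=m, scale=2.0, size=25) for m in (10, 12, 15)]

# In one-way ANOVA the fitted value is the group mean, so the
# residuals are simply each observation minus its group's mean.
residuals = np.concatenate([g - g.mean() for g in groups])

stat, p = shapiro(residuals)
print(f"Shapiro-Wilk on residuals: W = {stat:.3f}, p = {p:.3f}")
```

A small p-value here would be the signal, mentioned above, that the model needs work.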