Article Revised: April 26, 2019
The Laney p’ Control Chart is an exciting innovation in statistical process control (SPC). The classic control charts for attributes data (p-charts, u-charts, etc.) are based on assumptions about the underlying distribution of their data (binomial or Poisson). Inherent in those assumptions is the further assumption that the “parameter” (mean) of the distribution is constant over time. In real applications, this is not always true (some days it rains and some days it does not). This is especially noticeable when the subgroup sizes are very large. Until now, the solution has been to treat the observations as variables in an individuals chart. Unfortunately, this produces flat control limits even if the subgroup sizes vary. David B. Laney developed an innovative approach to this situation, which has come to be known as the Laney p’ chart (p-prime chart). It is a universal technique that is applicable whether the parameter is stable or not.
About Your Presenter, David B. Laney
David B. Laney worked for 33 years at BellSouth as Director of Statistical Methodology. He was a pioneer at BellSouth in TQM, DOE, and Six Sigma. David’s p-prime chart is an innovation that is being used in a wide variety of areas. It is now included in many statistical applications, such as Minitab and SigmaXL. David is enjoying retirement with his family in the Birmingham, Alabama area.
“So now let me introduce David Laney. David is our presenter today. David and I go way back; we’ve been friends for 25 years plus, back in the days when Dr. Deming was still roaming the earth and helping us all understand continuous improvement. David is a former statistician with BellSouth Corporation, where he worked for 33 years. He invented a new way of analyzing attribute data, which is now known as the Laney control chart. At this point I’m going to turn over the presentation to David and let him tell you all about his new control chart.
Okay, thanks Tom. Can you hear me? Sure can. Okay, first, thank you for the opportunity to participate in this. I’m real excited about this still. It was the last thing I did before retiring; in fact, the month I retired from BellSouth the paper came out in Quality Engineering. So, kind of a parting shot.
Before getting into the presentation of what this is and how it works, I thought I’d just give a quick… by the way, what you’re seeing on the screen is my previous mode of transportation and also how I paid for it. The last few years of my working career I never took any real vacations. They were assignments teaching other companies how to do this. What else would a statistician do on vacation but teach statistics, right? There you go. Anyway, in about 1990, I guess, we started in BellSouth into TQM in a big way. I was trying to teach everybody how to do control charts: here’s what you do if you’re looking at percentage errors, and here’s what you should do if you’re looking at time to repair, whatever. An application came in that sort of baffled us. It was looking at monthly data of emergency 9-1-1 calls in the state of Florida. Now the state of Florida, as you might guess, has probably a disproportionate number of calls to 9-1-1. There’s quite a few folks there who all used to live in Manhattan, who are somewhat elderly now and tend to need 9-1-1 more than most. Anyway, we were looking at the percentage of such calls that could not go through. A rather concerning statistic, needless to say. We were trying to make the system work better. Everything was fine. We put the data on the P chart, of course, and there was one problem: the control limits seemed to be rather narrow. That’s a matter of the perspective of the graph, I guess, but the data were all over the place; every point was out of control. That didn’t make a lot of sense. How can everything be out of control? We realized rather quickly that the denominators in this case were in the tens or hundreds of thousands. Naturally, P times one minus P over N, for an N that big, is going to be microscopic, so the Sigma times three added to and subtracted from the center line is going to be like plus or minus nothing. The data couldn’t possibly fit inside that. Well, I was still kind of new to control charts at that point.
We looked it up in the Western Electric, or AT&T, handbook, and it said, well, this is nothing, we’ve seen this before; just put the data in an individuals chart. The XmR chart, or I-MR chart if you’re a Minitab user. We did, and everything worked out fine, except then we realized that the sample sizes month by month by month were very different. In summer months, it’s real hot in Florida. There were way more calls to 9-1-1 than during the winter months, and in a P chart we’ve gotten used to seeing the control limits, in such a case, wiggle. With large subgroup sizes you expect the limits to be narrow. With small subgroup sizes you expect them to be wide. You can’t do that in an individuals chart; it is what it is. So I puzzled over that, and that was what led to initially worrying about this. Shortly thereafter I happened to be sitting in Knoxville, Tennessee, in a conference room with Don Wheeler, taking his… I forget, I’ve taken so many courses under him I forget which one it was, but he was talking about analysis of variance. At the moment of my great awakening he was reiterating the bit about between-group variation and within-group variation and how you’ve got to look at the two. That’s when it hit me. What we were dealing with in that 9-1-1 problem was the difference between within-group or within-subgroup variation, which is what the P chart measures, versus between… I’ve always preferred the word among for more than two. Among-group or among-subgroup variation. See, things in the real world don’t tend to have a constant P bar. Proportion errors, proportion of anything. Some days it rains and some days it doesn’t, and the existence of rain affects what you’re measuring; in the case of the telephone business, especially outside plant operations, it certainly does. Weather is a big factor in where things break down. So the very assumption behind using a P chart is that there is in fact a constant, unchanging attribute, a capital P bar, up there in the sky.
We’re going to try to measure it with our sample P bar, and we’re going to try to put an interval around that, called a control interval. We violated the first assumption. We know it doesn’t make sense to have a constant P bar. There’s got to be some way to reach in there and pull out the between-subgroup variation, but not to the point of using the total variation, the mistake most engineers make when they first learn control charts. They want to use the root mean square Sigma. We know from our study of quality that Shewhart taught us that short-run variation is the key.
Anyway, here’s how it started. Let’s get into the presentation itself. This particular presentation was originally written for the ASQ’s third annual Six Sigma Forum Roundtable in New Orleans. Fortunately that was prior to Katrina, so there was still a place to meet. I gave a version of it again a little later in Atlanta.
P charts and U charts, that’s the inversion. The P chart measures percent errors; the U chart measures the number of defects. There can be more than one defect per unit, so a U value can be greater than one; try to remember that. But there are times when we see too many false alarms. We’re going to talk during this presentation about when this happens, why it happens, what the traditional remedies have been, and perhaps a better way.
If you take what I was saying while waving my arms a moment ago and put it on paper, this is what I was saying. This is the formula, the binomial assumption’s formula, for the standard deviation of proportions. It’s the average proportion times one minus the average proportion, divided by n, and if there’s a purist in the crowd, I know it’s really n minus one, but for numbers this size it doesn’t matter. Anyway, take the square root of that. Plus or minus three of those is where we’re told to put the control limits in a P chart. If P bar really is constant over time, there’s not a thing wrong with this.
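As a minimal sketch of the formula just described (not taken from the slides), the classic P chart limits can be computed as below. The function name `p_chart_limits` and the counts are hypothetical, chosen only to show how the limits tighten as the subgroup size grows.

```python
import math

def p_chart_limits(defects, sizes):
    """Return (p_bar, [(lcl, ucl), ...]) for a classic P chart."""
    p_bar = sum(defects) / sum(sizes)  # pooled average proportion
    limits = []
    for n in sizes:
        # Binomial standard deviation of a proportion for this subgroup size
        sigma = math.sqrt(p_bar * (1 - p_bar) / n)
        limits.append((p_bar - 3 * sigma, p_bar + 3 * sigma))
    return p_bar, limits

# Hypothetical counts (not the webinar data): same proportion, very different n
defects = [50, 5000]
sizes = [1000, 100000]
p_bar, limits = p_chart_limits(defects, sizes)
widths = [ucl - lcl for lcl, ucl in limits]
print(p_bar, widths)
```

With a subgroup one hundred times larger, the control interval is ten times narrower, which is why subgroups in the tens or hundreds of thousands leave almost no room for the data.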
But this is what it can sometimes look like. I think these are the actual data I’m going to be using in an exercise a little later. But this happens a lot. I’ve seen it happen again and again in the telephone company, and I’ve had a lot of colleagues in the healthcare industry report the same problem, where their sample sizes are numbers of patients and the thing they’re measuring is something that goes wrong for a patient. They were not treated properly, they received the wrong dose, or they acquired a hospital-borne infection or something horrible like that. This is a picture of what has become known as overdispersion. The data look like they just couldn’t possibly have come from the underlying binomial model, and that’s because they don’t.
The binomial assumption of a fixed, constant parameter about which our samples vary is invalid. The parameter itself, the target, is moving. There’s common cause variation here in the parameter itself that can’t be explained by that P times 1 minus P over N. As I say, the Western Electric, or more properly titled AT&T, handbook gave us a solution for that.
Let’s just forget that the data are P values and just treat each one as a single observation: at this time I saw this number. Apply this formula to them. Take the ranges of successive pairs, X sub i and X sub i minus 1. Take those ranges, in absolute value, average them, and divide by the scale factor of 1.128. That’s 3 divided by 2.66, I think. Anyway, that gives us a very commonly used type of control chart called individuals, or XmR, or X and MR, or I-MR; it goes by a lot of names.
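The moving-range calculation described here can be sketched as follows. This is an illustrative sketch only; the function name `individuals_limits` and the four proportions are hypothetical.

```python
def individuals_limits(values):
    """Return (center, lcl, ucl) for an individuals (XmR) chart."""
    # Absolute ranges of successive pairs: |x_i - x_(i-1)|
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    mr_bar = sum(moving_ranges) / len(moving_ranges)
    sigma = mr_bar / 1.128  # scale factor for moving ranges of size two
    center = sum(values) / len(values)
    return center, center - 3 * sigma, center + 3 * sigma

# Hypothetical proportions, treated as plain individual observations
proportions = [0.10, 0.14, 0.08, 0.12]
center, lcl, ucl = individuals_limits(proportions)
print(center, lcl, ucl)
```

Because 3 / 1.128 is about 2.66, the same limits are often written as the center line plus or minus 2.66 times the average moving range. Note that one pair of limits applies to every point, whatever its subgroup size.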
The data we just saw, when subjected to the individuals chart, look like this. Now this looks like a reasonable control chart. One point here looks like it might be getting a little bit squirrely, but all in all there doesn’t seem to be any cause for alarm. I don’t see any shifts, I don’t see any trends. But it’s a totally different picture than what we saw before. But again, rather than stop there, which is what people literally did from the 1920s until the 1990s, I’m worried about this wiggly control limits thing. What if, just what if, this fourth point was measured for a very large subgroup? A large subgroup would tend to have narrower control limits than most. That point might then be seen to be out of control. So there’s a reason why you come to want wiggly control limits: they better represent the contribution of each point to the overall problem.
So, as I say, while thinking about this I was listening to Wheeler drone on and on, and what he was saying started to get to me. All this… I’m sorry, I should have shown this. That point four was in fact a larger subgroup, and so we need to take that into account.
Well, one of the things in the toolkit of control charts is something rarely used called the Z chart. It’s actually presented in many books as a way of eliminating the wiggly control limits, if you just don’t want to see those. Just do a simple Z transformation. Put everything in the Z plane, which is centered and scaled. The minus P bar puts the data centered at zero, and scaling with the standard deviation gives you something that in a binomial data set would have a standard deviation of one. So if the P variable has a mean of this, P bar, and a standard deviation of Sigma, then the Z variable will have a mean of 0 and a standard deviation of 1, by definition. Well, that being the case, if you did this transformation to your data and then want to chart it, well, what’s 3 times 1? It’s just 3. So by convention everybody just draws the control limits at plus 3 and minus 3; the center line, of course, is 0. Here’s the Z chart on the data we have here. Whoops, it didn’t help. Of course it didn’t help, because we’re still making that stupid assumption that there’s no variation here other than the binomial, which is not true.
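A short sketch of the Z transformation described above, using hypothetical counts (not the webinar data) whose proportions swing more than the binomial model allows. The function name `z_scores` is an assumption made for illustration.

```python
import math

def z_scores(defects, sizes):
    """Center each proportion at p_bar and scale by its binomial sigma."""
    p_bar = sum(defects) / sum(sizes)
    scores = []
    for d, n in zip(defects, sizes):
        sigma = math.sqrt(p_bar * (1 - p_bar) / n)  # assumed binomial sigma
        scores.append((d / n - p_bar) / sigma)
    return scores

# Hypothetical overdispersed counts with large, equal subgroups
defects = [400, 600, 450, 550]
sizes = [10000, 10000, 10000, 10000]
z = z_scores(defects, sizes)
outside = [value for value in z if abs(value) > 3]  # conventional +/-3 limits
print(z, outside)
```

The transformation flattens the limits to plus and minus 3, but it still leans on the binomial sigma, so overdispersed data land outside the limits just as they did on the P chart.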
So here’s the phrase I wanted to… a little nod to Don here, plagiarizing his book cover. He said, and will say to you if you ever ask, “Why assume the variation when you can measure it?” He says that a lot. He’s a big proponent of the use of individuals charts in many, many cases.
What that means is taking this formula again for the Z chart. Instead of just blindly assuming that Sigma is one, why don’t we measure it and see what it actually is? When we do that we get this on our data set, and now it looks normal again. Remember that each of these points now has been corrected for its own standard deviation based on its own sample size. So when we do that, we see that our friend point 4 here is in fact still inside the control limits. Now, I’m not one to quibble about how close something is to a limit like that, but just for illustration. Well, we’ve arrived at something that was really a breakthrough. So much so that when I was in another class somewhat after that in Austin, Texas, with Tom’s favorite competitor, Forrest Breyfogle, I showed it to him. He liked it so much he put it in his book Implementing Six Sigma. He put this version of the solution in his book; he never carried it any further than this. He said you ought to do this when you see the problem of overdispersion. This takes care of it, and in fact it does. But I was not satisfied to stay here, because in the telephone company, like so many others, the people you work for are not engineers. Sound familiar? Many of them actually smile when they say, I was never very good at math. Well, they didn’t know how to interpret this. What’s a Z? How many Z’s does it take to get a car started or change a light bulb? They didn’t understand the basic interpretation of what these points are.
So by means of a little bit of algebra, in the usual case we have this for a P chart. Actually, this is turning the Z transformation formula around. Remember, Z was P sub i minus P bar, over Sigma. I’ve just solved that equation for P sub i here. If you take the standard deviation of this line, this is a constant so it doesn’t enter into it, and this is a constant so it just flows right through, and so you’re left with the standard deviation of the Z scores. Now that’s the thing that the Western Electric handbook told us was by definition equal to one, but which we now have subjected to the Wheeler test, which says: don’t assume the variation, measure it. So you have your data converted to Z scores and you measure the variation by means of the Shewhart approach of short-term average moving ranges of size two. In the case we’re looking at, I think this number was five. The data we were looking at, I’m pretty sure. So putting that in place in the new formula, I called this a P prime chart… I didn’t know what else to call it. I wanted it to show the DNA of the P chart, with the prime showing the mutation. So the new upper control limit will now be the previous components, P bar plus 3 Sigma sub P sub i, times this Sigma sub Z. So you see, you can finally come up with an interpretation of what Sigma Z actually is. It is the relative amount of variation not accounted for by the binomial. So if that number is 5, that means there’s 5 times more variation in your data than the binomial assumption can find and explain.
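Putting the pieces of this derivation together, a minimal sketch of the P prime limits might look like the following. The function name `laney_p_prime` and the counts are hypothetical, and the limits are left unclipped at 0 and 1 for simplicity; this is an illustration of the steps as described in the talk, not a reproduction of any particular software's implementation.

```python
import math

def laney_p_prime(defects, sizes):
    """Return (p_bar, sigma_z, [(lcl, ucl), ...]) for a P prime chart."""
    p_bar = sum(defects) / sum(sizes)
    # Within-subgroup (binomial) sigma for each subgroup size
    sigma_p = [math.sqrt(p_bar * (1 - p_bar) / n) for n in sizes]
    # Z scores: each proportion centered at p_bar and scaled by its own sigma
    z = [(d / n - p_bar) / s for d, n, s in zip(defects, sizes, sigma_p)]
    # Measure sigma of the z scores from average moving ranges of size two
    moving_ranges = [abs(b - a) for a, b in zip(z, z[1:])]
    sigma_z = (sum(moving_ranges) / len(moving_ranges)) / 1.128
    # Limits: p_bar +/- 3 * sigma_p_i * sigma_z
    limits = [(p_bar - 3 * s * sigma_z, p_bar + 3 * s * sigma_z) for s in sigma_p]
    return p_bar, sigma_z, limits

# Hypothetical overdispersed counts with unequal subgroup sizes
defects = [400, 1200, 450, 1100]
sizes = [10000, 20000, 10000, 20000]
p_bar, sigma_z, limits = laney_p_prime(defects, sizes)
widths = [ucl - lcl for lcl, ucl in limits]
print(p_bar, sigma_z, widths)
```

A `sigma_z` well above 1 says the data vary more than the binomial model can explain, and the limits still wiggle with subgroup size: here the subgroups of 20,000 get limits narrower by a factor of the square root of 2 than the subgroups of 10,000.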
Putting it in a chart, there we go, final product; there’s our data. Sure enough, you notice that at our favorite point, point four, the control limits are fairly narrow right there. So it was a good thing that we added the wiggly control limits; it gives more power to the test.
Now it turns out, if you go back to this right here, that bottom formula: suppose you have a data set, real or contrived, that really is binomially distributed… say you went to Minitab and you said, give me a binomially distributed data set with this parameter, and you do that with several different sample sizes and whatever. If it really is binomially distributed, the chart becomes identical to the P chart. Sigma Z really is empirically very close to one; it just differs by a sampling error.
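This claim can be checked informally with a small simulation. The sketch below is an assumption-laden illustration: the true proportion, subgroup sizes, seed, and the helper name `sigma_z_of` are all hypothetical, and the binomial counts are drawn with Python's standard `random` module rather than Minitab.

```python
import math
import random

def sigma_z_of(defects, sizes):
    """Sigma of the z scores, measured from moving ranges of size two."""
    p_bar = sum(defects) / sum(sizes)
    z = [(d / n - p_bar) / math.sqrt(p_bar * (1 - p_bar) / n)
         for d, n in zip(defects, sizes)]
    moving_ranges = [abs(b - a) for a, b in zip(z, z[1:])]
    return (sum(moving_ranges) / len(moving_ranges)) / 1.128

rng = random.Random(0)          # fixed seed so the run is repeatable
true_p = 0.05                   # constant parameter: the binomial assumption holds
sizes = [400 if i % 2 == 0 else 600 for i in range(200)]
# Each count is a sum of n independent Bernoulli(true_p) trials
defects = [sum(1 for _ in range(n) if rng.random() < true_p) for n in sizes]
estimate = sigma_z_of(defects, sizes)
print(round(estimate, 2))
```

When the parameter really is constant, the estimate should come out near 1 apart from sampling error, so multiplying by it leaves the ordinary P chart limits essentially unchanged.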
Or say you have a data set of any distribution, it doesn’t matter. If in your data set the n’s, the sample sizes, are all the same, then what you do here will be algebraically identical to the individuals chart; they’re one and the same. So it works at the fringes. It collapses to the P chart if you don’t really need it. It collapses to the individuals chart if you didn’t really need to worry about the wiggly limits.
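The equal-sample-size case can be verified numerically. In this sketch the helper names and the counts are hypothetical; it simply computes both sets of limits on the same data and compares them.

```python
import math

def p_prime_limits(defects, sizes):
    """P prime limits: p_bar +/- 3 * sigma_p_i * sigma_z for each subgroup."""
    p_bar = sum(defects) / sum(sizes)
    sigma_p = [math.sqrt(p_bar * (1 - p_bar) / n) for n in sizes]
    z = [(d / n - p_bar) / s for d, n, s in zip(defects, sizes, sigma_p)]
    mr = [abs(b - a) for a, b in zip(z, z[1:])]
    sigma_z = (sum(mr) / len(mr)) / 1.128
    return [(p_bar - 3 * s * sigma_z, p_bar + 3 * s * sigma_z) for s in sigma_p]

def individuals_limits(values):
    """Individuals chart limits: mean +/- 3 * (average moving range / 1.128)."""
    mr = [abs(b - a) for a, b in zip(values, values[1:])]
    sigma = (sum(mr) / len(mr)) / 1.128
    center = sum(values) / len(values)
    return center - 3 * sigma, center + 3 * sigma

# Hypothetical counts with every subgroup the same size
sizes = [5000] * 6
defects = [200, 310, 240, 290, 220, 300]
proportions = [d / n for d, n in zip(defects, sizes)]

lcl_x, ucl_x = individuals_limits(proportions)
max_gap = max(max(abs(lcl - lcl_x), abs(ucl - ucl_x))
              for lcl, ucl in p_prime_limits(defects, sizes))
print(max_gap)
```

With constant n the binomial sigma is the same for every point, so it cancels between the Z scores and the limit formula, and the gap between the two charts is only floating-point noise.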
Now, there’s more on this subject you can find out there; there is stuff that preceded my work. In this particular book of Wheeler’s he introduced something called chunky ratios, which he’s pretty well still satisfied with, and with good reason he should be.
What he does: he takes the standard individuals chart formula, just like before, except he adds on this last term. For no apparent reason he just throws it in there. What this is, is the average sample size over the particular sample size for this particular point, in a ratio and with a square root, because we know the variation is inversely proportional to the square root of the sample size. So look at what this means. If you’re talking about a point at which n is larger than average, this term is going to be smaller than one. If it’s a subgroup that is larger than average… no, what did I say? Smaller; if it’s smaller than average, this term is going to be greater than one. It’s going to make the limits… I’ll start over. I’m getting myself confused. If n is relatively large, this term will diminish the control interval; if n is relatively small, it will expand it.
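As a hedged sketch of the adjustment as it is described in this talk (not taken from Wheeler's book), the individuals chart half-width is multiplied by the square root of the average sample size over each point's own sample size. The function name `chunky_ratio_limits`, the use of the plain mean of the proportions as the center line, and the data are assumptions made for illustration.

```python
import math

def chunky_ratio_limits(proportions, sizes):
    """Individuals chart limits scaled by sqrt(n_bar / n_i) for each point."""
    mr = [abs(b - a) for a, b in zip(proportions, proportions[1:])]
    mr_bar = sum(mr) / len(mr)
    center = sum(proportions) / len(proportions)  # assumed center line
    n_bar = sum(sizes) / len(sizes)
    limits = []
    for n in sizes:
        # 2.66 * mr_bar is the usual individuals half-width (3 / 1.128 ~ 2.66)
        half_width = 2.66 * mr_bar * math.sqrt(n_bar / n)
        limits.append((center - half_width, center + half_width))
    return center, limits

# Hypothetical proportions and unequal subgroup sizes
proportions = [0.04, 0.06, 0.045, 0.055]
sizes = [10000, 20000, 10000, 20000]
center, limits = chunky_ratio_limits(proportions, sizes)
widths = [ucl - lcl for lcl, ucl in limits]
print(center, widths)
```

Subgroups larger than average get a factor below one and narrower limits, and subgroups smaller than average get wider ones, which matches the corrected statement at the end of the paragraph above.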
Doing that to this data, using Wheeler’s method, gives us this. I can click back like that. It just so happens that this one point does slightly go out of control using his method. It’s not often that happens. I’ve done this many, many times with lots of data sets, and more often than not these are very close. The reason they’re not the same is because he’s giving the same weight to every single subgroup regardless of size, so there’s just unquestionably some bias in there. But in the old days of doing everything by hand or in Excel or whatever, it certainly was easier computationally to just do this than to go through all that other stuff we had to go through.
Other research has been done. One student I met at the University of Alabama some time ago, I gave her class my presentation on this method, and she liked it so much she plowed into it for her doctoral dissertation. Naturally, being a PhD, she went to a Bayesian approach. I’m not a Bayesian, so we’re not on speaking terms, but I still bring it out every now and then and try to understand it, but just have a good laugh and put it back on the shelf.
To summarize, we know from experience that P charts and U charts can sound too many false alarms. We know why it happens: because the binomial distribution, or the Poisson, which is the underlying assumption for the U chart, is not right. The parameter’s just not constant. By the way, there is such a thing as underdispersion. There is such a thing as having a Sigma sub Z that’s less than 1. What that means is that you have positive serial correlation, autocorrelation, in the data. Taking my case of outside measurements, if you’ve ever noticed weather patterns, rain, for those of us not in Arizona, those of us who know what rain is. If you look at a history over the last several weeks, months, years, you’ll see that rain patterns occur in clumps. You’ll have three days of rain, then four days of sun, and then two days of… so each subgroup somewhat has a memory of the previous subgroup. The old saying in the weather business, I understand, is: the best prediction ever made is that tomorrow’s weather will be like today’s. So there is such a thing, just so you’ll know. P prime charts, chunky ratios, maybe this Bayesian approach, are obviously better ways. I threw this in. You do learn when you work with PhDs that you always have to throw this in: more research is needed. That gets you more funding later on, but it also covers you if somebody later comes up with something better than yours. Well, I always said more research was needed, right? Okay, so let’s see… moving on.
I wanted to show you this in action. Here’s the data set that I used in an article in Quality Digest magazine when it was… I honestly don’t remember when it was. Some years ago. We have fairly large subgroups. This phenomenon of overdispersion can exist in any data set, but it’s more clearly visible when there are large sample sizes, because that’s the situation under which the within-subgroup variation just goes away and leaves exposed the between-subgroup variation, and that’s when you see it. So these are pretty big; you know, maybe not 9-1-1 calls in Florida, but they’re a good size. Here are the numbers of defects. Now, doing that in a classical P chart looks like this; see the problem. Doing it in a classical X chart we get that; better. But then when you do it in a full-blown P prime chart it looks like this. Here’s my email address there, David Delaney at yahoo.com; send me an email and I’ll send you this file if you want to have it. It’ll show you certainly better than I can now; by just clicking on something and reading the formulas you can see what I did. You can use this program, with whatever modifications you wish, on your own problems, and there are some definitions here that might help you.
I believe in this case… yeah, this is the case where Sigma Z is equal to five. So these limits are five times farther away from the center line than they were over here in the P chart. Until recently this is where I would have had to stop, but now there’s more, effective next month. That’s the Minitab… where is that? Why am I not seeing this? Tom, am I still in presenting mode? Uh, yeah, I still see you. Okay. Oh, you covered that up, that’s what happened. The little software covered up something I already had there. The software has a will of its own, as you know. I can see that. Okay, well, we’ll go after it this way. This is a back door the Minitab folks gave me to get into release 16.2. It’s coming out next month sometime. And I do have some questions pending, David, but I’ll wait till you’re done. Okay, thanks. Hopefully this is masked. Yeah, yeah, it’s okay. This is carrying me to a web version of Minitab 16.2.
There it is. There’s that same data, hopefully where you can see it. I know different people have different screen resolutions and things, so I can’t predict what you’re seeing, but there are 20 data points. I’ll speak for the rest of the people: I can see it just fine. I have a zoom window of a hundred percent and it’s showing your screen perfectly. Good, thank you. Well, here’s the same data that we were looking at in the Excel case. If you’re a Minitab user you’ll find this procedure fairly familiar. Stat, Control Charts, Attributes Charts, P chart, or now… this is going to look a little different than what you’re used to. But we would normally pick a P chart, and I’ve already told it that column two is the variable and column one is the subgroup size. So there’s the P chart, in Minitab’s way of doing it. You can see that there’s a problem here. It’s flagging more points than it’s not flagging. I think the only test I have turned on here is the point outside the control limits. Tom taught me to not overanalyze these things. So for many years I’ve adopted the habit of only really looking at three tests. What are there, eight different Western Electric tests or something? Yeah, they keep adding more. If you test it enough it’s going to fail. Exactly right. Tom is the one who put me onto that: if you did all eight Western Electric tests you would have a false alarm rate that’s just totally unacceptable. Now, he can tell you what that is, but it’s huge. It’s like 20%. Anyway, I only use the point outside the control limits; eight points in a row on the same side of the mean, not nine as in Minitab; and, I think, six points in a row forming a trend. Anyway, this is not acceptable. This just can’t happen.
So go back to 1956, or whenever it was, and take the advice of Western Electric. Say Control Charts, Individuals, and by the way, here’s another little hint: I never do this one, I do this Individuals one, and just show the X chart. Dr. Wheeler says the only legitimate reason for looking at the range chart is to prove that you did it right, that you really did use moving ranges of size two in order to get the correct short-run Sigma. The actual information or knowledge imparted by the range chart is exactly the same as that shown by the individuals chart. You really don’t need it, so I don’t clutter things up. Here the variable is P. P itself. It’s just a number observed at a time. There it is; that looks good. Unless of course you’re dealing with one of those math haters that can’t figure out what a Z is. So now, drumroll… we can go to Control Charts, Attributes, Laney P prime; column two is the variable, column one is the sample size, and there it is. Let’s get these two up side by side. Okay. You may not be able… I don’t know how wide a screen you’re using. I’ll try to scoot them close together. Yes, I can see both charts. But there it is, the old and the new.
I’ll close by leaving it on that screen, which is what I’ve been doing since I retired from BellSouth. There you go. How did somebody with your genes get to be so cute, David? It comes from his mother’s side. Uh, I see. She is from New Zealand, and those two gene pools haven’t touched each other in 500 years. There you go. Yeah, well, he’s a cutie, that’s for sure. So I am going to take control back, and thank you very much, David. I’ve got, as I say, a couple of questions, and if anyone else would like to ask, now is a good time. I’m going to do something that’s extremely risky, and that is I’m going to show my webcam. So best wishes, everyone. So you can take a look at me in my office, and I’m learning how to use chroma key, so the next time I might be, you know, on an alien planet. But this is the real world right now.
So David, one of the questions that I have is from Matt. Matt asks, “So what does Wheeler think of your work?” He approves of it. He says it’s right, but in typical Wheeler fashion he says, what the heck, chunky ratios works. Certainly, you know, one of the reasons for using the Wheeler approach, being that it’s simpler computationally, is going to go away when you get this into Minitab, for everyone who has Minitab. You know, I don’t think Don even uses Minitab. I could be wrong. But yeah, he didn’t at the time I was studying under him, because it was a long time ago, and I hate to put words in his mouth, but I would be willing to bet that Don likes to have more hands-on control of what he’s doing. Yeah, he probably… I would think he uses Excel. I think he uses a slide rule and pencil and paper. I got two degrees at Georgia Tech without a computer, okay. Did he come up with chunky ratios after the P prime chart? No. No, before. Okay, so I thought perhaps you inspired him to do it. It’s hard to drive Don off of the process behavior chart, which is what he calls the P chart. I’m really not sure, now that you mention it; I had not heard of the chunky ratios approach. He answered me as I was asking him about mine. He sent me an answer that referred me to the chunky ratios. You know, it’s all blurred together in there in the late 90s, when all this was happening. Sure. Definitely before. There are other software outfits. SigmaXL has had this chart for quite a while. Mm-hmm. Word gets out and people have correspondence with me saying… like the SigmaXL people, who are real big on this method. Yeah, SigmaXL has a nice package; some of my students have that package. Starting this year we’ve gone exclusively to Minitab in terms of what we provide for the students, but in past years we provided both SigmaXL and Minitab. In my extracurricular teaching we would always have to do it both ways.
Even when Minitab was very readily available, in classes like at Southern Tech, where we were taking people literally off the street, they were not our corporate people. You know, they were just students, so you always had to show how to do it in Excel, because not everybody could afford Minitab. Yeah, that’s delicate; it’s a pricey package. Yeah. So another question… what you have to do is, you have to design a new tool and then they’ll give you a copy for free. That’s true, that’s true, or make it a requirement for your students; that gets you a freebie as well.
So another question, this one’s a little more technical, from Mark. Mark says, “Following up on the idea of measuring rather than assuming, isn’t your approach simply reverse engineering the Cp/Cpk of each data set and plotting it?” It’s a very involved question. Can you parse that, or should you answer Mark? Sounds about right. But there again, in doing Cp and Cpk you still have, I would think, the dilemma of by what method do you arrive at a Sigma. What is Sigma? Right, and Cp and Cpk are capability ratios, you know, which are a comparison to the requirement, which is not part of your method at all. No, no, no; the requirement, that’s the customer’s responsibility. But even in coming up with the curve that you use to say what Cp or Cpk is, you’ve still got to put in a Sigma. Which Sigma? Sigma sub P? No, I would use Sigma sub P times Sigma sub Z, because that’s the total package. That’s the within variance and the between. I forgot to say this earlier: what my method does is, it actually takes a book by Fisher and a book by Shewhart and slams them together. It creates an offspring. Fisher being keen on the analysis of variance and F tests and all the between variation. It’s been ignored totally in control charts. It’s true. Yeah, and yeah, that’s a good point.
I pasted your email address into the chat window, so if anyone would like to correspond with you by email, you know, it’s there. You flashed it on the screen, so I felt you were giving me the means to do it. I have time between grandchildren. There you go, there you go. Are there any other questions? So as they seem done, I’m going to conclude this webinar. I want to thank David for a great presentation on an extremely interesting topic, and we look forward to seeing that in the next version of Minitab. I want to thank everyone for attending the webinar, and I look forward to seeing you at future webinars. This is being recorded. I’ll post the recording link on my blog, as well as on the student forum in my training class, so you’ll be able to come back and look at this presentation in the future. You should all have a link to David’s slides, so you’ll be able to look at those in the future as well. Again, thanks everyone for attending, and we’ll see you at the next webinar. Thanks, Tom.”