Statistically analyzing COVID-19 vaccine phase III results

Relative risk reduction and absolute risk reduction measures in the evaluation of clinical trial data are poorly understood by health professionals and the public. The absence of reported absolute risk reduction in COVID-19 vaccine clinical trials can lead to outcome reporting bias that affects the interpretation of vaccine efficacy. When I do Lean Six Sigma training I go to great lengths to explain how to get valid results and how to interpret them. It’s not easy. It takes several lessons that include slides, videos, assignments, discussion and so on. Sadly, reporting on something as important as vaccines rarely gets this level of attention to the proper interpretation of results.

The big bugaboo is the difference between relative risk and absolute risk. The concept is simple enough: relative risk (RR) is the ratio of two risks, while absolute risk reduction (ARR) is the difference between them. To decide what to do, a person (individual, physician, public health official, politician) needs to know both. The figure shows why merely reporting RR is not sufficient: here a 50% relative risk reduction equates to only a 1 percentage point reduction in absolute risk (for example, when risk falls from 2% to 1%). But even this does not tell the whole story. Consider the 2 × 2 contingency table below for SARS-CoV-2 infection in vaccine clinical trials. The trial tables that follow it contain actual data.

            Infection   No infection   Sum
Vaccine     a           b              a+b
Placebo     c           d              c+d

Here are the contingency tables for the Pfizer/BioNTech and Moderna phase III trials. First, Pfizer/BioNTech:


            Infection   No infection   Sum
BNT162b2    8           21,712         21,720
Placebo     162         21,564         21,726

And here is the table for the Moderna phase III trial:

            Infection   No infection   Sum
mRNA-1273   11          15,199         15,210
Placebo     185         15,025         15,210

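With these data the following questions can be answered:

  • What was the efficacy of the Pfizer/BioNTech vaccine? Answer: efficacy is the relative risk reduction, 1 - RR = 1 - [a/(a+b)]/[c/(c+d)]. Using the Pfizer/BioNTech table, 1 - (8/21,720)/(162/21,726) = 95%.
  • What was the efficacy of the Moderna vaccine? Answer: using the Moderna table, 1 - (11/15,210)/(185/15,210) = 94%.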
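
The efficacy figures can be checked with a few lines of Python. This is a minimal sketch, assuming efficacy is defined as one minus the relative risk; the function name `vaccine_efficacy` and the tuple layout `(a, b, c, d)` are illustrative choices, not something taken from the trial reports.

```python
def vaccine_efficacy(a, b, c, d):
    """Return efficacy = 1 - RR for a 2 x 2 table.

    a, b = infected / not infected in the vaccine arm
    c, d = infected / not infected in the placebo arm
    """
    risk_vaccine = a / (a + b)   # infection risk with vaccine
    risk_placebo = c / (c + d)   # infection risk with placebo
    relative_risk = risk_vaccine / risk_placebo
    return 1 - relative_risk


# Counts copied from the two contingency tables above.
pfizer = (8, 21_712, 162, 21_564)
moderna = (11, 15_199, 185, 15_025)

for name, table in [("Pfizer/BioNTech", pfizer), ("Moderna", moderna)]:
    print(f"{name}: efficacy = {vaccine_efficacy(*table):.0%}")
```

Because the two arms in each trial are nearly the same size, 1 - a/c gives almost the same answer, which is a handy shortcut when checking a news report.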

These are the only numbers reported by the vast majority of news outlets. But a great many more questions could be answered using only these phase III trial data.

  • What was the infection rate at the time these trials were being conducted? Answer: combining both tables of data, 366/73,866 = 0.50%, or 5 infections per 1,000 people.
  • What is the absolute risk reduction (ARR)? Answer: ARR = c/(c+d) - a/(a+b) = 0.71% (Pfizer/BioNTech) and 1.14% (Moderna).
  • What is the number needed to vaccinate (NNV) to prevent one infection? Answer: NNV = 1/ARR = 141 (Pfizer/BioNTech) and 87 (Moderna).
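
The three answers above can be reproduced directly from the tables. The sketch below is one way to do it; the helper names (`absolute_risk_reduction`, `number_needed_to_vaccinate`) and the `trials` dictionary are hypothetical names chosen for illustration, and the pooled infection rate simply adds all four trial arms together, as in the first bullet.

```python
def absolute_risk_reduction(a, b, c, d):
    """ARR = placebo risk minus vaccine risk: c/(c+d) - a/(a+b)."""
    return c / (c + d) - a / (a + b)


def number_needed_to_vaccinate(a, b, c, d):
    """NNV = 1 / ARR, the number vaccinated to prevent one infection."""
    return 1 / absolute_risk_reduction(a, b, c, d)


# (a, b, c, d) counts copied from the contingency tables above.
trials = {
    "Pfizer/BioNTech": (8, 21_712, 162, 21_564),
    "Moderna": (11, 15_199, 185, 15_025),
}

# Pooled infection rate across every arm of both trials.
total_infections = sum(a + c for a, b, c, d in trials.values())
total_participants = sum(a + b + c + d for a, b, c, d in trials.values())
pooled_rate = total_infections / total_participants
print(f"Pooled infection rate: {pooled_rate:.2%}")

for name, table in trials.items():
    arr = absolute_risk_reduction(*table)
    nnv = number_needed_to_vaccinate(*table)
    print(f"{name}: ARR = {arr:.2%}, NNV = {nnv:.0f}")
```

Note that NNV depends on the infection rate during the trial period: the same vaccine tested when infections are more common would show a larger ARR and a smaller NNV.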

Perhaps we should examine these data graphically.
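
One simple way to picture these data is a plain-text bar chart of infections per 1,000 participants in each trial arm. This is only a sketch: the `chart_lines` function, the arm labels and the `scale` factor (characters per infection per 1,000) are assumptions chosen for illustration.

```python
# (label, infections, participants) for each arm, from the tables above.
arms = [
    ("BNT162b2", 8, 21_720),
    ("Pfizer placebo", 162, 21_726),
    ("mRNA-1273", 11, 15_210),
    ("Moderna placebo", 185, 15_210),
]


def chart_lines(arms, scale=4):
    """Build one text bar per arm showing infections per 1,000 people."""
    lines = []
    for label, infected, total in arms:
        per_1000 = 1000 * infected / total
        bar = "#" * round(per_1000 * scale)  # bar length grows with the rate
        lines.append(f"{label:<16} {per_1000:5.2f} {bar}")
    return lines


for line in chart_lines(arms):
    print(line)
```

Plotting the rates side by side shows both effects at once: the vaccine bars are a small fraction of the placebo bars (relative risk), yet even the placebo bars represent roughly 7 to 12 infections per 1,000 people (absolute risk).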



Do you find any of these things to be interesting? Would any of them help you with decision making or in understanding the current situation? If we also had data on adverse events from the phase III trials we could calculate number needed to harm, at least from the phase III trial data, which is very short-term. We also need more data on mortality to calculate the reduction in deaths from the vaccines. It is known that mortality varies widely among different age groups and racial or ethnic groups, so the conclusions would need to be broken out accordingly.

In short, we can use statistics and graphics to help us in a variety of ways. If you can’t find this in a media report you can do your own research and “roll your own” results.


  • Alan Madison, December 12, 2021 at 3:05 pm

    Tom… in classic fashion, you have illustrated how using statistics gives us a simple and very important picture of risk /risk mitigation related to early vaccination outcomes.

    Thank you.

    Sadly, if this statistical view were widely understood and applied in public health policy we could have reset many of the economic and social effects of misguided and unnecessary recommendations. I also believe (and have consistently stated) from the very beginning of the identification of CoVid-19 that a close study of early mortality/hospitalization outcomes stratified by co-morbid risk factors would have helped to define a balanced and targeted set of public health policy recommendations.

    There are so many examples of how statistics applied objectively and comprehensively would have appropriately informed public health policy recommendations. The tragedy is that moving forward an objective and comprehensive review of outcomes will likely continue to be contrived and predetermined as a mechanism to inform public health policies. Rather, public health policy must be balanced with an objective understanding of health, social and economic/financial outcomes to ensure the overall health and wellness of people across the world.

    I believe that when we look back at health organization inputs across the world coupled with U.S. federal and state public health policy reactions to CoVid-19 it will illustrate one of the most catastrophic examples of mis-using data and power in world history.


    • Thomas Pyzdek, December 13, 2021 at 7:49 am

      Alan, sadly I agree with you 100%, especially with this statement: “I believe that when we look back at health organization inputs across the world coupled with U.S. federal and state public health policy reactions to CoVid-19 it will illustrate one of the most catastrophic examples of mis-using data and power in world history.” I fear that we have yet to determine what the deleterious effects of our response will be. I’m very concerned about what my grandchildren will live through.
