Equivalence testing in demonstrating substantial equivalence for new tobacco products
Under Section 905(j) of the Tobacco Control Act, manufacturers must submit reports to the FDA demonstrating that a new tobacco product is substantially equivalent (SE) to a “predicate product.” Demonstrating equivalence with traditional statistical significance testing is challenging, however, for two reasons. First, such tests are designed to detect differences between two samples (i.e., to reject the null hypothesis that the samples are the same), and it is inappropriate to conclude equivalence merely because the null hypothesis cannot be rejected. Second, with very large samples, small differences can reach statistical significance without being meaningful. There may therefore be value in equivalence testing, which examines whether the difference between means is smaller than a smallest effect size of interest by testing two one-sided null hypotheses (i.e., that the difference is at or above the upper equivalence bound, and that it is at or below the lower equivalence bound). If both hypotheses are rejected, the difference lies within the bounds and the two means can be treated as equivalent. Here we report results of consumer testing in which likelihood of use, appeal, and risk perceptions were measured in a large sample (n=4,720) of current, former, and never tobacco users. The findings demonstrate that traditional t-tests and equivalence tests generally yield consistent results (i.e., significant differences are not equivalent), but that the two types of tests diverge when: (a) sample sizes are very large and effect sizes are very small (i.e., differences are significant yet equivalent); or (b) sample sizes are small and effect sizes are large (i.e., differences are neither significant nor equivalent). We consider how to interpret such discrepant findings in the context of SE product applications.
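The two-null-hypothesis procedure described above can be illustrated with a minimal sketch. It is not the analysis used in the study. It assumes a large-sample normal approximation in place of the t distribution, and the function name tost_two_sample, the equivalence bounds, and the data are all hypothetical values chosen only to reproduce the two divergent cases.

```python
from math import sqrt
from statistics import NormalDist, mean, variance


def tost_two_sample(x, y, low, high, alpha=0.05):
    """Two one-sided tests (TOST) for the difference in means of x and y.

    Assumption: a normal (z) approximation is used, which is only
    reasonable for large samples; a t-based version would be more
    appropriate for small ones. `low` and `high` are the equivalence
    bounds (the smallest effect size of interest), set by the analyst.
    """
    diff = mean(x) - mean(y)
    se = sqrt(variance(x) / len(x) + variance(y) / len(y))
    z = NormalDist()
    # Null 1: diff <= low. Rejected when diff is far enough above `low`.
    p_lower = 1 - z.cdf((diff - low) / se)
    # Null 2: diff >= high. Rejected when diff is far enough below `high`.
    p_upper = z.cdf((diff - high) / se)
    # Equivalence requires rejecting both nulls, so report the larger p.
    p_tost = max(p_lower, p_upper)
    # Conventional two-sided test of "no difference" for comparison.
    p_diff = 2 * (1 - z.cdf(abs(diff) / se))
    return {
        "diff": diff,
        "p_diff": p_diff,
        "significant": p_diff < alpha,
        "p_tost": p_tost,
        "equivalent": p_tost < alpha,
    }


# Case (a): very large samples, very small difference (made-up data).
y_large = [0.0, 2.0] * 5000
x_large = [v + 0.05 for v in y_large]
large = tost_two_sample(x_large, y_large, low=-0.2, high=0.2)

# Case (b): small samples, larger difference (made-up data).
small = tost_two_sample([1, 3, 5, 7], [2, 4, 6, 9], low=-0.5, high=0.5)

print(large["significant"], large["equivalent"])
print(small["significant"], small["equivalent"])
```

In this sketch the large-sample comparison is flagged as both significantly different and equivalent, while the small-sample comparison is neither, which mirrors cases (a) and (b) above. The outcome depends heavily on the chosen bounds, so they should be justified before the data are examined.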