That Hideous Strength
Back to Kirsch et al. A study which claimed that antidepressants provide clinically significant benefits only for the severely depressed has received a lot of attention: 33 Google News hits and lots of plain Google hits.
Much commentary completely missed the "clinically significant" qualifier and, in fact, claimed that Kirsch et al. had shown that patients who received a placebo did "just as well" as patients who received antidepressants. That is a null hypothesis which Kirsch et al. overwhelmingly rejected.
The huge amount of attention received by the paper is, I think, entirely due to the appeal to the concept of "clinical significance" as authoritatively defined by the National Institute for Health and Clinical Excellence (NICE) in the UK. Hmm, where have I read that acronym before?
NICE declared that, to be clinically significant, a benefit had to be at least 3 points on the Hamilton Rating Scale for Depression (HRSD) or at least one half of one standard deviation of the changes in the HRSD in the treated subsample.
The first definition is arbitrary and, I think, nonsensical (if the only change were from "I think life is not worth living" to "I think life is worth living" that would be one point on the HRSD). However the second definition is much much more absurd.
The standard deviation of changes is very important for testing whether an apparent benefit could be due to chance. It is useful for constructing a confidence interval around estimated benefits. It is not useful for determining clinical significance.
I think an example should be sufficient to prove this. Assume there are two huge controlled trials, one of drug A and one of drug B. Each has a subsample of patients given a placebo. Their depression improves, with an average improvement in HRSD of 5 (in both trials). The standard deviation of changes in HRSD is 5 in both placebo subsamples. Change over standard deviation is 5/5 = 1.
Patients in the trial of drug A who received drug A have an average improvement of 8 with a standard deviation of 5. Change/standard deviation is 8/5 = 1.6, which is 0.6 higher than 5/5 = 1. Since 0.6 > 0.5, it is concluded that drug A provides a clinically significant benefit.
For each patient who received drug A there is a matching patient who received drug B (convenient coincidence). For 90% of patients who received drug A, the matching patient on drug B had exactly the same change in HRSD. For the other 10% of patients who got drug A (and who had the same average benefit as the 90%), the matching patient on drug B had an improvement greater by 20 HRSD points.
The mean change with drug B is 8 + 0.1*20 = 10.
Given the assumption in parentheses, the variance of changes for patients who received drug B is 25 + (400)(0.1)(0.9) = 61, so the standard deviation is 7.81.
Change/standard deviation is 10/7.81 = 1.28. 1.28 - 1 = 0.28 < 0.5, so NICE declares that drug B does Not have a clinically significant benefit.
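The arithmetic for the two hypothetical drugs can be checked with a few lines of Python. This is only a sketch of the example as stated here: the helper name standardized_change and the constant THRESHOLD are made up for illustration, and the "gain" is the difference between the drug's and the placebo's change/standard deviation ratios, as in the text.

```python
import math

def standardized_change(mean_change, sd_change):
    """Mean HRSD change divided by the subsample standard deviation."""
    return mean_change / sd_change

THRESHOLD = 0.5  # half a standard deviation (illustrative constant name)

# Numbers taken from the example in the text.
placebo_mean, placebo_sd = 5.0, 5.0
drug_a_mean, drug_a_sd = 8.0, 5.0

# Drug B: same changes as drug A, plus a 20-point bonus for 10% of patients.
# The bonus group has the same average benefit as the rest, so the bonus is
# uncorrelated with the base change and the variances simply add.
p_bonus, bonus = 0.1, 20.0
drug_b_mean = drug_a_mean + p_bonus * bonus
drug_b_var = drug_a_sd ** 2 + bonus ** 2 * p_bonus * (1 - p_bonus)
drug_b_sd = math.sqrt(drug_b_var)

placebo_ratio = standardized_change(placebo_mean, placebo_sd)
gain_a = standardized_change(drug_a_mean, drug_a_sd) - placebo_ratio
gain_b = standardized_change(drug_b_mean, drug_b_sd) - placebo_ratio

print(f"drug A: gain {gain_a:.2f}, passes: {gain_a >= THRESHOLD}")
print(f"drug B: mean {drug_b_mean:.0f}, sd {drug_b_sd:.2f}, "
      f"gain {gain_b:.2f}, passes: {gain_b >= THRESHOLD}")
```

Drug A clears the half standard deviation bar and drug B does not, even though drug B has the larger mean improvement.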
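But wait a minute: the distribution of improvement with drug B first order stochastically dominates the distribution of improvement with drug A. Every patient on drug B did at least as well as the matching patient on drug A. The "problem" with drug B is that a few patients had a wonderful experience. This added proportionally more to the standard deviation than to the mean: the 20 point bonus adds 20(0.1) = 2 to the mean but 400(0.1)(0.9) = 36 to the variance, and the square root of 0.1 is much bigger than 0.1.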
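A small simulation makes the dominance claim concrete. This is a sketch under assumptions that are not in the example itself: changes on drug A are drawn from a normal distribution with mean 8 and standard deviation 5 (the text specifies only the mean and the standard deviation), the 10% bonus group is chosen at random, and the sample size, seed and threshold grid are arbitrary.

```python
import random
import statistics
from bisect import bisect_right

rng = random.Random(0)  # fixed seed so the run is repeatable
n = 50_000

# Assumed shape: normal changes on drug A (only mean 8 and sd 5 are given).
drug_a = [rng.gauss(8.0, 5.0) for _ in range(n)]
# Matched drug B patients: identical change, plus 20 points for a random 10%.
drug_b = [x + (20.0 if rng.random() < 0.1 else 0.0) for x in drug_a]

sorted_a = sorted(drug_a)
sorted_b = sorted(drug_b)

def ecdf(sorted_sample, t):
    """Fraction of the sample with change <= t."""
    return bisect_right(sorted_sample, t) / len(sorted_sample)

# First order stochastic dominance: at every threshold, drug B leaves no
# larger a share of patients at or below that improvement than drug A does.
dominates = all(ecdf(sorted_b, t) <= ecdf(sorted_a, t) for t in range(-20, 51))

ratio_a = statistics.fmean(drug_a) / statistics.pstdev(drug_a)
ratio_b = statistics.fmean(drug_b) / statistics.pstdev(drug_b)

print(f"B dominates A: {dominates}")
print(f"change/sd for A: {ratio_a:.2f}, for B: {ratio_b:.2f}")
```

Because each simulated drug B patient does at least as well as the matched drug A patient, the dominance check holds by construction, while the change/standard deviation ratio still comes out lower for drug B.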
Using the mean divided by the subsample standard deviation to assess clinical significance is utterly idiotic.