Personal chat with pj
updated
Pyjamas in bananas and I are trying to figure out what Kirsch et al. (2008) did. I have a thought on one of the deepest mysteries: why are their confidence intervals on d so huge?
Notation: change is the average change on the Hamilton scale of depression for patients who got an SSRI; pchange is the average change for those who got a placebo. n is the number of patients who got SSRIs (pn the number who got placebo).
d is change divided by the estimated standard deviation of the changes in the study, so the t-statistic for testing the hypothesis that the true expected change is zero is d times the square root of n; pd is the same for placebo. range is the difference between the upper and lower bounds of the 95% confidence interval for d; prange is the same for pd.
rnrange = range times the square root of n; pnrprange = prange times the square root of pn.
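To fix ideas, here is how the constructed variables work for one hypothetical drug arm. All the numbers below are made up for illustration; they are not from Kirsch et al. (2008).

```python
import math

# hypothetical study: made-up summary statistics, not from Kirsch et al 2008
n = 120                      # patients on the SSRI
change = 9.6                 # mean Hamilton-scale change in the drug arm
sdhat = 7.5                  # estimated sd of the changes
lo, hi = 0.85, 1.75          # made-up 95% confidence bounds on d

d = change / sdhat           # standardized mean change
t = d * math.sqrt(n)         # t-statistic for H0: true expected change = 0
range_d = hi - lo            # "range": width of the 95% CI for d
rnrange = range_d * math.sqrt(n)

print(round(d, 2), round(t, 1), round(rnrange, 2))  # 1.28 14.0 9.86
```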
Some more regressions:
. reg rnrange change
Source | SS df MS Number of obs = 35
-------------+------------------------------ F( 1, 33) = 39.43
Model | 4.65045538 1 4.65045538 Prob > F = 0.0000
Residual | 3.89223357 33 .117946472 R-squared = 0.5444
-------------+------------------------------ Adj R-squared = 0.5306
Total | 8.54268895 34 .251255557 Root MSE = .34343
------------------------------------------------------------------------------
rnrange | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
change | .1834387 .0292136 6.28 0.000 .1240031 .2428743
_cons | 3.590184 .3110234 11.54 0.000 2.957402 4.222966
------------------------------------------------------------------------------
My, isn't that a large t-statistic.
Also
. reg pnrprange pchange
Source | SS df MS Number of obs = 35
-------------+------------------------------ F( 1, 33) = 81.05
Model | 3.92402808 1 3.92402808 Prob > F = 0.0000
Residual | 1.59773718 33 .048416278 R-squared = 0.7106
-------------+------------------------------ Adj R-squared = 0.7019
Total | 5.52176525 34 .16240486 Root MSE = .22004
------------------------------------------------------------------------------
pnrprange | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
pchange | .1676368 .0186208 9.00 0.000 .1297524 .2055211
_cons | 3.652399 .1451204 25.17 0.000 3.357149 3.947648
------------------------------------------------------------------------------
Oh my, that is even huger. Oh, I guess I should control for the size of the sample, since for small samples the t-distribution has much fatter tails than the normal. OK:
. reg pnrprange pchange pn
Source | SS df MS Number of obs = 35
-------------+------------------------------ F( 2, 32) = 54.22
Model | 4.2636823 2 2.13184115 Prob > F = 0.0000
Residual | 1.25808295 32 .039315092 R-squared = 0.7722
-------------+------------------------------ Adj R-squared = 0.7579
Total | 5.52176525 34 .16240486 Root MSE = .19828
------------------------------------------------------------------------------
pnrprange | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
pchange | .1825164 .0175267 10.41 0.000 .1468158 .2182171
pn | -.0035427 .0012053 -2.94 0.006 -.0059978 -.0010876
_cons | 3.726652 .1331891 27.98 0.000 3.455355 3.99795
------------------------------------------------------------------------------
I stress the placebo results because I know that the sample variance of the change for those who got the placebo is not very correlated with pchange. Still, pchange has a t-statistic which is, to put it mildly, HUGE.
How about weighting by the number who received the placebo in the trial?
. reg pnrprange pchange pn [pweight = pn]
(sum of wgt is 1.8410e+03)
Linear regression Number of obs = 35
F( 2, 32) = 87.57
Prob > F = 0.0000
R-squared = 0.7761
Root MSE = .17404
------------------------------------------------------------------------------
| Robust
pnrprange | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
pchange | .1833535 .01469 12.48 0.000 .1534309 .2132761
pn | -.0029935 .0004081 -7.33 0.000 -.0038248 -.0021622
_cons | 3.682386 .1179811 31.21 0.000 3.442067 3.922706
------------------------------------------------------------------------------
OK, it is clear that a larger average change implies a broader confidence interval for d.
Hmmm, why? If the true d were zero, then d times the square root of n would have a t-distribution, and the width of the confidence interval of d*n^0.5 would rapidly converge to 2*(1.96) = 3.92.
Clearly the confidence intervals are calculated using a non-central t-like distribution, that is, calculated using the estimated change, not under the null of change = 0.
Hmm, d = change/sdhat. As n goes to infinity, change goes to the population mean (m) and sdhat goes to sd. Linearizing around m/sd gives
d - m/sd approximately equal to (change - m)/sd - m*(sdhat - sd)/sd^2
If this is calculated substituting estimates for the nuisance parameters, it becomes
(change - m)/sdhat - change*(sdhat - sd)/sd^2.
Ohooo, that explains it: there is a term proportional to change. Now if I knew the plim of n*((sdhat - sd)/sd)^2 I could recalculate the intervals. Oddly, Gauss tells me it is 0.98, so I guess it is really 1.
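A quick numerical check of the linearization, using normal data with a hypothetical population mean and sd (the numbers are made up, not from any trial):

```python
import numpy as np

rng = np.random.default_rng(1)
m, sd, n = 8.0, 7.0, 500            # hypothetical population mean and sd of changes
x = rng.normal(m, sd, size=n)
change = x.mean()
sdhat = x.std(ddof=1)
d = change / sdhat

exact = d - m / sd
# the linearization: (change - m)/sd - m*(sdhat - sd)/sd^2
approx = (change - m) / sd - m * (sdhat - sd) / sd**2
print(abs(exact - approx))          # small: the approximation error is higher order in 1/n
```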
Gauss program:

i = 0;
c = 0;
niter = 100000;
do while i < niter;
  i = i + 1;
  a = rndn(10000,1);
  c = c + ((a'a)^0.5 - 1)^2/10000;
endo;
c = c/niter;
c;

Gauss output:

0.98001977
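The same simulation in Python, as a direct translation of the Gauss program above (fewer iterations than the Gauss run, for speed):

```python
import numpy as np

rng = np.random.default_rng(0)
niter = 300                  # the Gauss run used 100000; fewer here for speed
nsamp = 10000
c = 0.0
for _ in range(niter):
    a = rng.standard_normal(nsamp)
    # same quantity as the Gauss line: ((a'a)^0.5 - 1)^2 / 10000
    c += (np.sqrt(a @ a) - 1.0) ** 2 / nsamp
c /= niter
print(c)                     # close to the Gauss output of 0.98
```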
So for large n, ((sdhat - sd)/sd)*n^0.5 is roughly a unit normal, as is ((sdhat - sd)/sdhat)*n^0.5; so ((sdhat - sd)/sdhat^2)*n^0.5 converges to a normal with variance sd^(-2).
((change - m)/sdhat)*n^0.5 converges to a unit normal, so (d - m/sd)*n^0.5 converges to a normal with variance 1 + (m/sd)^2.
A very rough sample approximation would be a normal with variance 1 + d^2, so this crude confidence interval would be
range = 3.92*(1 + d^2)^0.5/n^0.5
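Plugging in some made-up values of d and n shows how the crude interval behaves: the width grows with d at fixed n, which is the pattern in the regressions above.

```python
import math

def crude_range(d, n):
    """Width of the crude 95% interval for d: 3.92*(1 + d^2)^0.5 / n^0.5."""
    return 3.92 * math.sqrt(1.0 + d * d) / math.sqrt(n)

# hypothetical (d, n) pairs, chosen only for illustration
for d, n in [(0.0, 100), (0.5, 100), (1.0, 100), (1.0, 400)]:
    print(d, n, round(crude_range(d, n), 3))
```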
OK, back to Stata.
. gen guess = 3.92*(1+d^2)^0.5
. reg rnrange guess
Source | SS df MS Number of obs = 35
-------------+------------------------------ F( 1, 33) = 193.71
Model | 7.29918489 1 7.29918489 Prob > F = 0.0000
Residual | 1.24350406 33 .037681941 R-squared = 0.8544
-------------+------------------------------ Adj R-squared = 0.8500
Total | 8.54268895 34 .251255557 Root MSE = .19412
------------------------------------------------------------------------------
rnrange | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
guess | .6250211 .044908 13.92 0.000 .533655 .7163872
_cons | 1.534459 .2874405 5.34 0.000 .9496573 2.119261
------------------------------------------------------------------------------
. reg rnrange guess [pweight = n]
(sum of wgt is 3.2920e+03)
Linear regression Number of obs = 35
F( 1, 33) = 412.96
Prob > F = 0.0000
R-squared = 0.8869
Root MSE = .13623
------------------------------------------------------------------------------
| Robust
rnrange | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
guess | .6686736 .0329047 20.32 0.000 .6017285 .7356187
_cons | 1.12118 .2094582 5.35 0.000 .6950337 1.547325
------------------------------------------------------------------------------
. reg rnrange guess if n>100
Source | SS df MS Number of obs = 10
-------------+------------------------------ F( 1, 8) = 198.03
Model | .678477025 1 .678477025 Prob > F = 0.0000
Residual | .027409729 8 .003426216 R-squared = 0.9612
-------------+------------------------------ Adj R-squared = 0.9563
Total | .705886753 9 .078431861 Root MSE = .05853
------------------------------------------------------------------------------
rnrange | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
guess | .6539925 .0464743 14.07 0.000 .5468227 .7611624
_cons | 1.159944 .2917518 3.98 0.004 .4871627 1.832725
------------------------------------------------------------------------------
Well, that is a large R-squared, isn't it? I'm restricting to large n because my formula is asymptotic. However, the coefficient is not 1 at all, is it?
It looks as if the confidence interval isn't +/- 1.96 standard deviations. Dividing rnrange by guess, averaging, and multiplying by 1.96, I get that the confidence interval is +/- 1.7 standard deviations of d. In other words, it is a 90% interval, not a 95% interval.
The fit isn't perfect, because I used a normal approximation, and the range is larger than my guess for small samples, as it should be. Maybe they did something like a t-distribution with n-1 degrees of freedom plus a normal with variance d^2, or something. (Or maybe the variance of sdhat is treated as sd^2/(2n) rather than sd^2/n, which would make the width proportional to (1 + d^2/2)^0.5; for large d the ratio (1 + d^2/2)^0.5/(1 + d^2)^0.5 is about 0.71, close to the estimated coefficient.)
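The step from "+/- 1.7 standard deviations" to "a 90% interval" only needs the standard normal CDF, which can be written with the error function:

```python
import math

def norm_cdf(x):
    # standard normal CDF via the error function
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

z = 1.7                       # estimated half-width in standard deviations of d
coverage = 2.0 * norm_cdf(z) - 1.0
print(round(coverage, 3))     # about 0.91: closer to a 90% than a 95% interval
```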
Also, the difference between n^0.5 times the size of the interval and my guess isn't strictly a decreasing function of n. I guess this is due to rounding of d and of the limits of the interval.
Now, the squared size of the confidence interval should *NOT* be used to calculate a precision-weighted average of d: since the width increases with d, such an average would obviously be biased down.
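A small simulation of that bias, with a made-up common true effect and equal-size studies, weighting each study by the inverse squared width of the crude interval:

```python
import numpy as np

rng = np.random.default_rng(3)
true_d, n, studies = 0.5, 50, 2000   # hypothetical: common true effect, equal-size studies
dhat = np.empty(studies)
for i in range(studies):
    x = rng.normal(true_d, 1.0, size=n)   # population sd = 1, so m/sd = true_d
    dhat[i] = x.mean() / x.std(ddof=1)

width = 3.92 * np.sqrt(1.0 + dhat**2) / np.sqrt(n)   # CI width grows with dhat
w = 1.0 / width**2                                    # "precision" weights
weighted = np.sum(w * dhat) / np.sum(w)
unweighted = dhat.mean()
print(round(unweighted, 3), round(weighted, 3))       # the weighted average is pulled down
```

Because the width depends on dhat itself, studies that happened to draw a small dhat get more weight, which drags the weighted average below the unweighted one.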
Update:
Hmm, there is no reason for the confidence interval to be centered around d, given the skewed distribution of 1/sdhat. Is it?
duced is the upper limit minus d, and range is the upper limit minus the lower limit.
. reg duced range
Source | SS df MS Number of obs = 35
-------------+------------------------------ F( 1, 33) =55765.08
Model | 1.02594697 1 1.02594697 Prob > F = 0.0000
Residual | .000607123 33 .000018398 R-squared = 0.9994
-------------+------------------------------ Adj R-squared = 0.9994
Total | 1.02655409 34 .030192767 Root MSE = .00429
------------------------------------------------------------------------------
duced | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
range | .4999338 .0021171 236.15 0.000 .4956267 .504241
_cons | -.0006627 .0018018 -0.37 0.715 -.0043285 .003003
------------------------------------------------------------------------------
Now that is a really big t-statistic. The confidence interval is symmetric.
My guess is that the finiteness of the sample is taken into account when calculating the 90% interval of (change - m)/sdhat, but not when calculating the 90% interval of change*(sdhat - sd)/sd^2 (which is even further from normal in a given finite sample, but isn't a distribution written into a software package; so let's assume the skewed distribution is normal while insisting on the non-normality of the t-distribution).
Thus my new guess is that the interval is d +/- ((t(.95,n-1))^2 + (1.645*d)^2)^0.5/n^0.5,
where 1.645 = n(.95), the 95th percentile of the standard normal, and t(.95,n-1) is the 95th percentile of the t-distribution with n-1 degrees of freedom. This makes no particular sense, as it has nothing to do with the 90% interval of a t plus a normal.
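Written out for a hypothetical study, assuming (as with the earlier crude guess) that the half-width scales as 1/n^0.5. The t quantile is hard-coded from standard tables, since only one percentile is needed:

```python
import math

# hypothetical study; t95 is the 95th percentile of t with n-1 = 29 df, from tables
n, d = 30, 0.6
t95 = 1.699
z95 = 1.645                   # n(.95), the 95th percentile of the standard normal

half = math.sqrt(t95**2 + (z95 * d) ** 2) / math.sqrt(n)
lo, hi = d - half, d + half
print(round(lo, 3), round(hi, 3))   # 0.241 0.959
```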
And, in fact, my new guess works little better than my old guess.