Robert's Stochastic thoughts: The Jeffreys prior

Saturday, November 24, 2018

The Jeffreys prior

Over at Twitter I learned a lot. I claimed (and claim) that there is no such thing as an uninformative prior. I also claim that the penalty functions multiplied by likelihoods and called priors are not priors. This led to a debate which was as uninformative as prior debates on the topic. A lot of my obsessions are semantic.

I was also taught about someone called Harold Jeffreys who presented something which he called a prior. OK so Wikipedia taught me (Twitter being unsuited for explaining things to me (or anyone)). His prior is proportional to the square root of the determinant of the Fisher information matrix. The Fisher information matrix is -1 times the second derivative (Hessian) of the expected log likelihood with respect to a parameter vector theta, evaluated at the true theta, where the expectation is taken over data generated by the true theta. It is also the variance covariance matrix of the gradient of the log likelihood at the true theta (under the usual regularity conditions the two matrices are identical).
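To make the two definitions concrete, here is a minimal sketch for a single Bernoulli(theta) observation. The model choice and all function names are my own assumptions for illustration, not anything from the Wikipedia article. It computes the Fisher information once as minus the expected second derivative and once as the variance of the gradient, then takes the square root to get the (unnormalized) Jeffreys prior.

```python
import math

def score(x, theta):
    """Derivative of the Bernoulli log likelihood x*log(theta) + (1-x)*log(1-theta)."""
    return x / theta - (1 - x) / (1 - theta)

def second_derivative(x, theta):
    """Second derivative of the same log likelihood with respect to theta."""
    return -x / theta**2 - (1 - x) / (1 - theta)**2

def expectation(f, theta):
    """Expected value of f(x, theta) when x is Bernoulli(theta)."""
    return theta * f(1, theta) + (1 - theta) * f(0, theta)

def fisher_from_hessian(theta):
    """Minus the expected second derivative of the log likelihood."""
    return -expectation(second_derivative, theta)

def fisher_from_score(theta):
    """Variance of the score (its mean is zero at the true theta)."""
    mean = expectation(score, theta)
    return expectation(lambda x, t: (score(x, t) - mean) ** 2, theta)

def jeffreys_unnormalized(theta):
    """Square root of the Fisher information (a scalar here, so no determinant)."""
    return math.sqrt(fisher_from_hessian(theta))

if __name__ == "__main__":
    for theta in (0.1, 0.3, 0.5, 0.9):
        print(theta, fisher_from_hessian(theta), fisher_from_score(theta),
              jeffreys_unnormalized(theta))
```

In this example both routes give 1/(theta(1-theta)), so the penalty is proportional to 1/sqrt(theta(1-theta)), which piles weight near 0 and 1.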

The Fisher information matrix is a function of theta. A penalty which depends on the Fisher information matrix is a function of theta. It can be called a prior (I reserve the term for sincere beliefs).

The point of Jeffreys's prior is that it is invariant under any reparametrization of the model. If phi = g(theta) and g is one to one, then the posterior distribution of phi given Jeffreys's prior on phi will imply exactly the same probabilities of any observable event as the posterior distribution of theta given Jeffreys's prior on theta.

This is true because if theta is distributed according to Jeffreys's prior on theta, and phi = g(theta), then phi is distributed according to Jeffreys's prior on phi.

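The gradient of the expected log likelihood with respect to theta is the gradient with respect to phi times the Jacobian of g. So the Fisher information matrix for theta is the Jacobian transposed times the Fisher information matrix for phi times the Jacobian, and the square root of its determinant is the square root of the determinant of the Fisher information matrix for phi times the absolute value of the determinant of the Jacobian. This means that Jeffreys's prior transforms the way probability densities do and the Jeffreys prior on theta implies the same distribution of phi as the Jeffreys prior on phi.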
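A quick numerical check of this, as a sketch under my own assumptions: a single Bernoulli(theta) observation and the reparametrization phi = log(theta/(1-theta)). The function names are hypothetical. The Jeffreys prior on phi is computed directly from the log likelihood written in terms of phi, and compared with the Jeffreys prior on theta pushed through the usual change of variables formula.

```python
import math

def sigmoid(phi):
    """Inverse of the logit map: theta = 1 / (1 + exp(-phi))."""
    return 1.0 / (1.0 + math.exp(-phi))

def jeffreys_theta(theta):
    """Unnormalized Jeffreys prior on theta: sqrt of Fisher info 1/(theta(1-theta))."""
    return 1.0 / math.sqrt(theta * (1.0 - theta))

def fisher_phi(phi):
    """Fisher information in phi, as the variance of the score x - sigmoid(phi)."""
    p = sigmoid(phi)
    return p * (1.0 - p) ** 2 + (1.0 - p) * (0.0 - p) ** 2

def jeffreys_phi(phi):
    """Unnormalized Jeffreys prior computed directly in the phi parametrization."""
    return math.sqrt(fisher_phi(phi))

def jeffreys_theta_pushed_to_phi(phi):
    """Density of phi implied by the theta prior: p(theta(phi)) * |d theta / d phi|."""
    theta = sigmoid(phi)
    return jeffreys_theta(theta) * theta * (1.0 - theta)

if __name__ == "__main__":
    for phi in (-3.0, -1.0, 0.0, 0.5, 2.0):
        print(phi, jeffreys_phi(phi), jeffreys_theta_pushed_to_phi(phi))
```

The two columns agree at every phi, which is all the invariance claim amounts to in this one-parameter case.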

I am quite sure this is simply because the gradient of the expected log likelihood with respect to theta is a gradient of a scalar valued function of theta. For any scalar valued h(theta) I think the square root of (the gradient of h)(the gradient of h)' would work just as well. For example, if one used the gradient of the likelihood rather than the log likelihood, I think the resulting prior would be invariant as well.
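Here is a sketch that checks this guess in one example only, so it is not a proof. My assumptions: a single Bernoulli(theta) observation, phi = log(theta/(1-theta)), and the penalty taken as the square root of the expected squared derivative of the likelihood (not the log likelihood), with the expectation over x. All names are hypothetical.

```python
import math

def sigmoid(phi):
    """theta as a function of phi = log(theta / (1 - theta))."""
    return 1.0 / (1.0 + math.exp(-phi))

def likelihood_grad_theta(x, theta):
    """d/d theta of the likelihood, which is theta if x == 1 and 1 - theta if x == 0."""
    return 1.0 if x == 1 else -1.0

def likelihood_grad_phi(x, phi):
    """d/d phi of the same likelihood written as a function of phi."""
    p = sigmoid(phi)
    slope = p * (1.0 - p)  # d theta / d phi
    return slope if x == 1 else -slope

def root_expected_square(grad, param, prob_one):
    """sqrt of E[grad(x, param)^2] with x equal to 1 with probability prob_one."""
    return math.sqrt(prob_one * grad(1, param) ** 2
                     + (1.0 - prob_one) * grad(0, param) ** 2)

def penalty_theta(theta):
    """Penalty built from the likelihood gradient in the theta parametrization."""
    return root_expected_square(likelihood_grad_theta, theta, theta)

def penalty_phi(phi):
    """The same construction carried out directly in the phi parametrization."""
    return root_expected_square(likelihood_grad_phi, phi, sigmoid(phi))

def penalty_theta_pushed_to_phi(phi):
    """Density of phi implied by penalty_theta via the change of variables formula."""
    theta = sigmoid(phi)
    return penalty_theta(theta) * theta * (1.0 - theta)

if __name__ == "__main__":
    for phi in (-2.0, 0.0, 1.5):
        print(phi, penalty_phi(phi), penalty_theta_pushed_to_phi(phi))
```

In this example the construction in theta gives a constant, so it is just the flat prior on theta, and it transforms to phi the way a density should.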

Now except for the log likelihood, the expected Hessian (second derivative) is not equal to minus the expected product of the gradient and the gradient prime. That implies that for every h() except for the log likelihood there are two invariant priors: the square root of the determinant of the expected value of (the gradient times the gradient prime) and the square root of the determinant of the expected value of the second derivative.

I think this means that the set of invariant priors is basically about as large as the set of possible probability distributions of theta. Given a prior over theta, invariance implies a prior over any one to one function of theta, but this seems to me to be a statement about how to transform priors when one reparametrizes (which is just the formula for calculating the probability density of a function of a variable with a known probability density).

The log likelihood is a very popular function of parameters and data, but I see no particular reason why a distribution calculated using the log likelihood is more plausible than any other distribution. I don't see any particular appeal of the Jeffreys prior. I think one does just as well by choosing a parametrization and assuming a flat distribution for that parametrization.

I don't think I have ever seen the Jeffreys prior, that is, I don't think I have ever seen it used.
