Robert's Stochastic thoughts: 11/1/18

Saturday, November 24, 2018

The Jeffreys prior

Over at Twitter I learned a lot. I claimed (and claim) that there is no such thing as an uninformative prior. I also claim that the penalty functions multiplied by likelihoods and called priors are not priors. This led to a debate which was as uninformative as prior debates on the topic. A lot of my obsessions are semantic.

I was also taught about someone called Harold Jeffreys who presented something which he called a prior. OK so Wikipedia taught me (Twitter being unsuited for explaining things to me (or anyone)). His prior is proportional to the square root of the determinant of the Fisher information matrix. The Fisher information matrix is -1 times the second derivative (the Hessian) of the expected log likelihood with respect to a parameter vector theta, evaluated at the true theta. It is also the variance covariance matrix of the gradient of the log likelihood at the true theta (the two matrices are identical).
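To make the two definitions concrete, here is a minimal sketch for the simplest case I can think of, a single Bernoulli trial with success probability p. The model and all the function names are my own choices for illustration, not anything from Jeffreys or Wikipedia. The expectations are exact sums over x in {0, 1}, so no simulation is involved.

```python
import math

def score(x, p):
    # First derivative of the Bernoulli log likelihood
    # log f(x; p) = x*log(p) + (1 - x)*log(1 - p) with respect to p.
    return x / p - (1 - x) / (1 - p)

def second_derivative(x, p):
    # Second derivative of the same log likelihood with respect to p.
    return -x / p**2 - (1 - x) / (1 - p) ** 2

def expectation(fn, p):
    # Exact expectation over x in {0, 1} when the true parameter is p.
    return (1 - p) * fn(0, p) + p * fn(1, p)

def fisher_from_hessian(p):
    # Definition 1: minus the expected second derivative.
    return -expectation(second_derivative, p)

def fisher_from_score(p):
    # Definition 2: the variance of the score (its mean is zero).
    mean = expectation(score, p)
    return expectation(lambda x, q: (score(x, q) - mean) ** 2, p)

def jeffreys_unnormalized(p):
    # Square root of the (1x1) Fisher information, up to a constant.
    return math.sqrt(fisher_from_hessian(p))

if __name__ == "__main__":
    for p in (0.1, 0.3, 0.5):
        print(p, fisher_from_hessian(p), fisher_from_score(p), jeffreys_unnormalized(p))
```

In this example both definitions give 1/(p(1-p)), so the penalty is proportional to p^(-1/2) (1-p)^(-1/2), which is the shape of a Beta(1/2, 1/2) density.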

The Fisher information matrix is a function of theta. A penalty which depends on the Fisher information matrix is a function of theta. It can be called a prior (I reserve the term for sincere beliefs).

The point of Jeffreys's prior is that it is invariant under any reparametrization of the model. If phi = g(theta) and g is one to one, then the posterior distribution of phi given Jeffreys's prior on phi will imply exactly the same probabilities of any observable event as the posterior distribution of theta given Jeffreys's prior on theta.

This is true because if theta is distributed according to Jeffreys's prior on theta, and phi = g(theta), then phi is distributed according to Jeffreys's prior on phi.
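A quick numerical check of that statement, again for a Bernoulli trial and with phi taken to be the log odds of p. The choice of model, the choice of g, and the function names are all assumptions I am making for the sake of a small example. The sketch computes the Jeffreys density on phi directly from the likelihood written in terms of phi, then compares it with the Jeffreys density on p pushed through the usual change of variables formula.

```python
import math

def sigmoid(phi):
    # Inverse of the log-odds map: p = 1 / (1 + exp(-phi)).
    return 1.0 / (1.0 + math.exp(-phi))

def jeffreys_p(p):
    # Unnormalized Jeffreys density in p: sqrt of I(p) = 1 / (p (1 - p)).
    return math.sqrt(1.0 / (p * (1.0 - p)))

def fisher_phi(phi):
    # In log odds, log f(x; phi) = x*phi - log(1 + exp(phi)),
    # so the score is x - sigmoid(phi). Its mean is zero, and its
    # variance is computed exactly over x in {0, 1}.
    p = sigmoid(phi)
    return (1 - p) * (0 - p) ** 2 + p * (1 - p) ** 2

def jeffreys_phi(phi):
    # Unnormalized Jeffreys density computed directly in phi.
    return math.sqrt(fisher_phi(phi))

def transformed_jeffreys(phi):
    # Jeffreys density in p, transformed to phi:
    # density in phi = density in p times |dp/dphi|.
    p = sigmoid(phi)
    dp_dphi = p * (1 - p)
    return jeffreys_p(p) * dp_dphi

if __name__ == "__main__":
    for phi in (-2.0, 0.0, 1.5):
        print(phi, jeffreys_phi(phi), transformed_jeffreys(phi))
```

The two columns agree, which is all that invariance means in this one dimensional case.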

The gradient of the expected log likelihood with respect to theta is the gradient with respect to phi times the Jacobian of g. This means that Jeffreys's prior transforms the way probability densities do, and the Jeffreys prior on theta implies the same distribution of phi as the Jeffreys prior on phi.
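Written out, the step is just the chain rule followed by a determinant. The notation here is my own: l is the log likelihood, J is the Jacobian of g, and I_theta and I_phi are the Fisher information matrices in the two parametrizations.

```latex
\begin{aligned}
\phi &= g(\theta), \qquad J = \frac{\partial \phi}{\partial \theta}, \\
\nabla_\theta \ell &= J^{\top} \nabla_\phi \ell, \\
I_\theta(\theta) &= \mathrm{E}\!\left[\nabla_\theta \ell \, (\nabla_\theta \ell)^{\top}\right]
  = J^{\top} I_\phi(\phi)\, J, \\
\sqrt{\det I_\theta(\theta)} &= \lvert \det J \rvert \, \sqrt{\det I_\phi(\phi)} .
\end{aligned}
```

The last line has the same form as the rule for densities, under which the density of theta equals the density of phi at g(theta) times |det J|.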

I am quite sure this is simply because the gradient of the expected log likelihood with respect to theta is a gradient of a scalar valued function of theta. For any scalar valued h(theta) I think the square root of (the gradient of h)(the gradient of h)' would work just as well. For example, if one used the gradient of the likelihood rather than the log likelihood, I think the resulting prior would be invariant as well.

Now, except for the expected log likelihood, the Hessian (second derivative) is not equal to minus the product of the gradient and the gradient prime. That implies that for every h() except for the log likelihood there are two invariant priors: the square root of the determinant of the expected value of (the gradient times the gradient prime) and the square root of the determinant of the expected value of the second derivative.

I think this means that the set of invariant priors is basically about as large as the set of possible probability distributions of theta. Given a prior over theta, invariance implies a prior over any one to one function of theta, but this seems to me to be a statement about how to transform priors when one reparametrizes (which is just the formula for calculating the probability density of a function of a variable with a known probability density).

The log likelihood is a very popular function of parameters and data, but I see no particular reason why a distribution calculated using the log likelihood is more plausible than any other distribution. I don't see any particular appeal of Jeffreys's prior. I think one does just as well by choosing a parametrization and assuming a flat distribution for that parametrization.

I don't think I have ever seen Jeffreys's prior, that is, I don't think I have ever seen it used.

Friday, November 16, 2018

I disagree with Jennifer Rubin

Conservatives object to the Washington Post defining never-ever-ever Trumper Jennifer Rubin as a conservative. Reflecting, I had to admit that I hadn't disagreed with anything she wrote for months. Now, finally, I do. But, sadly, this isn't evidence that she is still a conservative. She has clearly become a radical centrist third way mugwump (RC3WM).

She argues that the 2018 blue wave shows that Democrats should reject Bernie Sanders and rely on a poll conducted by "The Third Way".

I think this is nonsense consisting entirely of setting up an oxymoronic straw man and pretending that values shared by conservatives, liberals, centrists, progressives, socialists, and fascists belong to conservatives.

Her column.

My comment

I don't see any evidence that people rejected Sen. Sanders's policy proposals, which are actually fairly moderate. It is very easy to get the issue poll results one wishes by choosing the questions. Notably, the ACA is only moderately popular (50% approval) while Medicare for all has 70% approval (recently including Donald Trump).

On entitlements the moderate centrist approach is to achieve trust fund solvency with balanced tax increases and benefit cuts. The vast majority of the public wants more generous pensions, expanded Medicaid and an increased Medicare budget.

The case that Americans are conservative is that they believe in hard work (and the family). Notably, this is the position of mainstream and left wing Democrats. Conservatives have convinced each other that non-conservatives are Leninist hippies. Totally aside from the fact that "Leninist hippy" is an oxymoron, very few people are Leninists and very few are hippies.

Conservatives point to the free love socialist demon Nancy Pelosi, who is the mother of 5 children. I'm sure she loves them dearly but I think it has something to do with her deep religious faith which prevented her from using artificial birth control (see flustered Stephen Colbert). I don't know if she has changed her views, since she has had perfectly natural menopausal birth control for about 3 decades by now.

The idea that conservatives have something to offer which a majority of Americans want is based on falsehoods about the alternative (which have convinced a solid minority of Americans but which are, nonetheless, false).

Elizabeth Warren, Kamala Harris, Cory Booker, and Sherrod Brown are Bernie Sanders with better manners. They have the same policy proposals. I really hope one of them is elected in 2020.