Robert's Stochastic thoughts: 2024

Saturday, October 05, 2024

Treatment of Autoimmune Diseases

I have a thought so blindingly obvious that I don’t know why I haven’t read about it. First the thought then wondering why I haven’t found it written.

The idea is to treat with a chimera of an antibody to a tissue specific antigen and PDL1 (note how quickly it can be stated).

PDL1 (programmed death ligand 1) interacts with PD1 (programmed death 1) on killer T cells and natural killer cells. The acronyms suggest that these cells then die. However, it is now known that they live on without killing the cell which displays PDL1. Blocking this “checkpoint” is a major, Nobel Prize winning step towards effective immunotherapy of cancer.

Killing by lymphocytes is often a problem: type 1 diabetes, multiple sclerosis, celiac disease, ulcerative colitis and other less common disorders. It seems obvious that the PDL1/PD1 interaction could be very useful. A problem with immunosuppressants is that they leave the patient vulnerable to infections. They are still used; however, it clearly would be useful to focus the immunosuppression on the tissue where the immune response is causing trouble.

This should be easy if there is a monoclonal antibody which binds to that tissue. For example, pancreatic islet cells (beta cells) display a characteristic antigen, Zinc Transporter-8 (ZnT8). There is a monoclonal antibody to this antigen. It seems to me that a chimera of that antibody and PDL1 is worth exploring.

It is possible that by the time Diabetes is diagnosed it is too late (there are beta cells in the pancreases of people with type 1 diabetes but they may have survived by becoming irreversibly dormant). Also Islet cells for transplant might usefully be decorated with the chimera.

For multiple sclerosis and related diseases a myelin specific antibody might be useful. Such antibodies exist and create trouble when not attached to PDL1.

Directing Lymphokines to the desired cells.

I was looking for an earlier article when I found this, hot off the presses: "Engineered cytokine/antibody fusion proteins improve IL-2 delivery to pro-inflammatory cells and promote antitumor activity".

The earlier article discusses an antibody which binds to IL-2, does not block its action and, for some reason, causes it to bind more to inflammatory cells and less to TREGs.

I am interested in other ways to direct IL-2 and IL-15 to CD8 killer cells and NK cells. I propose starting with bifunctional antibodies – a 1960s technology which yields a gamma globulin with 2 different Fab components by breaking and remaking the disulfide bonds between Fc components. Here one Fab could be from the Leonard and bind to IL-2. Another possibility is a non blocking antibody to the sushi domain of IL-15 R alpha with IL-15 irreversibly bound to that, or a non blocking antibody to IL-15 itself.

In each case the Fab binds to an interleukin and does not block its action. The other Fab (in the very preliminary 60s tech experiment) is either anti-CD8, anti-CD56, or anti-NKG2A.

The logic of anti-CD8 is clear. It would direct that IL-2 or IL-15 to CD8 killer cells as opposed to the (most common) CD4 TREGs. Also, importantly, it would direct them to lymphocytes and not the walls of capillaries, reducing the dose limiting capillary leak toxicity for a given level of stimulation of the lymphocytes.

Similarly CD56 is characteristic of NK cells (and IL-15 has a very dramatic effect on NK cell proliferation).

The case for anti-NKG2A is a bit more complicated and interesting. NKG2A is found on NK cells and some CD8 cells. It is an inhibitory receptor which responds to HLA-E. Monalizumab is a monoclonal antibody (tested in clinical trials) which blocks NKG2A.

Wednesday, September 18, 2024

A Song of Ice and Fire

Just wasting time. I refer to the series of books and not to the TV series*.

1) Now public – after the first book, I was convinced that Jon Snow was the son of Lyanna Stark and Rhaegar Targaryen.

2) It is very clear that the knight of the laughing tree is Lyanna Stark. For one thing Rhaegar rode after that knight after that knight embarrassed the knights at the tournament. The travelling crannogman is, I think, a diversion.

3) Balerion the Dread II: I am sure that the black cat with one ear is the Targaryen pet cat nicknamed Balerion the Dread. The cat appears chased by Arya Stark, then attacks Tommen Baratheon. Arya escapes, finds the skulls including that of Balerion the Dread I and finds Varys plotting with the rich cheese guy (in the tunnel later used by Eddard Stark and Petyr Baelish). The cat is reported to have stolen a chicken from Tywin Lannister when he was at dinner with the crazy king. Reported outside of a window scaring Tommen later.

4) The horn of Joramun. In the mysterious bundle containing obsidian blades there is a horn which makes no sound. I am sure it is the horn which wakes giants.

5) It is fairly clear that ice-hands is Eddard Stark’s younger brother and that he left the bundle to be found as well as rescuing Samwell Tarly etc.

6) It is reasonably clear that the pork he brought for Brandon, Hodor and the crannog-siblings is long pork (human flesh).

7) In the house of the immortals Daenerys sees a massacre of people with wolves’ faces. Clearly the red wedding. She also sees a newborn baby. Parents say he will be a great conqueror and they will name him Aegon. They are Rhaegar and Lyanna. She sees someone with blue eyes in a boat. Clearly a wight coming from Hardhome.

8) The person born between salt and fire is Daenerys. She is also the younger and more beautiful queen about whom Cersei Lannister was warned.

9) The younger brother who is to strangle her is Jaime – minutes not years younger.

10) Daenerys must go East to go West. Also, she must pass through the Shadow beyond Asshai. A sword tempered in the blood of a beloved’s heart is alarming.

11) The dragon has 3 heads – Daenerys, Jon, and one more. Either Faegon is not fake, or Tyrion Lannister is, as rumored, a bastard of the crazy king.

12) Of course Sandor Clegane is the tall man digging graves on the isle of repentance, and the Hound, not he, is the angry man who died (why did I not realize this?).

13) I am pretty sure that Mance is not, in fact, dead.

Odd how I remember details but not names

*I think the parts of the series I watched were excellent – very different from but not inferior to the books – I did not watch the 7th season.

Wednesday, August 14, 2024

Does Autophagy slow aging and slow the progression of neuro-degenerative diseases ?

Warning: this is a reflection of my theoretical interest. A foolish person might mistake it for advice on how to delay aging. Do not take medical advice from me. I am an economist.

Autophagy means self eating. It refers to the process through which cells send things in the cell to lysosomes to be broken down to their components. I will mainly type about macro-autophagy, in which a fairly large volume is surrounded by a double membrane, called an autophagosome, which fuses with the lysosome. Autophagosomes are large enough to engulf entire mitochondria (the organelle which produces ATP using glucose and oxygen). This is important because defective mitochondria can produce reactive oxygen species (ROS) (think hydrogen peroxide) which damage cells. Autophagy also removes aggregates of proteins stuck to each other. This may be very important as such aggregates have a role in Alzheimer’s disease and probably also Parkinson’s disease and Huntington’s disease.

It is often argued that autophagy slows aging. This argument is largely based on the effect of molecules which promote autophagy slowing aging of rats and mice.

There is also evidence that autophagy prevents (or delays) neurodegenerative diseases. Here the evidence is quite strong. (Homozygous) genetic defects in proteins involved in autophagy cause increased risk of Parkinson’s disease and (different defects) amyotrophic lateral sclerosis.

This makes chemicals which stimulate autophagy extremely interesting. Unfortunately the best known (don’t eat this at home kids) is Rapamycin, which promotes autophagy by inactivating the Mammalian Target of Rapamycin complex 1 (mTORC1). Unfortunately this also makes Rapamycin a potent immunosuppressant, so it is very dangerous to eat it just because it does indeed slow rodent aging.

One approach to seeking a safer promoter of autophagy is to develop molecules similar to Rapamycin (rapalogs) and check if they suppress the immune system, hoping they don’t. Of course they do.

Update: How could I have forgotten to mention this in an earlier draft? Simvastatin is a pharmaceutical which promotes autophagy. Statins are used to control high blood cholesterol. They have a low chance of causing rhabdomyolysis (overall a one in a million risk of causing death). I guessed that they reduce the risk of Alzheimer's disease. I just googled and found "Specifically, statin use demonstrated a 28% risk reduction in Alzheimer's disease, 18% risk reduction in vascular dementia, and 20% risk reduction in unspecified dementia." End update.

There are definitely safe molecules which are alleged to promote autophagy. One is resveratrol. Long ago it was observed that large doses of resveratrol appeared to cause slower aging in rats. There is a fairly large literature on this topic in peer reviewed journals. As a result (somewhat embarrassingly) resveratrol is included in many unregulated dietary supplements (also you can buy it on Amazon.com). It is a natural compound (found in red grapes) with no known serious side effects, so it is basically not regulated.

There are two definite problems with resveratrol. Most of it is not absorbed by the intestine and most of what is absorbed is rapidly metabolized by the liver. For it to be effective (for whole mammals such as ourselves) it must be very potent.

A very similar alternative molecule, more of which is absorbed and which has a longer half life, is pterostilbene (found in blueberries). There are many fewer articles in the peer reviewed literature on pterostilbene than on resveratrol.

Spermidine is a polyamine which promotes autophagy. There is evidence that oral spermidine slows aging in mice.

As an aside – the peer reviewed biomedical literature is huge. Obscure journals are indexed on pubmed.ncbi.nlm.nih.gov . Finding something there does not mean it has cleared a high bar.

There are various articles which assert that relatively potent (and well absorbed) autophagy inducing pharmaceuticals have been found using high throughput screening. This is part of the effort to repurpose pharmaceuticals – to find new uses for FDA approved pharmaceuticals. The point is that the FDA does not regulate doctors, who can prescribe pharmaceuticals off label. The articles do not all list the same molecules. Many of the molecules lower blood pressure. Some are used to prevent migraines. They do not suppress the immune system.

The pharmaceuticals on at least one of the lists include

Clonidine

Minoxidil (yes the stuff bald guys use to make hair grow – it also causes lower blood pressure if taken internally in higher doses)

Rilmenidine

and

Rizatriptan (used for migraines, does not lower blood pressure)

Sunday, June 16, 2024

A Natalist, Nativist, Nationalist Case for the Child Tax Credit

One of the policies with the greatest effect on poverty is the child tax credit (expanded and made fully refundable by the American Rescue Plan). It caused a 44% reduction of child poverty. Unfortunately it was a temporary one year program (appropriate for stimulus but not for an always needed program). It was not renewed and there was a huge increase in poverty.

This is a hugely important policy issue (currently totally impossible with GOP control of the House). There was not overwhelming support for the program -- fully refundable sounds like welfare. To be blunt, some of the money went to African Americans. I think there is a rhetorical trick which might maybe even work. I think the history of generous child credits suggests that fighting poverty was not the only objective. There was also the goal of increasing natality, often frankly aimed at producing footsoldiers for the next war. The aim was to create incentives to make babies. The concern was low fertility not (just) high child poverty.

I think a case for an expanded child tax credit can be made which might convince some natalist nativist nationalist Americans who are alarmed that immigration will be necessary to have people to pay the payroll tax to support boomers. The argument is that native born Americans are dying off with fertility below the replacement rate. In 2022 it was 1.67, well below the replacement rate of 2.1. In this the USA has joined other rich countries (which are notably more generous with children than the USA).

Logically this should terrify the numerous xenophobic nationalists who now oppose the expanded child tax credit. Logically they should support incentives for US citizen adults to make babies. "Logically" hah, they do not use this "logic" of which I type. But hey it's worth a try.

Thursday, March 07, 2024

Avatars of the Tortoise III

In "Avatars of the Tortoise" Jorge Luis Borges wrote "There is a concept which corrupts and upsets all others. I refer not to Evil, whose limited realm is that of ethics; I refer to the infinite."

He concluded "We (the indivisible divinity that works in us) have dreamed the world. We have dreamed it resistant, mysterious, visible, ubiquitous in space and firm in time, but we have allowed slight, and eternal, bits of the irrational to form part of its architecture so as to know that it is false."

I think I might have something interesting to say about that and I tried to write it here. If you must waste your time reading this blog, read that one not this one. But going from the sublime to the ridiculous I have been a twit (but honestly not a troll) on Twitter. I said I thought we don't need the concept of a derivative (in the simple case of a scalar function of a scalar the limit as delta x goes to zero of the ratio delta y over delta x - I insult you with the definition just to be able to write that my tweet got very very ratioed).

In Avatars of the Tortoise II I argued that we can consider space time to be a finite set of points with each point in space the same little distance from its nearest neighbors and each unit of time the same discrete jump littleT from the most recent past. If I am right we don't need derivatives to analyse functions of space (there are just slopes), or time, or position as a function of time (velocity and acceleration and such). In such a model, there are only slopes, as any series which goes to zero gets to zero after a finite number of steps, and the formula for a derivative must include 0/0.

I will make more arguments against derivatives. First I will say that we learn nothing useful if we know the first, second, ... nth ... derivative of a function at X. Second I will argue that we can do what we do with derivatives using slopes. Third I will argue that current actual applied math consists almost entirely of numerical simulations on computers which are finite state automata and which do not, in fact, handle continuums when doing the simulating. They take tiny little steps (just as I propose).

I am going to make things simple (because I don't type so good and plain ASCII formulas are a pain). I will consider scalar functions of scalars (so the derivative will be AB AP calculus). I will also consider only derivatives at zero.

f'(0) = limit as x goes to zero of (f(x)-f(0))/x

that is, for any positive epsilon there is a positive delta so small that if 0 < |x| < delta then |f'(0) - (f(x)-f(0))/x| < epsilon. This is useful if we know we are interested in x with absolute value less than delta, but we can't know that because the definition of a derivative gives us no hint as to how small delta must be.

To go back to Avatars of the tortoise I, another equally valid definition of a derivative at zero is, consider the infinite series X_t = (-0.5)^t.

f'(0) = the limit as t goes to infinity of (f(x_t)-f(0))/x_t that is, for any positive epsilon there is a positive N so big that, if t>N then |f'(0) - (f(x_t)-f(0))/x_t|< epsilon.

so we have again the limit as t goes to infinity and the large enough N with no way of knowing if the n which interests us (say 10^1000) is large enough. Knowing the limit tells us nothing about the billionth element. The exact same number is the billionth element of a large infinity of series some of which converge to A for any real number A, so any number is as valid an asymptotic approximation as any other, so none is valid.

Now very often the second to last step of the derivation of a derivative includes an explicit formula for f'(0) - (f(x)-f(0))/x and then the last step consists of proving it goes to zero by finding a small enough delta as a function of epsilon. That formula right near the end is useful. The derivative is not. Knowing that there is a delta is not useful if we have no idea how small it must be.

In general for any delta no matter how small, for any epsilon no matter how small, there is a function f such that |f'(0) - (f(delta)-f(0))/delta| > 1/epsilon (I will give an example soon). That for any function there is a delta does not imply that there is a delta which works for any function. The second would be useful. The first is not always useful.

One might consider the first, second, ... nth derivatives and an nth order Taylor series approximation which I will call TaylorN(x)

for any N no matter how big, for any delta no matter how small for any epsilon no matter how small, there is a function f such that |TaylorN(delta) - f(delta)|>1/epsilon

for example consider the function f such that

f(0) = 0, if x is not zero f(x) = (2e/epsilon)e^(-(delta^2/x^2))

f(delta) = 2/epsilon > 1/epsilon.

f'(0) is the limit as x goes to zero of

(2e/epsilon)(2delta^2/x^3)e^(-(delta^2/x^2)) = 0.

the nth derivative is the limit as x goes to zero of a polynomial in 1/x times e^(-(delta^2/x^2)) and so equals zero.

The Nth order Taylor series approximation of f(x) equals zero for every x. For x = delta it is off by 2/epsilon > 1/epsilon.
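The example can be checked with plain arithmetic. This is only a sketch: the values chosen for delta and epsilon and the helper names make_f and slope_from_zero are mine, made up for illustration.

```python
import math

def make_f(delta, epsilon):
    """Return f with f(0) = 0 and f(x) = (2e/epsilon) * e^(-(delta^2/x^2)) otherwise."""
    scale = 2.0 * math.e / epsilon
    def f(x):
        if x == 0:
            return 0.0
        return scale * math.exp(-(delta / x) ** 2)
    return f

def slope_from_zero(f, step):
    """Slope of f between 0 and step: (f(step) - f(0)) / step."""
    return (f(step) - f(0.0)) / step

# Illustrative values only (assumptions, not from the post).
delta, epsilon = 1e-3, 1e-6
f = make_f(delta, epsilon)

# Every Taylor polynomial of f around 0 is identically zero,
# so its error at x = delta is just |f(delta)|.
taylor_at_delta = 0.0
error_at_delta = abs(f(delta) - taylor_at_delta)

# A slope over a step much smaller than delta agrees with f'(0) = 0 ...
tiny_slope = slope_from_zero(f, delta / 100)
# ... but the slope over the step delta itself is enormous.
slope_at_delta = slope_from_zero(f, delta)

print(error_at_delta, tiny_slope, slope_at_delta)
```

Shrinking delta or epsilon in the sketch just moves the trouble closer to zero or makes it bigger. The derivatives at zero stay the same (all zero) throughout.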

No matter how small the distance from zero and no matter how big the error, there is an example in which the Nth order Taylor series approximation is off by a larger error at that distance.

Knowing all the derivatives at zero, we know nothing about f at any particular x other than zero. Again for any function, for any epsilon, there is a delta, but there isn't a delta for any function. Knowing all the derivatives tells us nothing about how small that delta must be, so nothing we can use.

So if things are so bad, why does ordinary calculus work so well? It works for a (large) subset of problems. People have learned about them and how to recognise them either numerically or with actual experiments or empirical observations. But that successful effort involved numerical calculations (that is arithmetic not calculus) or experiments or observations. It is definitely not a mathematical result that the math we use works. Indeed there are counterexamples (of which I presented just one).

part 2 of 3 (not infinite even if it seems that way, but 3). If the world is observationally equivalent to a world with a finite set of times and places, then everything in physics is a slope. More generally, we can do what we do with derivatives and such stuff with discrete steps and slopes. We know this because that is what we do when faced with hard problems without closed form solutions. We hand them over to computers which consider a finite set of numbers with a smallest step.

And that quickly gets me to part 3 of 3 (finally). One person on Twitter says we need to use derivatives etc to figure out how to write the numerical programs we actually use in applications. This is an odd claim. I can read (some) source code (OK barely source code literate as I am old but some). I can write (some) higher higher language source code. I can force myself to think in some (simple higher higher language) source code (although in practice I use derivatives and such like). Unpleasant but not impossible.

Someone else says we use derivatives to know if the simulation converges or, say, if a dynamical system has a steady state which is a sink or stuff like that. We do, but there is no theorem that this is a valid approach and there are counterexamples (basically based on the super simple one I presented). All that about dynamics is about *local* dynamics and is valid if you start out close enough and there is no general way to know how close is close enough. In practice people have found cases where linear and Taylor series (and numerical) approximations work and other cases where they don't (consider chaotic dynamical systems with positive Lyapunov exponents and no I will not define any of those terms).
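A minimal sketch of what "discrete steps and slopes" looks like in source code, assuming a toy rule in which the slope over each step is minus the current value. The rule, the step size and the name simulate_decay are my assumptions for illustration, not anything from the Twitter exchange.

```python
import math

def simulate_decay(x0, rate, step, n_steps):
    """Advance x through n_steps discrete steps; over each step the slope is -rate * x."""
    x = x0
    path = [x]
    for _ in range(n_steps):
        slope = -rate * x      # change in x per unit of time over this one step
        x = x + slope * step   # take the little step
        path.append(x)
    return path

path = simulate_decay(x0=1.0, rate=1.0, step=0.001, n_steps=1000)

# Comparison with the closed form from calculus, printed only as a check.
print(path[-1], math.exp(-1.0))
```

The loop itself uses only addition and multiplication over a finite number of steps. The closed form appears in the last line only, and only to show that the two roughly agree in this well behaved case.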

Always the invalid pretend pure math is tested with numerical simulations or experiments or observations. People learn when it works and tell other people about the (many) cases where it works and those other people forget the history and pour contempt on me on Twitter.

Avatars of the Tortoise II

In "Avatars of the Tortoise" Jorge Luis Borges wrote "There is a concept which corrupts and upsets all others. I refer not to Evil, whose limited realm is that of ethics; I refer to the infinite."

I think rather that we have dreamed of infinity which has no necessary role in describing the objective universe which is "resistant, mysterious, visible, ubiquitous in space and firm in time*".

First the currently favored theory is that space is not, at the moment, infinite but rather is a finite hypersphere. There was a possibility that time might end in a singularity as it began, but the current view is that the universe will expand forever. Bummer. I note however that the 2nd law of thermodynamics implies that life will not last forever (asymptotically we will "all" be dead, "we" referring to living things not currently living people). So I claim that there is a T so large that predictions of what happens after T can never be tested (as there will be nothing left that can test predictions).

However it is still arguable (by Blake) that we can find infinity in a grain of sand and eternity in an hour. Indeed when Blake wrote, that was the general view of physicists (philosophy makes the oddest bedfellows) as time was assumed to be a continuum with infinitely many distinct instants in an hour.

Since then physicists have changed their mind -- the key word above was "distinct" which I will also call "distinguishable" (and I dare the most pedantic pedant (who knows who he is) to challenge my interchanging the two words which consist of different letters).

The current view is that (delta T)(delta E) >= h/(4 pi) where delta T is the uncertainty in time of an event, delta E is the uncertainty in energy involved, h is Planck's constant, pi is the ratio of the circumference of a circle to its diameter and damnit you know what 4 means.

delta E must be less than Mc^2 where M is the (believed to be finite) mass of the observable universe. So there is a minimum delta T which I will call littleT. A universe in which time is continuous (and an hour contains an infinity of instants) is observationally equivalent to a universe in which time (from the big bang) is a natural number times littleT. The time from the big bang to T can be modeled as a finite number of discrete steps just as well as it can be modeled as a continuum of real numbers. This means that the question of which of these hypothetical possibilities time really is is a metaphysical question not a scientific question.
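To get a feel for how little littleT is, here is a rough sketch of the arithmetic. The post gives no number for M, so the mass below is purely an assumed order of magnitude that I plugged in for illustration. The function name little_t is also made up.

```python
import math

PLANCK_H = 6.62607015e-34      # Planck's constant in joule seconds (SI defined value)
LIGHT_SPEED = 299_792_458.0    # speed of light in meters per second (SI defined value)

# ASSUMPTION: an order of magnitude guess for the mass of the observable
# universe in kilograms. It is not from the post; change it freely.
ASSUMED_MASS_KG = 1e53

def little_t(mass_kg):
    """Smallest delta T allowed by (delta T)(delta E) >= h/(4 pi) when delta E <= M c^2."""
    max_delta_e = mass_kg * LIGHT_SPEED ** 2
    return PLANCK_H / (4 * math.pi * max_delta_e)

step = little_t(ASSUMED_MASS_KG)
steps_per_hour = 3600.0 / step

print(f"littleT is about {step:.3e} seconds")
print(f"an hour holds about {steps_per_hour:.3e} such steps")
```

Whatever mass is plugged in, as long as it is finite the number of steps in an hour is a very big but finite number, which is all the argument needs.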

Now about that grain of sand. there is another formula

(delta X)(delta P) >= h/(4 pi)

X is the location of something, P is its momentum. |P| and therefore delta P is less than or equal to Mc/2 where M is the mass of the observable universe. The 2 appears because total momentum is zero. This means that there is a minimum delta X and a model in which space is a lattice consisting of a dimension zero, countable set of separated points is observationally equivalent to the standard model in which space is a 3 dimensional real manifold. Again the question of what space really is is metaphysical not scientific.

Recall that space is generally believed to be finite (currently a finite hypersphere). It is expanding. At T it will be really really big, but still finite. That means the countable subset of the 3 dimensional manifold model implies a finite number of different places. No infinity in the observable universe let alone in a grain of sand.

There are other things than energy, time, space and momentum. I am pretty sure they can be modeled as finite sets too (boy am I leading with my chin there).

I think there is a model with a finite set of times and of places which is observationally equivalent to the standard model and, therefore, just as scientifically valid. Except for metaphysics and theology, I think we have no need for infinity. I think it is not avatars of the tortoise all the way down.

*note not ubiquitous in time as there was a singularity some time ago.

Wednesday, March 06, 2024

Asymptotically we'll all be dead II

alternative title "avatars of the tortoise I"

Asymptotically we'll all be dead didn't get much of a response, so I am writing a simpler post about infinite series (which is the second in a series of posts which will not be infinite). First some literature: "Avatars of the Tortoise" is a brilliant essay by Jorge Luis Borges on paradoxes and infinity. Looking at an idea, or metaphor (I dare not type meme) over centuries was one of his favorite activities. In this case, it was alleged paradoxes based on infinity. He wrote "There is a concept which corrupts and upsets all others. I refer not to Evil, whose limited realm is that of ethics; I refer to the infinite."

When I first read "Avatars of the Tortoise" I was shocked that the brilliant Borges took Zeno's non paradox seriously. The alleged paradox is based on the incorrect assumption that a sum of an infinite number of intervals of time adds up to forever. In fact, infinite sums can be finite numbers, but Zeno didn't understand that.

Zeno's story is (roughly translated and with updated units of measurement)

Consider the fleet footed Achilles on the start line and a slow tortoise 100 meters ahead of him. Achilles can run 100 meters in 10 seconds. The tortoise crawls forward one tenth as fast. The start gun goes off. In 10 seconds Achilles reaches the point where the tortoise started but the tortoise has crawled 10 meters (this would only happen if the tortoise were a male chasing a female or a female testing the male's fitness by running away - they can go pretty fast when they are horny).

So the race continues to step 2. Achilles reaches the point where the tortoise was after 10 seconds in one more second, but the tortoise has crawled a meter.

Step 3, Achilles runs another meter in 0.1 seconds, but the tortoise has crawled 10 cm.

The time until Achilles passes the tortoise is an infinite sum. Silly Zeno decided that this means that Achilles never passes the tortoise, that the time until he passes him is infinite. In fact a sum of infinitely many numbers can be finite -- in this case 10/(1-0.1) = 100/9 < infinity.
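The steps of the race can be added up exactly with a few lines of arithmetic. This is just a sketch; the function name zeno_partial_sums and the choice of 8 steps are mine.

```python
from fractions import Fraction

def zeno_partial_sums(first_leg=Fraction(10), ratio=Fraction(1, 10), steps=8):
    """Running total of the time (in seconds) Achilles has spent after each step."""
    total = Fraction(0)
    leg = first_leg
    sums = []
    for _ in range(steps):
        total += leg      # time spent on this step of the race
        sums.append(total)
        leg *= ratio      # the next step takes one tenth as long
    return sums

sums = zeno_partial_sums()
limit = Fraction(10) / (1 - Fraction(1, 10))   # 10/(1-0.1) = 100/9 seconds
gap = limit - sums[-1]                         # what is still missing after 8 steps

for step_number, total in enumerate(sums, start=1):
    print(step_number, float(total))
print("100/9 =", float(limit), "gap after 8 steps =", float(gap))
```

Exact fractions are used so that the gap between the partial sum and 100/9 is not blurred by floating point rounding.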

Now infinite sums can play nasty tricks. Consider a series x_t, t going from 1 to infinity. If the series converges to x, but does not converge absolutely (so sum |x_t| goes to infinity) then one can make the series converge to any number at all by changing the order in which the terms are added. How can this be, given the axiom that addition is commutative? Now that's a bit of a paradox.

The proof is simple. Let's make it converge to A. First note that the positive terms must add to infinity and the negative terms add to - infinity (so that they cancel enough for the series to converge).

Now add positive terms until sumsofar > A (if A is negative this requires 0 terms). Now add negative terms until sumsofar < A. Then add positive terms again until sumsofar > A, and so on. The partial sums will cross A again and again. The distance from the partial sum to A is less than the last term as it just crossed A. The last term must go to zero as we get t going to infinity (so the original series can converge) so the new series of partial sums converges to A.
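The recipe in the proof can be tried on a concrete series. Below is a sketch which assumes the series is 1 - 1/2 + 1/3 - 1/4 + ... (my choice of example, it converges but not absolutely). The function name rearranged_sum and the targets are made up for illustration.

```python
import math

def rearranged_sum(target, n_terms):
    """Add n_terms terms of 1 - 1/2 + 1/3 - 1/4 + ... in an order chosen to chase target."""
    next_odd = 1     # the next unused positive term is +1/next_odd
    next_even = 2    # the next unused negative term is -1/next_even
    total = 0.0
    for _ in range(n_terms):
        if total <= target:
            total += 1.0 / next_odd    # at or below the target: use a positive term
            next_odd += 2
        else:
            total -= 1.0 / next_even   # above the target: use a negative term
            next_even += 2
    return total

# The same terms, each used at most once, pushed towards different numbers.
for target in (math.pi, 0.0, -1.0):
    print(target, rearranged_sum(target, 100_000))
```

Each term of the original series is used at most once and in its original relative order within the positive terms and within the negative terms. Only the interleaving changes.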

That's one of the weird things infinity does. I think that everything which is strongly counterintuitive in math has infinity hiding somewhere (no counterexamples have come to my mind and I have looked for one for decades).

Now I say that the limit of a series (the original series, or the sum from t = 1 to T) as T goes to infinity is not, in general, of any practical use, because in the long run we will all be dead. I quote from "Asymptotically we'll all be dead"

Consider a simple problem of a series of numbers X_t (not stochastic just deterministic numbers). Let's say we are interested in X_1000. What does knowing that the limit of X_t as t goes to infinity is 0 tell us about X_1000? Obviously nothing. I can take a series and replace X_1000 with any number at all without changing the limit as t goes to infinity.

Also not only does the advice "use an asymptotic approximation" often lead one astray, it also doesn't actually lead one. The approach is to imagine a series of numbers such that X_1000 is the desired number and then look at the limit as t goes to infinity. The problem is that the same number X_1000 is the 1000th element of an uh large infinity of different series. One can make up a series such that the limit is 0 or 10 or pi or anything. The advice "think of the limit as t goes to infinity of an imaginary series with a limit that you just made up" is as valid an argument that X_1000 is approximately zero as it is that X_1000 is pi, that is it is an obviously totally invalid argument.

An example is a series whose first googol (10^100) elements are one googol, so x_1000000 = 10^100, and whose later elements are zero. The series converges to zero. If one uses the limit as t goes to infinity as an approximation when thinking of x_999 then one concludes that 10^100 is approximately zero.
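The googol example is short enough to write down directly. A sketch, with the function name x chosen to match the notation above:

```python
GOOGOL = 10 ** 100

def x(t):
    """Element t of the series: one googol for the first googol elements, zero afterwards."""
    return GOOGOL if t <= GOOGOL else 0

limit = 0   # every element after the first googol is zero, so the series converges to zero

# Error made by using the limit as an approximation for elements we might actually look at.
for t in (999, 1_000_000, 10 ** 50):
    print(t, abs(x(t) - limit) == GOOGOL)
```

Python integers have no fixed size, so the comparison with 10^100 is exact.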

The point is that the claim that a series goes to x is the claim that (for that particular series) for any positive epsilon, there is an N so large that if t>N then |x_t-x| < epsilon. This tells us nothing about how large that N is and whether it is much larger than any t which interests us. Importantly it is just not true that there is an N so large that (for any series and the same N) if t > N then |x_t - the limit| < 10^10^(100).

Knowing only the limit as t goes to infinity, we have no idea how large an N is needed for any epsilon, so we have no idea if the limit is a useful approximation to anything we will see if we read the series for a billion years.

Now often the proof of the limit contains a useful assertion towards the end of the proof. For example one might prove that |x_t-X| < A/t for some A. The next step is to note that the limit as t goes to infinity of x_t is X. This last step is a step in a very bad direction going from something useful to a useless implication of the useful statement.

Knowing A we know that N = floor(A/epsilon). That's a result we can use. It isn't as elegant as saying something about limits (because it includes the messy A and often includes a formula much messier than A/t). However, unlike knowing the limit as t goes to infinity it might be useful some time in the next trillion years.
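A small sketch of how the bound gets used, assuming made up values A = 3 and epsilon = 0.25 (picked so the arithmetic is exact in floating point). The name steps_needed is mine.

```python
import math

def steps_needed(A, epsilon):
    """N = floor(A/epsilon): every t > N has A/t < epsilon, so |x_t - X| < epsilon."""
    return math.floor(A / epsilon)

# Illustrative values only (assumptions, not from the post).
A, epsilon = 3.0, 0.25
N = steps_needed(A, epsilon)

print("N =", N)
print("bound just after N:", A / (N + 1), "which is below", epsilon)
```

The point of the sketch is that N is an explicit number computed from A and epsilon, where the limit alone gives no number at all.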

In practice limits are used when it seems clear from a (finite) lot of calculations that they are good approximations. But that means one can just do the many but finite number of calculations and not bother with limits or infinity at all.

In this non-infinite series of posts, I will argue that the concept of infinity causes all sorts of fun puzzles, but is not actually needed to describe the universe in which we find ourselves.

Monday, February 19, 2024

Asymptotically We'll all be Dead

This will be a long boring post amplifying on my slogan.

I assert that asymptotic theory and asymptotic approximations have nothing useful to contribute to the study of statistics. I therefore reject the vast bulk of mathematical statistics as absolutely worthless.

To stress the positive, I think useful work is done with numerical simulations -- Monte Carlos in which pseudo data are generated with pseudo random number generators and extremely specific assumptions about data generating processes, statistics are calculated, then the process is repeated at least 10,000 times and the pseudo experimental distribution is examined. A problem is that computers only understand simple precise instructions. This means that the Monte Carlo results hold only for very specific (clearly false) assumptions about the data generating process. The approach used to deal with this is to make a variety of extremely specific assumptions and consider the set of distributions of the statistic which result.
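A rough sketch of that procedure in Python, using only the standard library. The statistic (a t statistic for the mean), the sample size of 20, and the three data generating processes are my illustrative assumptions, not anything taken from a particular paper.

```python
import random

def t_statistic(sample, true_mean):
    """Usual t statistic for the hypothesis that the population mean is true_mean."""
    n = len(sample)
    mean = sum(sample) / n
    variance = sum((v - mean) ** 2 for v in sample) / (n - 1)
    return (mean - true_mean) / (variance / n) ** 0.5

def monte_carlo(draw, true_mean, n=20, reps=10_000, seed=0):
    """Generate reps pseudo data sets of size n and return the statistic for each."""
    rng = random.Random(seed)
    return [t_statistic([draw(rng) for _ in range(n)], true_mean) for _ in range(reps)]

# A variety of extremely specific (and certainly false) data generating processes,
# each paired with its true population mean.
DGPS = {
    "normal(0,1)": (lambda rng: rng.gauss(0.0, 1.0), 0.0),
    "uniform(0,1)": (lambda rng: rng.random(), 0.5),
    "exponential(1)": (lambda rng: rng.expovariate(1.0), 1.0),
}

CRITICAL_VALUE = 2.093  # two-sided 5% point of Student's t with 19 degrees of freedom

rejection_rates = {}
for name, (draw, true_mean) in DGPS.items():
    stats = monte_carlo(draw, true_mean)
    rejection_rates[name] = sum(abs(s) > CRITICAL_VALUE for s in stats) / len(stats)
    print(f"{name}: rejection rate {rejection_rates[name]:.3f}")
```

The output is one rejection rate per assumed process. Looking at how much they differ is the whole exercise.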

I think this approach is useful and I think that mathematical statisticians all agree. They all do this. Often there is a long (often difficult) analysis of asymptotics, then the question of whether the results are at all relevant to the data sets which are actually used, then an answer to that question based on Monte Carlo simulations. This is the approach of the top experts on asymptotics (e.g. Hal White and PCB Phillips).

I see no point for the section of asymptotic analysis which no one trusts and note that the simulations often show that the asymptotic approximations are not useful at all. I think they are there for show. Simulating is easy (many many people can program a computer to do a simulation). Asymptotic theory is hard. One shows one is smart by doing asymptotic theory which one does not trust and which is not trustworthy. This reminds me of economic theory (most of which I consider totally pointless).

OK so now against asymptotics. I will attempt to explain what is done -- I will use the two simplest examples. In each case, there is an assumed data generating process and a sample size of data (henceforth N) which one imagines is generated. Then a statistic is estimated (often this is a function of the sample size and an estimate of a parameter of a parametric class of possible data generating processes). The statistic is modified by a function of the sample size (N). The result is a series of random variables (or one could say a series of distributions of random variables). The function of the sample size N is chosen so that the series of random variables converges in distribution to a random variable (convergence in distribution is convergence of the cumulative distribution function at all points where there are no atoms so it is continuous).

One set of examples (usually described differently) are laws of large numbers. A very simple law of large numbers assumes that the data generating process is a series of independent random numbers with identical distributions (iid). It is assumed (in the simplest case) that this distribution has a finite mean and a finite variance. The statistic is the sample average. As N goes to infinity it converges to a degenerate distribution with all weight on the population average. It is also true that the sample average has a distribution with mean equal to the population mean and variance going to zero - that is, for any positive epsilon there is an N1 so large that if the sample size N > N1 then the variance of the sample mean is less than epsilon (convergence in quadratic mean). Also, for any positive epsilon there is an N1 so large that if the sample size N > N1 then the probability that the sample mean is more than epsilon from the population mean is itself less than epsilon (convergence in probability). The problem is that there is no way to know what N1 is. In particular, it depends on the underlying distribution. The population variance can be estimated using the sample variance. This is a consistent estimate, so the difference is less than epsilon if N > N2. The problem is that there is no way of knowing what N2 is.

Another very commonly used asymptotic approximation is the central limit theorem. Again I consider a very simple case of an iid random variable with a mean M and a finite variance V.

In that case (sample mean - M)N^0.5 will converge in distribution to a normal with mean zero and variance V. Again there is no way to know what the required N1 is. For some iid distributions (say binary 1 or zero with probability 0.5 each, or uniform from 0 to 1) N1 is quite low and the distribution looks just like a normal distribution for a sample size around 30. For others the distribution is not approximately normal for a sample size of 1,000,000,000.
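A hedged illustration of the contrast, on a scale a laptop can handle. The uniform case uses a sample size of 30. The badly behaved case is my own choice: a binary variable equal to 1 with probability 0.001, with a sample size of 1000. Skewness of the simulated sample means is used as a crude check, since a normal distribution has skewness zero.

```python
import random

def skewness(values):
    """Sample skewness: third central moment over the cubed standard deviation."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

def simulated_means(draw, n, reps, seed):
    """Sample means of reps pseudo data sets, each with n iid draws."""
    rng = random.Random(seed)
    return [sum(draw(rng) for _ in range(n)) / n for _ in range(reps)]

REPS = 3000

# Uniform on (0, 1) with a sample size of 30.
uniform_means = simulated_means(lambda rng: rng.random(), n=30, reps=REPS, seed=1)

# Rare binary event (probability 0.001) with a much larger sample size of 1000.
rare_means = simulated_means(
    lambda rng: 1.0 if rng.random() < 0.001 else 0.0, n=1000, reps=REPS, seed=2
)

skew_uniform = skewness(uniform_means)
skew_rare = skewness(rare_means)
print(f"uniform, N = 30:      skewness {skew_uniform:.2f}")
print(f"rare event, N = 1000: skewness {skew_rare:.2f}")
```

The larger sample is the one that is far from normal. Making the event rarer pushes the required sample size up without bound, and the theorem gives no warning of that.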

I have criticisms of asymptotic analysis as such. The main one is that N has not gone to infinity. Also we are not immortal and may not live long enough to collect N1 observations.

Consider an even simpler problem of a series of numbers X_t (not stochastic, just deterministic numbers). Let's say we are interested in X_1000. What does knowing that the limit of X_t as t goes to infinity is 0 tell us about X_1000? Obviously nothing. I can take a series and replace X_1000 with any number at all without changing the limit as t goes to infinity.

This is a very simple example, however there is the exact same problem with actual published asymptotic approximations. The distribution of the statistic for the actual sample size is one element of a very large infinity of possible series of distributions. Equally valid asymptotic analysis can imply completely different assertions about the distribution of the statistic for the actual sample size. As they can't both be valid and they are equally valid, they both have zero validity.

An example. Consider a first-order autoregressive process where x_t = (rho)x_(t-1) + epsilon_t, where epsilon is an iid random variable with mean zero and finite variance, and let rhohat be the least squares estimate of rho. There is a standard result that if the absolute value of rho is less than 1 then (rhohat-rho)N^0.5 converges in distribution to a normal. There is a not so standard result that if rho = 1 (a random walk) then (rhohat-rho)N^0.5 goes to a degenerate distribution equal to zero with probability one and (rhohat-rho)N goes to a strange distribution called a unit root distribution (with the expected value of (rhohat-rho)N less than 0).

Once I came late to a lecture on this and casually wrote "converges in distribution to a normal with mean zero and variance" before noticing that the usual answer was not correct in this case. The professor was the very brilliant Chris Cavanaugh, who was one of the first two people to prove the result described below (and was not the first to publish).

Doug Elmendorf, who wasn't even in the class and is very very smart (and later head of the CBO), asked how there can be such a discontinuity at rho = 1 when, for a sample of a thousand observations, there is almost no difference in the joint probability distribution or any possible statistic between rho = 1 and rho = 1 - 10^(-100). Prof Cavanaugh said that was his next topic.

The problem is misuse of asymptotics (or according to me use of asymptotics). Note that the question explicitly referred to a sample size of 1000, not a sample size going to infinity.

So if rho = 0.999999 = 1 - 1/(a million) then rho^1000 is about 1 but rho^10000000000 is about zero. Taking N to infinity implies that, for a rho very slightly less than one, almost all of the regression coefficients of X_t2 on X_t1 (with t1 < t2) are about zero. Now the same distribution of rhohat for rho = 0.999999 and N = 1000 is the thousandth element of many many series of random variables.

One of them is a series where Rho varies with the sample size N so Rho_N = 1-0.001/N

For N = 1000, rho = 0.999999, so the distribution of rhohat for the sample of 1000 is just the same as before. However the series of random variables (rhohat-rho)N^0.5 does not converge to a normal distribution -- it converges to a degenerate distribution which is 0 with probability 1.

In contrast (rhohat-rho)N converges to a unit root distribution for this completely different series of random variables which has the exact same distribution for the sample size of 1000.

There are two completely different equally valid asymptotic approximations.

So Cavanaugh decided which to trust by running simulations and said his new asymptotic approximation worked well for large finite samples and the standard one was totally wrong.

See what happened? Asymptotic theory did not answer the question at all. The conclusion (which is universally accepted) was based on simulations.
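A sketch of the kind of simulation involved. This is not Cavanaugh's actual code. I assume standard normal shocks, a starting value of zero, least squares without an intercept, and 2000 replications. The standard approximation for the stationary case says (rhohat-rho)N^0.5 is roughly normal with variance 1 - rho^2.

```python
import random
import statistics

def rhohat(rho, n, rng):
    """Least squares slope (no intercept) of x_t on x_(t-1) for one simulated path."""
    x_prev = 0.0
    numerator = 0.0
    denominator = 0.0
    for _ in range(n):
        x = rho * x_prev + rng.gauss(0.0, 1.0)
        numerator += x * x_prev
        denominator += x_prev * x_prev
        x_prev = x
    return numerator / denominator

RHO, N, REPS = 0.999999, 1000, 2000
rng = random.Random(0)
errors = [rhohat(RHO, N, rng) - RHO for _ in range(REPS)]

# Standard deviation of (rhohat - rho) N^0.5: simulated versus the normal approximation.
simulated_sd = statistics.pstdev([e * N ** 0.5 for e in errors])
normal_sd = (1.0 - RHO ** 2) ** 0.5

# Mean of (rhohat - rho) N, which the unit root result says should be negative.
mean_scaled_by_n = statistics.fmean([e * N for e in errors])

print(f"sd of (rhohat-rho)N^0.5: simulated {simulated_sd:.4f}, normal approx {normal_sd:.4f}")
print(f"mean of (rhohat-rho)N: {mean_scaled_by_n:.2f}")
```

The two standard deviations differ by well over an order of magnitude, and no limit theorem is needed to see it.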

This is standard practice. I promise that mathematical statisticians will go on and on about asymptotics then check whether the approximation is valid using a simulation.

I see no reason not to cut out the asymptotic middle man.