Saturday, April 11, 2020

COVID-19 April 11, 2020

It's the weekend!  Based on history, I expect bogus numbers to flow in over the next couple of days.  Today, I thought I'd focus on some highly speculative projections.

**********WARNING -- HIGHLY SPECULATIVE *****************

See the warning above?  Read it.  

The exponential models no longer fit any of the data.  In particular, the data have fallen below the exponential model predictions.  This is what we hoped for, and it is without a doubt due to social distancing and the other measures that have been put in place in all but a few states.

Instead of an exponential function, I have been using a Gompertz model.  This model seems to do reasonably well for where we are in the overall evolution of the pandemic.  I expect that it may break down as things start to wind down.  Still, the Gompertz model can provide a bit more insight than an exponential model that already deviates substantially from observations.

The Gompertz model has three adjustable parameters (see my previous posts for an explanation).  The first parameter is the initial number of infections.  Based on data, I set this to 15 infections on day 57 of the year (February 27, 2020).  The second parameter is the growth rate of infections.  The third and final parameter is the final number of infections at an infinite time later.  I adjust the second and third parameters to obtain a good fit to the observed data as judged by eye.  I can then extrapolate the Gompertz function out into the future and see what it predicts.
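
For anyone who wants to play along at home, here is a minimal sketch of a three-parameter Gompertz curve in Python.  The functional form is an assumption on my part (one common way to write a Gompertz curve that passes through a fixed starting point), and the function and argument names are purely illustrative.  The rate and final-size values are the ones quoted in this post.

```python
import math

def gompertz(day, n0, day0, rate, n_final):
    """Cumulative count on `day` for an assumed Gompertz form:

        N(t) = K * exp( ln(N0 / K) * exp(-c * (t - t0)) )

    n0      -- initial count N0 on day0
    rate    -- growth rate c (per day)
    n_final -- final count K as time goes to infinity
    """
    log_ratio = math.log(n0 / n_final)      # negative, since n0 < n_final
    decay = math.exp(-rate * (day - day0))  # 1 on day0, shrinking toward 0 later
    return n_final * math.exp(log_ratio * decay)

# Parameter values quoted in this post (15 cases on day 57, rate 0.044, 3.2 million final).
params = dict(n0=15.0, day0=57.0, rate=0.044, n_final=3_200_000.0)

start = gompertz(57, **params)         # equals n0 by construction
far_future = gompertz(1000, **params)  # approaches n_final
```

With this form, the first parameter pins the curve to the starting point, the rate controls how quickly growth slows, and the final count is the ceiling the curve flattens toward.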

There are a couple of things to consider before looking at the results.  First, the observational data have problems.  I've pointed these out multiple times and previously discussed them in detail.  Since the observational data are of poor quality, even a perfect model will not be able to match the data exactly.  Thus, there are inherent errors in fitting the model to the imperfect data.  Second, the Gompertz model assumes a natural progression of infection, whereas we know that we (humans) threw a wrench in the natural progression by instituting social distancing programs.  We changed the calculus, and thus the model may not be able to fit both the early observational data (before social distancing effects) and the later data (after social distancing effects).  To predict the future, it might be better to emphasize a good fit to the most current data rather than the oldest data from early in the pandemic.
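
The fits in this post are judged by eye, but if one wanted to automate the idea of emphasizing recent data, a weighted misfit is one hypothetical way to do it.  The sketch below is only an illustration: the Gompertz form, the log-space residual, and the `half_life` weighting are all assumptions of mine, and the "observations" are synthetic values generated from the model itself, not real case counts.

```python
import math

def gompertz(day, n0, day0, rate, n_final):
    # Assumed Gompertz form: N(t) = K * exp(ln(N0/K) * exp(-c * (t - t0)))
    return n_final * math.exp(math.log(n0 / n_final) * math.exp(-rate * (day - day0)))

def weighted_misfit(days, observed, rate, n_final, n0=15.0, day0=57.0, half_life=7.0):
    """Weighted mean squared error in log space.

    The weight of each observation halves for every `half_life` days
    it lies before the most recent day, so recent data dominate.
    """
    latest = max(days)
    total = 0.0
    weight_sum = 0.0
    for day, obs in zip(days, observed):
        weight = 0.5 ** ((latest - day) / half_life)
        residual = math.log(obs) - math.log(gompertz(day, n0, day0, rate, n_final))
        total += weight * residual ** 2
        weight_sum += weight
    return total / weight_sum

# Synthetic check only: "observations" generated from the model itself.
days = list(range(60, 101))
observed = [gompertz(d, 15.0, 57.0, 0.044, 3_200_000.0) for d in days]

misfit_true = weighted_misfit(days, observed, rate=0.044, n_final=3_200_000.0)
misfit_off = weighted_misfit(days, observed, rate=0.050, n_final=3_200_000.0)
```

A smaller misfit means a better fit to the recent data.  Shortening `half_life` lets the early, pre-distancing data count for less.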

Now, let's get to the plots and the predictions.  The first two figures below show the cumulative number of cases.  The data are the same, but one is plotted on a semi-log plot and the other using linear axes.  The blue dots are observations. The dashed orange line is the 10-day trend based on an exponential.  The red crosses are the data obtained from the Gompertz model.  
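
A 10-day exponential trend like the dashed line can be estimated in several ways.  The sketch below shows one simple option, a least-squares straight line through the logarithm of the last 10 counts.  This is an assumption about how such a trend might be computed, not necessarily how the line in the figures was made, and the series used here is synthetic.

```python
import math

def exponential_trend(days, counts, window=10):
    """Fit counts ~ a * exp(b * day) to the last `window` points.

    Uses ordinary least squares on log(counts), which turns the
    exponential into a straight line with slope b.
    """
    xs = list(days)[-window:]
    ys = [math.log(c) for c in list(counts)[-window:]]
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    b = sxy / sxx                        # growth rate per day
    a = math.exp(y_mean - b * x_mean)    # extrapolated count at day 0
    return a, b

# Synthetic series (not real data): 1000 cases on day 80, growing 10% per day in log terms.
days = list(range(80, 100))
counts = [1000.0 * math.exp(0.1 * (d - 80)) for d in days]

a, b = exponential_trend(days, counts)
doubling_time = math.log(2) / b   # days for the trend line to double
```

On a semi-log plot this trend is a straight line, which is why observations that bend below it are easy to spot.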

The Gompertz model overpredicted the observations early in the pandemic; the red crosses are above the blue dots.  That is, the model predicted that infections would grow faster than they actually did.  But we must also consider that the number of observed infections may be biased to the low side due to the clusterf*ck of testing.  From about day 85 (March 26) onward, the fit looks pretty good.  

If I look at the parameters from the Gompertz model, I get a total number of infections of 3,200,000.  According to the model, we reach this number at time equals infinity, which is not entirely helpful.  Instead, consider that the model does give a specific date for when we reach the halfway point of 1,600,000.  That happens on May 3.  We reach 2,000,000 on May 11.  For documentation purposes, the growth rate is found to be 0.044.  
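
The date at which a Gompertz curve crosses a given total can be found by inverting the curve.  The sketch below does this for an assumed form of the model, using the rounded parameters quoted above.  It also assumes that day numbers count from January 1 as day 0, which is what makes day 57 fall on February 27.  Because the growth rate is rounded to 0.044, the results should only be expected to land within a day or so of the dates quoted above.

```python
import math
from datetime import date, timedelta

def day_reaching(target, n0, day0, rate, n_final):
    """Day number at which the assumed Gompertz curve

        N(t) = K * exp( ln(N0 / K) * exp(-c * (t - t0)) )

    crosses `target`.  Requires n0 < target < n_final.
    """
    ratio = math.log(target / n_final) / math.log(n0 / n_final)
    return day0 - math.log(ratio) / rate

def to_date(day, year=2020):
    # Assumption: day 0 is January 1, so day 57 is February 27, 2020.
    return date(year, 1, 1) + timedelta(days=int(day))

params = dict(n0=15.0, day0=57.0, rate=0.044, n_final=3_200_000.0)

half_day = day_reaching(1_600_000, **params)       # halfway point
two_million_day = day_reaching(2_000_000, **params)
print(half_day, to_date(half_day))
print(two_million_day, to_date(two_million_day))
```

The fractional day is truncated when converting to a calendar date, so a crossing partway through a day is reported as that day.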

Recent press reports have quoted professional epidemiological predictions (n.b. that's definitely not me!) that say the US has reached its peak.  My model says otherwise.  As a meteorologist, I'm used to going out on a limb with predictions.  And I'm also used to being wrong!  I'm also used to being right!  I'm sure to be either right or wrong; of that I'm sure.  Time will be the judge.

Before you run off with the numbers above, please go back and read the warning near the top of this post.






I can also use the Gompertz model to look at the daily number of infections.  This is shown in the two plots below.  From the model, I find that the peak number of new daily infections will be 51,781 on April 24.  For reference, we are at about 40,000 cases per day right now.  According to the model, we will still be at over 20,000 cases per day by the end of May.
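
For a Gompertz curve, the peak in new daily cases has a tidy closed form: it occurs when the cumulative total reaches the final total divided by e, and the peak rate is the growth rate times the final total divided by e.  The sketch below works this out for an assumed form of the model with the rounded parameters quoted in this post, so expect it to come out close to, not identical to, the numbers above.  All names are illustrative.

```python
import math

def gompertz(day, n0, day0, rate, n_final):
    # Assumed Gompertz form: N(t) = K * exp(ln(N0/K) * exp(-c * (t - t0)))
    return n_final * math.exp(math.log(n0 / n_final) * math.exp(-rate * (day - day0)))

def daily_new(day, n0, day0, rate, n_final):
    """New cases on `day`: the difference between consecutive cumulative totals."""
    return gompertz(day, n0, day0, rate, n_final) - gompertz(day - 1, n0, day0, rate, n_final)

def peak(n0, day0, rate, n_final):
    """Day and height of the peak in the instantaneous daily rate.

    The rate dN/dt is largest when N = K / e, which gives
    t_peak = t0 + ln(ln(K / N0)) / c  and  rate_peak = c * K / e.
    """
    peak_day = day0 + math.log(math.log(n_final / n0)) / rate
    peak_rate = rate * n_final / math.e
    return peak_day, peak_rate

params = dict(n0=15.0, day0=57.0, rate=0.044, n_final=3_200_000.0)

peak_day, peak_rate = peak(**params)
new_at_peak = daily_new(114, **params)   # day 114 is April 24 if January 1 is day 0
new_end_of_may = daily_new(151, **params)  # day 151 is May 31 under the same convention
print(peak_day, peak_rate, new_at_peak, new_end_of_may)
```

The day-to-day difference is slightly smaller than the instantaneous peak rate because it averages the curve over a full day around the maximum.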

You can really see the noise in the observational data in these plots, particularly when using a linear scale.  The green circles highlight dips that seem to appear over the weekend.  Does the model (red crosses) fit these data?  Not perfectly, that's for sure, but it does more or less go right down the middle.






I can also fit a Gompertz model to observed death data.  Cumulative deaths and the daily death rates are shown below.  Deaths should be more accurate than the data on infections, because the dead are almost always reported.  Yes, some deaths may slip through the cracks because those deaths can sometimes be attributed to other causes.  It also turns out that in some cases deaths don't hit the books until well after they occurred.  So, while the deaths are recorded, they can show up well after they happened and can bias the data.  







As with the infection cases, the observational data departed from the exponential model quite a while ago.  The Gompertz model follows the observed data much better.  The best-fit parameters show a total of 320,000 deaths after infinite time.  We will reach 160,000 on May 7.  The peak death rate will be on May 4 with 5,179 deaths per day.  Again, these numbers are in conflict with at least some of the pros, who say we have reached (or will very shortly reach) a peak in deaths.  

Going forward, I'll be showing these plots and will mention any major changes in parameters and predictions.  




