
Saturday, May 9, 2020

COVID-19 May 09, 2020


For the past few days, I've had little time to do more than post daily plots.  Let's catch up on where things stand, starting with infection cases and then moving to the death statistics.

The number of new cases each day is shown below.  Note that I have added a 7-day running average (orange) to help smooth out the oscillations with a period of 7 days.  Per numerous previous posts, the data followed the Gompertz model (red) right up until about the peak in early April.  Then, rather than starting to slowly decrease, the number of daily cases remained more or less constant.
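For readers who want to reproduce the smoothing, here is a minimal sketch of a 7-day running average in plain Python. It assumes a trailing window (each point averages that day and the six days before it); the plots may use a centered window instead. The daily counts are made up purely to show how a weekly reporting pattern averages out.

```python
def running_average(values, window=7):
    """Trailing running average: entry k averages `window` consecutive values."""
    if window < 1:
        raise ValueError("window must be at least 1")
    averages = []
    for i in range(window - 1, len(values)):
        chunk = values[i - window + 1 : i + 1]
        averages.append(sum(chunk) / window)
    return averages


# Made-up daily counts with a repeating weekly pattern (NOT real data).
weekly_pattern = [30, 28, 25, 31, 33, 20, 15]
daily = weekly_pattern * 3

smoothed = running_average(daily, window=7)
# Every 7-day window contains one full weekly cycle, so the oscillation
# cancels and each smoothed value equals the weekly mean (182 / 7 = 26.0).
print(smoothed[0], len(smoothed))
```

Because the window length matches the period of the oscillation, a purely weekly pattern is removed entirely, which is why 7 days is the natural choice here.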

In a post about one week ago I discussed how the increase in testing could partly explain the constancy of the new daily cases.  In short, the more testing you have, the more new cases you might expect to find.  A normalization of the data taking into account the change in testing suggested that at least some of the constancy was an artifact of the testing effect.
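The normalization itself is in the earlier post and is not repeated here. As a rough illustration only, and not necessarily the method used there, one simple way to account for testing volume is to look at the fraction of tests that come back positive. The numbers below are made up.

```python
def positivity(new_cases, new_tests):
    """Fraction of tests that are positive each day (None if no tests)."""
    return [c / t if t > 0 else None for c, t in zip(new_cases, new_tests)]


# Made-up example (NOT real data): cases stay flat while testing doubles.
new_cases = [30000, 30000, 30000]
new_tests = [150000, 200000, 300000]

rates = positivity(new_cases, new_tests)
# Flat case counts combined with rising test counts give a falling
# positive fraction, which is the sense in which "constant" raw cases
# can hide an underlying decrease.
print(rates)
```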



If you look at the data since around April 22, the 7-day average has been very slowly decreasing.  I cannot say whether this is statistically significant.  If it is, it's probably on the hairy edge of significance, because there are clearly past variations of nearly comparable magnitude.  Don't put too much (or any) faith in the trend.

Whether decreasing or remaining constant, it is clear that the number of new infections is not increasing despite an increase in testing.  This is a useful baseline: if new cases start to rise without a corresponding increase in testing, that would suggest infection rates really are increasing.  This is something to keep an eye on as more and more states open up and the stupid, gullible, misinformed, and conspiracy loons are left to their own devices.

Since the number of new daily cases has remained nearly constant since early April, it should be no surprise that the total number of cases has been increasing almost linearly, at the rate corresponding to the orange 7-day average in the daily new cases plot.  The slope of the line in the cumulative case plot below is exactly equal to the average number of daily new cases, which currently stands at 24,481 cases per day averaged over the last 7 days. (See Algebra is useful.  Calculus, which relates slopes to curves, is also useful!)
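The link between the slope of the cumulative curve and the average of the daily numbers can be checked directly. The sketch below uses made-up daily counts (in thousands, NOT real data); the function name `average_slope` is just for illustration.

```python
from itertools import accumulate


def average_slope(cumulative, window=7):
    """Average rise per day of the cumulative curve over the last `window` days."""
    return (cumulative[-1] - cumulative[-1 - window]) / window


# Made-up daily new cases, in thousands (NOT real data).
daily = [25, 23, 27, 22, 26, 24, 25, 23, 28, 21]
cumulative = list(accumulate(daily))  # running total of the daily counts

mean_daily = sum(daily[-7:]) / 7
slope = average_slope(cumulative, window=7)

# The two numbers are the same quantity computed two ways: the change in
# the running total over 7 days is just the sum of those 7 daily counts.
print(mean_daily, slope)
```

The same arithmetic turns a daily average into a weekly total: 24,481 cases per day over 7 days is 171,367 cases per week.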


I have continued to leave the Gompertz model on the cumulative case plot even though it is clearly of little utility other than as a comparison to a bygone era.  Had we stayed on the Gompertz model, we'd still be below 1.2M cases.  We've blown through that with cases increasing by 20-some thousand every day (or over 100,000 per week!).  At least it's a linear increase and not exponential.  At least not yet.
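For reference, one common form of the Gompertz curve is N(t) = a·exp(−b·exp(−c·t)), where a is the final plateau. The sketch below evaluates that form and its derivative (the daily count). The parameterization and the parameter values are assumptions chosen for illustration; they are not the fitted values behind the plots.

```python
import math


def gompertz_cumulative(t, a, b, c):
    """Cumulative count at time t for N(t) = a * exp(-b * exp(-c * t))."""
    return a * math.exp(-b * math.exp(-c * t))


def gompertz_daily(t, a, b, c):
    """Time derivative of the cumulative curve, i.e. the daily count."""
    return a * b * c * math.exp(-c * t) * math.exp(-b * math.exp(-c * t))


# Illustrative parameters only (NOT fitted to any data).
a, b, c = 1.0, 10.0, 0.1

# The daily curve peaks where b * exp(-c * t) = 1, i.e. at t = ln(b) / c,
# and at that moment the cumulative count is a / e of the final plateau.
peak_day = math.log(b) / c
print(peak_day, gompertz_cumulative(peak_day, a, b, c))
```

The key property for this discussion is that, after the peak, the Gompertz daily count falls steadily toward zero, so the cumulative curve bends over toward the plateau a. Data that stays flat after the peak instead keeps climbing in a straight line and leaves the model behind.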

Now let's turn to the death statistics.  These, in principle, should be more robust against influences like testing.  The dead are almost always reported, but the cause of death is not always properly categorized.  It is now coming to light that deaths early in the pandemic were falsely attributed to things like pneumonia or influenza.  Post-mortem testing has revealed that many of these were actually COVID-19-related.  Also, those who die at home from the coronavirus are not always properly added to the statistics.  Some states require testing before those dead are added to the total.  That process can take days to weeks.  Further, once a cause of death is confirmed as COVID-19, the death is usually recorded on or after the testing date, not on the date it occurred.  Bottom line: deaths should be more robust than the case numbers, but it is still a messy metric.

As just a couple of examples of the problems with the death statistics, I bring you New York and Pennsylvania.  On May 6, New York added 720 deaths on top of the 232 deaths reported for that day.  Where did the 720 come from?  It was an accumulation of prior deaths that had not been properly accounted for, either because the cause of death was misdiagnosed or because the deaths had simply slipped through the reporting cracks.  So, not only does this adjustment screw up the historical data, it causes a large, spurious bump for New York (and to some extent for the whole US) on May 6.  On the other hand, Pennsylvania removed 201 deaths on April 23.  These deaths were reclassified as probable cases that needed further investigation in order to be confirmed.

The number of daily deaths is below, including all the idiosyncrasies like those mentioned above PLUS the vagaries of the weekly reporting oscillation.  I have also added a 7-day running average to try to at least smooth out the oscillations and some of the noise.  You'll note that, like the daily cases, the number of daily deaths followed the Gompertz model rather nicely up until about April 15.  At that point, it went approximately constant.  I find this interesting because that breakpoint lags the new cases by approximately 9 days: new cases went constant (on average) on about April 6 and the number of daily deaths went flat (on average) on about April 15.  That lag is approximately what you might expect between the time of diagnosis at a hospital (i.e., the case is severe enough to warrant an actual test) and when the patient dies.

The lagged correlation between daily deaths and daily new infections suggests that the flattening of both has some basis in reality and is not entirely due to testing.  None of this is a sure thing--far from it.  I'm just trying to infer what I can out of very messy data and statistics.  Do not take any of this as being anything close to definitive.  I can say that none of this would pass peer-review muster in a reputable scientific journal.  It's not that it's necessarily wrong, it's just that a much deeper and more careful analysis of the raw data and the statistics is needed.  I don't have the time (or the inclination) for that.  I am rather sure that many future epidemiological Ph.D. theses will spring out of these data.
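The lag above was read off the plots by eye. For anyone who wants to put a number on it, a minimal sketch of a lagged correlation follows: shift the death series back by each candidate lag and compute the Pearson correlation with the case series. The data are synthetic (deaths are built as a fixed fraction of cases 9 days earlier), so the code only demonstrates the method, not a result.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def lag_scores(cases, deaths, max_lag):
    """Correlation of cases on day t with deaths on day t + lag, per lag."""
    scores = {}
    for lag in range(max_lag + 1):
        x = cases[: len(cases) - lag]
        y = deaths[lag:]
        scores[lag] = pearson(x, y)
    return scores


# Synthetic series (NOT real data): cases ramp up for 20 days, then go flat.
cases = [1000 * min(t, 20) for t in range(60)]
# Synthetic deaths: 5% of the cases reported 9 days earlier (assumed values).
deaths = [0.05 * cases[max(t - 9, 0)] for t in range(60)]

scores = lag_scores(cases, deaths, max_lag=14)
best = max(scores, key=scores.get)
print(best)
```

With real data the peak in correlation will be much broader, both because of reporting noise and because the time from diagnosis to death varies from patient to patient.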

Finally, the cumulative number of deaths is shown below.  For now, the data is still following the Gompertz curve.  The data had sagged slightly below the model over the last week or so.  This is consistent with the number of daily deaths falling below the modeled number, as seen in the daily death plot.  However, with the daily deaths now trending toward a constant, the data has caught up to the model, and may very well exceed it.  The Gompertz model predicts a continued fall in daily deaths and, therefore, a slowing in the rate at which deaths accumulate.  The next week should be the truthteller.

If deaths do not continue to fall, that suggests that the pandemic has indeed plateaued and is no longer decreasing.  After all the social distancing (and even given the handful of renegade states and individual miscreants), the best we've been able to achieve is a plateau, not a decrease.  We've flattened the curve to a constant rate of increase rather than exponential.  What do you suppose will happen as restrictions are lifted?  It's not hard to make an educated guess at the answer to this question.  Ultimately, the death rate will let us know.

Unfortunately, there is a substantial lag between policy and deaths.  I've likened that lag to the light travel time from distant stars.  When we see a star in the night sky, we are seeing that star as it was in the past, when the light first left the star's surface.  That could be 100s or 1000s or millions of years ago!  It tells us nothing about what the star is doing now.  Death statistics are a time machine that tells us about reality a week or two weeks or perhaps three or more weeks ago.  Deaths reported today reflect the policies in place, and the state of the pandemic, before those people first became infected.  It takes time before changes in policy and the pandemic show up in the death statistics.  But they will.








