Monday, April 20, 2020

COVID-19 April 20, 2020

The usual weekend data aberration was in full effect this weekend. Both new daily cases and daily deaths took a dip. The daily deaths, in particular, are starting to show a very large oscillation around the mean. Over the last five days, the low was 1631 yesterday and the high was 2438 on April 15; just two days before that high, the count was 1462. So there's an oscillation of about +/- 500 around a mean of about 2000, or 25%. That's huge, and it makes it very hard to pin down a curve that best fits the data. The variations in new daily cases are similar, ranging from the mid-20,000s to the mid-30,000s.
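As a quick check on that arithmetic, here is a minimal Python sketch that uses only the high and low quoted above. Taking the midpoint of those two values as the "mean" is a simplification for illustration, not a proper average over every day.

```python
# Daily death counts quoted in the post.
low = 1462    # two days before April 15
high = 2438   # April 15

midpoint = (high + low) / 2        # centre of the swing (stand-in for the mean)
half_range = (high - low) / 2      # the "+/-" amplitude
relative = half_range / midpoint   # amplitude as a fraction of the centre

print(f"midpoint   = {midpoint:.0f}")
print(f"half-range = {half_range:.0f}")
print(f"relative   = {relative:.1%}")
```

The midpoint and half-range come out close to the rounded figures of 2000 and 500 used above.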

Due to the very large oscillations, I will only be adjusting the best-fit curves roughly weekly.  If I try to adjust them daily, I will end up chasing the oscillating data up and down. For now, the curves fitting the cumulative statistics for total cases and total deaths are reasonably close to the historical data.

Some states, like Florida, are starting to relax restrictions on social isolation. It will be interesting to see how this affects the data. At present, New York and New Jersey still account for roughly 40% of the numbers across the US. These two states disproportionately affect the statistics, and the trend from those two states has been overwhelmingly downward. If states like Florida begin to see an increase in cases and deaths, they could start to bend the curve back upward. If that happens, it will invalidate the use of a Gompertz model, which assumes a relatively uniform infection process across the population.
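To make the Gompertz concern concrete, here is a small Python sketch. The form K * exp(-b * exp(-c * t)) is the standard textbook Gompertz curve; the names K, b, c and their values are hypothetical and are not the fitted parameters behind the curves in these posts. The point is only that the daily increments of a single Gompertz curve rise to one peak and then fall, so a second upturn is outside what the model can describe.

```python
import math

def gompertz(t, K, b, c):
    """Cumulative count at day t for a standard Gompertz curve."""
    return K * math.exp(-b * math.exp(-c * t))

# Hypothetical parameters, chosen only for illustration.
K, b, c = 100_000.0, 10.0, 0.1

# Daily new counts are the day-to-day differences of the cumulative curve.
daily = [gompertz(t, K, b, c) - gompertz(t - 1, K, b, c) for t in range(1, 121)]
peak_day = daily.index(max(daily)) + 1

# Check the single-hump shape: strictly rising up to the peak, strictly falling after.
rising = all(x < y for x, y in zip(daily[:peak_day - 1], daily[1:peak_day]))
falling = all(x > y for x, y in zip(daily[peak_day - 1:], daily[peak_day:]))
print(peak_day, rising, falling)
```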

It may very well be that to model the US it is necessary to model each state individually and then add the results together to obtain the solution for the country as a whole. This is far too time-consuming for me to undertake, and even if I were able to do it, the uncertainty in the models would likely grow to the point that they were no longer useful. The certainty of statistical calculations often depends strongly on the number of data points in the calculation. As I already showed, the oscillations in the data aggregated across the US are already so large that the curve fitting is becoming problematic. These oscillations are often worse at the state level, and the number of data points in each state is much smaller than for the country as a whole.
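For what it's worth, the "model each state and add them up" idea can be sketched in a few lines of Python. Everything here is an assumption for illustration: the two "states" are invented, each follows a standard Gompertz curve K * exp(-b * exp(-c * t)), and the parameters are made up so that the second state peaks much later than the first. The sketch only shows that the summed daily series can have more than one peak, which is something a single Gompertz curve cannot reproduce.

```python
import math

def gompertz(t, K, b, c):
    """Cumulative count at day t for a standard Gompertz curve."""
    return K * math.exp(-b * math.exp(-c * t))

# Two hypothetical states (made-up parameters); the second peaks much later.
states = {
    "early_state": (100_000.0, 10.0, 0.1),
    "late_state": (100_000.0, math.exp(8.0), 0.1),
}

days = range(1, 151)

# National daily count = sum over states of each state's daily increment.
daily_total = [
    sum(gompertz(t, *p) - gompertz(t - 1, *p) for p in states.values())
    for t in days
]

# Days on which the summed daily series has a local maximum.
peaks = [
    i + 1
    for i in range(1, len(daily_total) - 1)
    if daily_total[i - 1] < daily_total[i] > daily_total[i + 1]
]
print(peaks)
```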

1 comment:

Surfaholic said...

I was going to suggest weekly totals for both confirmed cases and deaths. We do a couple of things in a commercial lab:

1. Weekly totals and annualized run rates.
2. Daily averages by week/month/qtr.
3. Weekly change in sample volume.


There are just too many variables that can affect daily counts: inclement weather, logistical issues, lab issues, etc.
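A rough Python sketch of the weekly roll-up described in the list above, covering weekly totals, daily averages by week, and week-over-week change. The function names and the daily counts are made up for illustration (the counts just mimic a weekend dip); they are not real case or lab data.

```python
def weekly_summary(daily_counts):
    """Group daily counts into consecutive 7-day blocks, dropping any partial final week."""
    weeks = []
    for start in range(0, len(daily_counts) - 6, 7):
        total = sum(daily_counts[start:start + 7])
        weeks.append({"total": total, "daily_avg": total / 7})
    return weeks

def weekly_change(weeks):
    """Percent change in each weekly total relative to the previous week."""
    return [
        100.0 * (cur["total"] - prev["total"]) / prev["total"]
        for prev, cur in zip(weeks, weeks[1:])
    ]

# Made-up daily counts with a weekend dip at the end of each week.
counts = [100, 120, 130, 125, 110, 70, 60,
          110, 130, 140, 135, 120, 80, 70]

weeks = weekly_summary(counts)
changes = weekly_change(weeks)
for n, week in enumerate(weeks, start=1):
    print(f"week {n}: total={week['total']}, daily avg={week['daily_avg']:.1f}")
print([f"{c:+.1f}%" for c in changes])
```

Summing over whole weeks means each block contains exactly one weekend, so the weekend dip no longer shows up as a swing from one point to the next.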