Making Sense of OC COVID-19 Data

I’m reading Camus’s The Plague. I read it before, years ago. I didn’t really relate to it. It seemed more like a historical curiosity with a few themes worthy of wider philosophical contemplation. This time around it reads more like a procedural drama. Details that I skipped over and completely forgot the first time glow with significance. For example, this passage:

The population of the town was about two hundred thousand. There was no knowing if the present death-rate were really so abnormal. This is, in fact, the kind of statistics that nobody ever troubles much about — notwithstanding that its interest is obvious. The public lacked, in short, standards of comparison.

It’s an ignorance I’ve been feeling myself with the COVID-19 outbreak, especially as the statistics relate to testing. The headline numbers, case counts and fatality rate, depend entirely on how much testing is done. The entire policy debate over maintaining lockdown or reopening the economy would seem to hinge on our testing capacity. And yet testing gets only passing attention in most popular media and public discourse.

I’ve tried to understand where we are in the progress of the epidemic (getting better? getting worse? ready to reopen dine-in restaurants or gyms or my company’s office?) by focusing on the numbers for my home, Orange County, CA. They are published by the local public health agency, the Orange County Health Care Agency (OCHA), and they do include testing data. The agency provides a dashboard here:

I find the dashboard inadequate for making sense of the epidemic in any meaningful way because it lacks context in the way that Camus noted. My gripes:

  • OCHA does not make it easy to export this data. I cannot find a CSV to download. The data tables are hidden in modals. To work with the data, I’ve had to enter it by hand, and then re-enter it as the figures continue to be revised days and weeks after they were first posted.
  • Case numbers are not shown in relation to testing. If case numbers increase steeply but test numbers are increasing even more steeply, that’s probably good news. This dashboard, however, would not give you that impression.
  • The graphs provide some sense of the progress of the epidemic, but it is still hard to make sense of what they mean.
  • The data gives no sense of the rate of infection. Based on reported numbers, how many people in Orange County would I need to encounter at random before I might expect to cross paths with someone shedding the virus?
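
The arithmetic behind that last question can be sketched roughly as follows. This is only an illustration, not a method OCHA uses or endorses. The 14-day infectious window, the undercount factor, the county population of about 3.2 million, and the daily case counts are all placeholder assumptions, not reported figures.

```python
def infectious_share(recent_daily_cases, population, undercount_factor=1.0):
    """Estimate the share of the population currently infectious.

    recent_daily_cases: reported new cases per day over an ASSUMED infectious window.
    undercount_factor: ASSUMED ratio of true infections to reported cases.
    """
    active = sum(recent_daily_cases) * undercount_factor
    return active / population


def expected_encounters(share):
    """Average number of random encounters until the first infectious person.

    This is the mean of a geometric distribution with success probability `share`.
    """
    if share <= 0:
        return float("inf")
    return 1.0 / share


def chance_of_contact(share, encounters):
    """Probability that at least one of `encounters` random people is infectious."""
    return 1.0 - (1.0 - share) ** encounters


# Hypothetical inputs, chosen only to show the calculation.
reported = [100] * 14          # placeholder: 100 reported cases a day for 14 days
share = infectious_share(reported, population=3_200_000, undercount_factor=5)

print(f"Estimated infectious share: {share:.4%}")
print(f"Expected encounters before meeting one: {expected_encounters(share):.0f}")
print(f"Chance of at least one in 50 encounters: {chance_of_contact(share, 50):.1%}")
```

The result is only as good as the undercount factor, which is exactly the number that better testing would pin down.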

I’ve tried to provide some basic context by creating my own Google Sheet with tables and graphs here:

What I’ve tried to do in particular is provide a greater sense of the important relationship between testing and results. I’ve also focused on 7-day averages in order to smooth out reporting anomalies. Finally, I’ve tried to tie results in a more meaningful way to actual public health benchmarks.
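
For anyone who would rather script this than use a spreadsheet, here is a minimal sketch of the two calculations: a trailing 7-day average and the share of tests that come back positive over the same window. The function names and the daily figures are hypothetical placeholders, not values from the OCHA dashboard or from my sheet.

```python
def rolling_mean(values, window=7):
    """Trailing average over `window` days; None until a full window exists."""
    out = []
    for i in range(len(values)):
        if i + 1 < window:
            out.append(None)
        else:
            chunk = values[i + 1 - window : i + 1]
            out.append(sum(chunk) / window)
    return out


def rolling_positivity(cases, tests, window=7):
    """Positive share of tests over a trailing window (cases summed / tests summed)."""
    out = []
    for i in range(len(cases)):
        if i + 1 < window:
            out.append(None)
            continue
        case_sum = sum(cases[i + 1 - window : i + 1])
        test_sum = sum(tests[i + 1 - window : i + 1])
        out.append(case_sum / test_sum if test_sum else None)
    return out


# Hypothetical daily figures, for illustration only.
daily_cases = [90, 110, 100, 120, 80, 100, 100, 170]
daily_tests = [2000] * 8

for day, (avg, pos) in enumerate(
    zip(rolling_mean(daily_cases), rolling_positivity(daily_cases, daily_tests)), start=1
):
    if avg is not None:
        print(f"day {day}: 7-day avg cases {avg:.1f}, 7-day positivity {pos:.2%}")
```

Summing cases and tests over the window before dividing, rather than averaging daily ratios, keeps a single low-volume reporting day from dominating the positivity figure.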

Insights I believe this simple reframing has provided:

  • New cases, in either gross or relative terms, are not declining. I imagine many people in OC would be surprised to find that case numbers on May 20th were up around 7% over the previous two weeks, or 3.5% even when you control for additional testing.
  • Testing is inadequate and is actually in decline over the last two weeks.
  • OC is nowhere near the minimum 2% daily testing threshold recommended by the Harvard Safra Center roadmap.
  • New cases in OC have yet to peak even when controlling for testing.
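
One way to make comparisons like these, sketched here under loud assumptions: "controlling for testing" is read as comparing the positive share of tests between two periods, and the 2% threshold is read as tests per day equal to 2% of the population. The period totals and the 3.2 million population figure are hypothetical placeholders, not the numbers behind the percentages above.

```python
def pct_change(before, after):
    """Percent change from `before` to `after`."""
    return (after - before) / before * 100.0


def positivity_change(cases_before, tests_before, cases_after, tests_after):
    """Percent change in the positive share of tests between two periods."""
    return pct_change(cases_before / tests_before, cases_after / tests_after)


def daily_test_target(population, share=0.02):
    """Tests per day needed to test `share` of the population daily."""
    return population * share


# Hypothetical two-week totals, for illustration only.
raw = pct_change(1000, 1100)
adjusted = positivity_change(1000, 20_000, 1100, 21_000)
target = daily_test_target(3_200_000)  # ASSUMED approximate county population

print(f"Raw change in cases: {raw:+.1f}%")
print(f"Change after accounting for test volume: {adjusted:+.1f}%")
print(f"Daily tests needed for a 2% threshold: {target:,.0f}")
```

When testing grows between the two periods, the adjusted change comes out smaller than the raw change, which is the pattern behind the two percentages in the first bullet.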

Questions that remain:

  • What the hell is going on with testing? Why are test counts in Orange County down? Is it the lag in reporting?
  • Does the warmer weather suppress the virus? The numbers don’t show that. Is that because laxer social distancing has offset the seasonal climatic effects?
  • What do the testing numbers mean? Is a new case dated according to when the test was taken or when the result was available? Is this just viral tests or does it include antibody tests?
  • How many people out there are really contagious? What is the risk in going to the store? Eating out at a restaurant? Marching in a protest? Going to church?

One final big question: how has Orange County done in managing the outbreak? Has it done a good job? One way of looking at it: No. New cases continue to grow. Testing is inadequate. Is there a county-level plan for managing the outbreak and ensuring public safety?

But an alternate take might be: Sure, new cases are growing. But the numbers have remained relatively small. We never turned the curve, but it has remained flat from the start. Our hospitals, from what I understand, have not been overwhelmed. Orange County, collectively, has done OK.

This is consistent with the lesson offered by that widely circulated Washington Post simulation:

Orange County thus far looks like one of the last two social distancing scenarios. What’s interesting to note with the last one in particular, “extensive social distancing”, is that the curve peaks much later and in some cases never peaks at all. Maybe that’s what we’re seeing here.

What I fear is that rather than being now on the back slope of the outbreak, we’re on a plateau or even a second up-slope. Due to increased social interaction, are we moving from “extensive social distancing” to “moderate social distancing”? If so, this means case rates could hold steady or even climb over the summer and we will nurse the virus at a time when it might be squashed. Then, in the fall, we could see a new explosion of cases, and perhaps Orange County ends up looking like the denser urban centers that got hammered this past winter during the first wave.

