Triage entered the lexicon of the coronavirus era last Friday, when President Donald Trump claimed that Google is working on a tool to determine who’s eligible for testing. In this case, only people who display serious symptoms will be able to get a test for the novel coronavirus. But triage—or targeted—testing is the wrong approach at this early stage of the fast-breaking public health emergency. In order to successfully execute triage testing, the government would need a strong data set that can accurately select who should and should not receive a coronavirus test, which, as a result of the failed testing efforts thus far, the US government does not currently have. Worse, this strategy distorts the collection of data, which may lead to future mistakes in policymaking.
Kaiser Fung has been the data science lead at various companies, including SiriusXM, Vimeo, and American Express. He is the author of Numbers Rule Your World and blogs at Junk Charts.
At an early stage of an epidemic, targeted testing leads to an uneasy trade-off. As a former data scientist for large companies, I tend to think of it in a business context. In the past, I developed targeting strategies that allow marketers to select which customers will receive special offers, such as “buy one, get one free.” Instead of simply offering the deal to everyone, we identified a subset of customers who were deemed most likely to spend more as a result of the discount. The goal is to send the right offer to the right people at the right time. This targeting strategy reduces the volume of marketing offers, leading to fewer “false alarms,” or promotions delivered to customers who don’t need the extra motivation. Those wasted discounts produce the lousy paradox of selling more while earning less. But targeted offers also come at the expense of missed sales. Our predictive system, even when informed by decades of data, is not perfect, and some of the people excluded from offers would have welcomed them.
But while this might be an effective strategy for BOGOs, such triage doesn’t make sense for public health. To put it bluntly, triage reduces the volume of testing and leads to fewer “false alarms,” or uninfected people recommended for testing. But it also increases the likelihood that infected people are skipped over. A false alarm, meaning someone who isn’t infected but is deemed high risk and told to test, cannot happen without a test recommendation. If fewer people qualify for coronavirus testing, fewer uninfected people will take the test. But more exclusions also mean that a higher number of infected individuals will not be considered eligible for testing. When the goal is containment of transmission, a false alarm is less harmful than missing someone who is harboring the virus. After all, missed positives are infected people who’ve been told they don’t require testing, and who may therefore unknowingly spread the virus.
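To make the trade-off concrete, here is a minimal sketch in Python. The ten people, their risk scores, and the thresholds are all invented for illustration; they are not real epidemiological data, and the scoring rule is a stand-in for whatever triage criteria are actually used.

```python
from typing import List, Tuple

# Hypothetical people: (risk score from some triage rule, truly infected?)
PEOPLE: List[Tuple[float, bool]] = [
    (0.95, True), (0.90, True), (0.80, False), (0.70, True),
    (0.60, False), (0.50, True), (0.40, False), (0.30, False),
    (0.20, True), (0.10, False),
]


def triage_errors(people: List[Tuple[float, bool]], threshold: float) -> Tuple[int, int]:
    """Return (false_alarms, missed_infections) when only people whose
    score is at or above the threshold are told to get tested."""
    false_alarms = sum(
        1 for score, infected in people if score >= threshold and not infected
    )
    missed = sum(
        1 for score, infected in people if score < threshold and infected
    )
    return false_alarms, missed


# A stricter threshold means fewer people qualify for a test.
for threshold in (0.0, 0.45, 0.85):
    false_alarms, missed = triage_errors(PEOPLE, threshold)
    print(f"threshold={threshold:.2f}: "
          f"false alarms={false_alarms}, missed infections={missed}")
```

In this toy population, tightening eligibility drives false alarms down and missed infections up. Which error matters more depends on the goal; for containment, the missed infections are the costly ones.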
Triage testing works, in the sense of administering the right test to the right people, only if the underlying method used to predict who should be tested is highly accurate. Given how much we still don’t know about this new virus, and our inability to track its spread, it’s not possible to reliably make such predictions. Instead, the present targeting strategy creates a false sense of security. People who are told they don’t need a test believe they are uninfected. Communities with low counts of positive tests are assumed to be safe havens when, in reality, ineffective triage testing has missed infected people. The caution against this statistical fallacy is often phrased as “Absence of evidence is not evidence of absence.”
All the models used to predict the spread of this disease are trained on historical data, the information that we’ve gathered on infections to this point. These models are then updated continuously as new data arrive. At the start of any epidemic, such data are sparse, and the models weak. But the present situation is worse than that: We’re not conducting a lot of tests overall, and the ones we are conducting, on triaged subgroups of the population, deliver biased data. That means the initial models used to decide who gets tested are bound to be highly inaccurate.
Worse, people who are deemed “safe” by the current predictive model—those who have not traveled, have not knowingly been in direct contact with someone who’s already tested positive, or are not experiencing severe symptoms themselves—will not be tested, and thus they will continue to be excluded from the data set altogether. Turning back to the business example, if the marketer sent offers only to female shoppers, all new entries in the database would come from women, making it virtually impossible for men to be selected for future deals, even if the company launches a new line of menswear.
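The feedback loop in the marketing example can be sketched in a few lines of Python. Everything here is hypothetical: the response rates, the starting database, and the deliberately naive "model" that simply targets whichever group has the best observed response rate.

```python
from typing import Dict, Tuple

# Hypothetical true responses per 100 offers; the model never sees these directly.
TRUE_RESPONSES_PER_100: Dict[str, int] = {"women": 10, "men": 20}


def run_targeting(rounds: int, offers_per_round: int = 100) -> Tuple[Dict[str, int], Dict[str, int]]:
    """Repeatedly send offers only to the group with the best observed
    response rate, then add the outcomes to the database."""
    # Historical database: only women have ever received offers.
    offers = {"women": 100, "men": 0}
    responses = {"women": 10, "men": 0}
    for _ in range(rounds):
        # A group with no data gets an estimated rate of zero.
        estimate = {
            group: (responses[group] / offers[group] if offers[group] else 0.0)
            for group in offers
        }
        target = max(estimate, key=estimate.get)
        offers[target] += offers_per_round
        responses[target] += TRUE_RESPONSES_PER_100[target] * offers_per_round // 100
    return offers, responses


offers, responses = run_targeting(rounds=5)
print("offers sent:", offers)
print("responses recorded:", responses)
```

Because men start with no records, their estimated response rate stays at zero and they are never selected, even though in this invented setup they would respond at twice the rate. Testing only people who match the current criteria builds the same blind spot into the infection data.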
When only people with serious symptoms are tested, the predictive model becomes blind to those who have no symptoms or only mild ones. But as we’ve learned, serious symptoms are not required for community spread of Covid-19. In fact, doctors have said that this novel coronavirus becomes highly contagious early in the incubation period, sometimes even before the patient exhibits any symptoms. There is also ample evidence from other countries, especially South Korea, that aggressive and extensive testing allows clusters of cases to be identified, making it easier to contain spread to other communities and mitigating the need for nationwide lockdowns. The distortion of the data in the US will only lead to further policy errors.
In other words, the world faces a classic explore-versus-exploit trade-off. A targeted strategy favors exploitation: It assumes confidence that we have enough knowledge to take decisive actions now. A broad-based testing regime, on the other hand, emphasizes exploration: It is more appropriate when we are still uncertain about the information we have.
In my previous jobs, my marketing colleagues worried that our predictive model was too selective, leaving too much money on the table. The only way to alleviate this concern was to send promotions to a small group of customers our targeting model did not recommend. We could easily observe whether customers who received the offers made use of the discounts, but we couldn’t know whether our predictive model was making incorrect exclusions, since those it did exclude didn’t receive the promotion. It’s impossible to determine how much people would have purchased if we had sent them a discount: Would they have bought the same or more? Similarly, in the case of testing for Covid-19, if we never test someone, we will never know whether they’re a carrier, unless, of course, the individual later becomes very sick. Widening the scope of testing allows the medical community to produce the control data, track progress, and make course corrections. In an ongoing operation, we should tune the degree of exploration to react to the state of our knowledge. With so much unknown, exploring should be prioritized. This means broader testing.
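One way to picture a tunable degree of exploration is the sketch below, in Python. It is an illustration under invented assumptions, not a description of any real testing protocol: the risk scores, the infection pattern, the threshold, and the `explore_rate` parameter are all hypothetical.

```python
import random
from typing import List, Optional


def choose_tested(scores: List[float], threshold: float,
                  explore_rate: float, rng: random.Random) -> List[bool]:
    """Test everyone at or above the threshold (exploit), plus a random
    share of everyone the rule would exclude (explore)."""
    return [s >= threshold or rng.random() < explore_rate for s in scores]


def excluded_positive_rate(scores: List[float], infected: List[bool],
                           tested: List[bool], threshold: float) -> Optional[float]:
    """Share of positives among tested people whom the triage rule would
    have excluded. Returns None when no such person was tested, because
    the rule's misses are then unobservable."""
    results = [inf for s, inf, t in zip(scores, infected, tested)
               if t and s < threshold]
    return sum(results) / len(results) if results else None


# Hypothetical population: 100 people, scores 0.00 to 0.99, and every
# tenth person infected regardless of score (so the score is a poor guide).
scores = [i / 100 for i in range(100)]
infected = [i % 10 == 0 for i in range(100)]
threshold = 0.8

for explore_rate in (0.0, 0.25, 1.0):
    rng = random.Random(0)  # fixed seed so repeated runs match
    tested = choose_tested(scores, threshold, explore_rate, rng)
    rate = excluded_positive_rate(scores, infected, tested, threshold)
    print(f"explore_rate={explore_rate}: tested={sum(tested)}, "
          f"positive rate among excluded={rate}")
```

With `explore_rate` at zero, nobody below the threshold is tested and the function returns `None`: there is no way to learn how many infections the rule is missing. Any positive rate produces the control data described above, and the rate can be lowered later as knowledge improves.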
WIRED Opinion publishes articles by outside contributors representing a wide range of viewpoints.