A non-peer-reviewed paper took off on the internet... and then came the critics.
【Claim】
The 2020 Sturgis motorcycle rally led to 250,000 COVID-19 cases.
【Rating】
Unproven
【Original text】
In September 2020, social media was abuzz over a report from the IZA Institute of Labor Economics that linked 266,796 COVID-19 cases (a figure that was reported as “more than 250,000” in various headlines) to the Sturgis motorcycle rally held in Sturgis, South Dakota.
The IZA Institute of Labor Economics did publish a paper estimating that the rally was linked to a surge of approximately 250,000 COVID-19 cases (representing a cost of $12.2 billion). However, while the rally likely contributed to a rise in coronavirus cases, the figures stated here are estimates from a non-peer-reviewed paper and have not been demonstrated definitively. Furthermore, various statisticians and epidemiologists have indicated that the study had flaws.
Before we get to the expert opinions on this study, let’s dispel a few quick rumors on social media. This study did not claim, for instance, that 250,000 people tested positive for COVID-19 shortly after attending the rally. The research attempted to quantify how many cases of COVID-19 could potentially be linked to people who attended the rally, traveled to other locations, and then spread the disease among their communities.
It should also be noted that this is an estimate based on a wide variety of factors, not an actual headcount of COVID-19 patients who attended, or knew someone who attended, the rally. As mentioned above, this study was not peer-reviewed and was prefaced with a piece of text noting that “IZA Discussion Papers often represent preliminary work and are circulated to encourage discussion.”
The IZA paper’s finding that 250,000 COVID-19 cases were linked to the Sturgis rally was based on three key factors: anonymized smartphone data that showed an influx of out-of-state visitors and a sharp increase in foot traffic at “restaurants and bars, hotels, entertainment venues, and retail establishments”; a decrease in stay-at-home activity in the surrounding area; and Centers for Disease Control and Prevention (CDC) data that showed COVID-19 cases increased both in South Dakota, where the rally was held, and in the areas that Sturgis attendees traveled to in the days after the rally.
While this study may provide a broad estimate of how Sturgis could have impacted the COVID-19 pandemic, a number of epidemiologists and statisticians have taken issue with the models used in the study and with the report’s findings.
Joshua Clayton, South Dakota’s state epidemiologist, said that the study’s findings did “not align with what we know” and argued that IZA did not account for other contributing factors, such as the fact that schools reopened around the same time as the rally.
Local news outlet KEVN reported:
“From what we know the results do not align with what we know,” state epidemiologist Joshua Clayton said.
He mentioned that a white paper isn’t peer-reviewed. And pointed out the paper doesn’t note schools in the state also reopened close after the Rally ended, which could have attributed to the surge of cases in South Dakota.
Rex Douglas, the director of the Machine Learning for Social Science Lab (MSSL), Center for Peace and Security Studies, University of California San Diego, and Kevin Griffin, an assistant professor at the Vanderbilt School of Medicine, also took issue with the methodology used in this paper. Griffin, for instance, noted that cases were already on the rise when the rally took place, while Douglas noted that authorities simply don’t have the data to reach such a precise conclusion.
Douglas wrote:
They want to know if mass-events (protests, conventions, rallies) spread covid. But we don’t have individual level data on attendees and comparable stay-homes. So they resort to a diff-in-diff, looking to see if a place has more, less, or the same number of confirmed cases soon after an event than they ‘should.’ The argument is that the trend line for an entire location after time T can tell us if what happened on T is safe or risky.
For why this research design does not answer that question, imagine running your own experiment. Go outside and cough in a stranger’s face right now. Now if next week your county’s confirmed case rate goes up, that’s bad behavior, stays the same it’s ok, and goes down it’s good!
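The difference-in-differences ("diff-in-diff") comparison Douglas describes can be sketched in a few lines. This is a minimal illustration, not the IZA paper's actual model: every number below is hypothetical, and the variable names are invented for this example. It shows how anything else that changes only in the event location over the same period, such as schools reopening, ends up inside the estimate.

```python
def diff_in_diff(treated_before, treated_after, control_before, control_after):
    """Change in the event location minus change in the comparison location."""
    return (treated_after - treated_before) - (control_after - control_before)


# All values are hypothetical daily case rates, chosen only for illustration.
BASELINE = 10.0            # starting rate in both locations
BACKGROUND_TREND = 4.0     # rise that would have happened everywhere anyway
EVENT_EFFECT = 1.0         # assumed true effect of the event
OTHER_LOCAL_FACTOR = 3.0   # assumed unrelated local change (e.g. schools reopening)

control_before = BASELINE
control_after = BASELINE + BACKGROUND_TREND
treated_before = BASELINE
treated_after = BASELINE + BACKGROUND_TREND + EVENT_EFFECT + OTHER_LOCAL_FACTOR

estimate = diff_in_diff(treated_before, treated_after, control_before, control_after)
print(estimate)  # 4.0: the event effect (1.0) plus the other local factor (3.0)
```

The subtraction removes the shared background trend, but it cannot separate the event from other changes that hit only the event location at the same time.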
Jennifer Beam Dowd, the deputy director of the Leverhulme Centre for Demographic Science at the University of Oxford, also took issue with the paper’s conclusion in an article published on Slate. Generally speaking, Dowd argued that the researchers made assumptions that don’t always play out in reality. More specifically, Dowd took issue with how the study confidently presented a precise conclusion (266,796 COVID-19 cases) despite noisy results. She wrote:
The 266,796 number also overstates the precision of the estimates in the paper even if the model is taken at face value. The confidence intervals for the “high inflow” counties seem to include zero (meaning the authors can’t say with statistical confidence that there was any difference in infections across counties due to the rally). No standard errors (measures of the variability around the estimate) are provided for the main regression results, and many of the p-values for key results are not statistically significant at conventional levels. So even if one believes the design and assumptions, the results are very “noisy” and subject to caveats that don’t merit the broadcasting of the highly specific 266,796 figure with confidence, though I imagine that “somewhere between zero and 450,000 infections” would not have been as headline-grabbing.
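To make the point about confidence intervals concrete, here is a small sketch of what it means for an interval to "include zero." The estimate and standard error are hypothetical placeholders, not figures from the IZA paper, and the calculation assumes a simple normal approximation.

```python
from statistics import NormalDist


def confidence_interval(estimate, standard_error, level=0.95):
    """Two-sided interval under a normal approximation (an assumption here)."""
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for a 95% interval
    margin = z * standard_error
    return estimate - margin, estimate + margin


def includes_zero(low, high):
    """True when 'no effect at all' is still consistent with the interval."""
    return low <= 0.0 <= high


# Hypothetical values for illustration only.
noisy_low, noisy_high = confidence_interval(estimate=200.0, standard_error=150.0)
tight_low, tight_high = confidence_interval(estimate=200.0, standard_error=50.0)

print(includes_zero(noisy_low, noisy_high))  # True: cannot rule out zero effect
print(includes_zero(tight_low, tight_high))  # False: zero lies outside the interval
```

The same point estimate can be informative or nearly meaningless depending on the uncertainty around it, which is why a bare headline number says little on its own.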
The claim that 250,000 COVID-19 cases were linked to Sturgis is based on one study’s estimate of how the motorcycle rally could have impacted the pandemic. As several statisticians and epidemiologists have noted, the models used for this study contained flaws, and the report arrived at a conclusion that was more precise than the available data would have allowed.