Abstract
Background Forecasts and alternative scenarios of the COVID-19 pandemic have been critical inputs into a range of important decisions by healthcare providers, local and national government agencies, and international organizations and actors. Hundreds of COVID-19 models have been released. Decision-makers need information on the predictive performance of these models to help select which ones to rely on.
Methods We identified 383 published or publicly released COVID-19 forecasting models. Only seven models met the inclusion criteria of: estimating for five or more countries, providing regular updates, forecasting at least four weeks from the model release date, estimating mortality, and providing date-versioned sets of previously estimated forecasts. These models included those produced by: a team at MIT (Delphi), Youyang Gu (YYG), the Los Alamos National Laboratory (LANL), Imperial College London (Imperial), the USC Data Science Lab (SIKJalpha), and three models produced by the Institute for Health Metrics and Evaluation (IHME). For each of these models, we examined the median absolute percent error, relative to subsequently observed trends, for weekly and cumulative death forecasts. Errors were stratified by weeks of extrapolation, world region, and month of model estimation. For locations with epidemics showing a clear peak, we also evaluated each model's accuracy in predicting the timing of peak daily mortality.
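The core error metric can be illustrated with a minimal sketch. The code below is not the authors' published codebase; the data layout and column names (release_date, target_date, cum_deaths_pred, cum_deaths_obs) are assumptions made purely for illustration.

import pandas as pd

def mape_by_weeks_out(forecasts: pd.DataFrame, observed: pd.DataFrame) -> pd.Series:
    """Median absolute percent error on cumulative deaths, stratified by
    weeks of extrapolation. Column names are illustrative assumptions."""
    df = forecasts.merge(observed, on=["location", "target_date"])
    # Weeks elapsed between the model release date and the forecasted date.
    weeks_out = ((df["target_date"] - df["release_date"]).dt.days // 7).clip(lower=1)
    # Absolute percent error of each forecasted cumulative death count.
    ape = (df["cum_deaths_pred"] - df["cum_deaths_obs"]).abs() / df["cum_deaths_obs"] * 100
    return ape.groupby(weeks_out).median()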
Results Across models released in June, the median absolute percent error (MAPE) on cumulative deaths rose with increasing weeks of extrapolation, from 2.3% at one week to 32.6% at ten weeks. Globally, ten-week MAPE values were lowest for IHME-MS-SEIR (20.3%) and YYG (22.1%). Across models, MAPE at six weeks was highest in Sub-Saharan Africa (55.6%) and lowest in high-income countries (7.7%). The median absolute error (MAE) in peak timing also rose with increasing weeks of extrapolation, from 14 days at one week to 30 days at eight weeks. Peak timing MAE at eight weeks ranged from 24 days for the IHME Curve Fit model to 48 days for LANL.
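Peak-timing error can be computed in the same spirit. The sketch below is an illustration only; the column names (location, date, daily_deaths) and the assumption of a single forecast release are ours, not the authors'.

import pandas as pd

def peak_timing_mae_days(daily_pred: pd.DataFrame, daily_obs: pd.DataFrame) -> float:
    """Median absolute error, in days, between the predicted and observed dates of
    peak daily deaths across locations with a clear peak. Columns are illustrative."""
    # Date of predicted peak daily deaths in each location.
    pred_peak = (daily_pred.loc[daily_pred.groupby("location")["daily_deaths"].idxmax(),
                                ["location", "date"]]
                 .rename(columns={"date": "pred_peak"}))
    # Date of observed peak daily deaths in each location.
    obs_peak = (daily_obs.loc[daily_obs.groupby("location")["daily_deaths"].idxmax(),
                              ["location", "date"]]
                .rename(columns={"date": "obs_peak"}))
    merged = pred_peak.merge(obs_peak, on="location")
    return (merged["pred_peak"] - merged["obs_peak"]).dt.days.abs().median()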
Interpretation Five of the models, from IHME, YYG, Delphi, SIKJalpha, and LANL, had MAPE below 20% at six weeks. Despite the complexities of modelling human behavioural responses and government interventions related to COVID-19, predictions among these better-performing models were surprisingly accurate. Forecasts and alternative scenarios can be a useful input for decision-makers, although users should be aware that errors grow with longer extrapolation and that uncertainty intervals should widen correspondingly further into the future. The framework and publicly available codebase presented here can be used routinely to evaluate the performance of all publicly released models that meet the inclusion criteria and to compare current model predictions.
Competing Interest Statement
The authors have declared no competing interest.
Funding Statement
This work was primarily supported by the Bill & Melinda Gates Foundation. J.F. received support from the UCLA Medical Scientist Training program (NIH NIGMS training grant GM008042).
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
The details of the IRB/oversight body that provided approval or exemption for the research described are given below:
This research was deemed exempt from review by the University of Washington Institutional Review Board.
All necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable.
Yes