Abstract
The world is dealing with one of the worst pandemics ever. SARS-CoV-2, the etiological agent of COVID-19, has already spread to more than 200 countries. However, infectivity, severity, and mortality rates differ between countries. Here we consider 140 HLA alleles and extensively investigate the landscape of 3,723 potential HLA-I A- and B-restricted SARS-CoV-2-derived antigens, and how 37 countries are predicted to respond to those peptides given their HLA-I frequency distributions. Clustering of HLA-A and HLA-B allele frequencies partially separates most of the countries with the lowest number of deaths per million inhabitants from the other countries. We further correlated the patterns of in silico predicted population coverage with epidemiological data. The number of deaths per million inhabitants correlates with the predicted antigen coverage of S- and N-derived peptides, and the magnitude of this correlation depends on whether frequent or rare HLA alleles are analyzed in a given population. Moreover, we highlight a potential risk group carrying HLA alleles associated with an elevated number of deaths per million inhabitants. In addition, we identified 3 potential antigens bearing at least one amino acid of the four-residue insertion that differentiates SARS-CoV-2 from previous coronavirus strains. We believe these data can contribute to the search for peptides with potential use in vaccine strategies, considering the role of herd immunity in hampering the spread of the disease. Importantly, to the best of our knowledge, this work is the first to use a population-level approach in association with COVID-19 outcome.
Importance
The emergence of the COVID-19 outbreak, caused by a novel coronavirus, creates an urgent need to better understand how our immune system responds to this new virus and to develop ways to control its spread. Our results suggest why some countries show a higher number of deaths due to COVID-19 while others do not. SARS-CoV-2 expresses 10 proteins that could be used as targets for vaccine development. Using a comprehensive bioinformatic screening of potential epitopes derived from the SARS-CoV-2 sequences, we identified potential antigens for 148 HLA-I alleles distributed worldwide. These peptides are likely to have a high affinity for HLA class I molecules and may induce critical immune responses. Our results suggest that differences in coverage for S- and N-derived peptides are associated with deaths related to COVID-19 in distinct populations. Of note, frequent and rare HLA alleles influence the effects we observed. We explored these associations for potential antigens derived from each viral protein to enumerate a set of protective HLA alleles. Moreover, we explored the novel insertion in the SARS-CoV-2 S protein sequence to map 3 potential antigens bearing this new region, as well as a set of peptides presented by those protective HLA alleles that are of interest for vaccine strategies. Finally, we propose that vaccine development strategies should consider the inverse relationships of proteins S and N with the number of deaths.
Competing Interest Statement
The authors have declared no competing interest.
Funding Statement
No funding is associated with this study.
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
The details of the IRB/oversight body that provided approval or exemption for the research described are given below:
The IRB recognizes that the analysis of de-identified, publicly available data does not constitute human subjects research as defined in federal regulations, and that it does not require IRB review.
All necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable.
Yes
Footnotes
Results were filtered by TAP transport and proteasome cleavage. We added an analysis using a cumulative allelic frequency of 0.9. Some alleles were described as specialists or generalists for some viral proteins. Finally, we proposed some peptides derived from the S protein that would be of interest for vaccine development, taking our findings into account.
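To make the cumulative allelic frequency cutoff and the notion of predicted population coverage more concrete, the following is a minimal sketch in Python and not the authors' pipeline. It assumes Hardy-Weinberg proportions within a locus and independence between the HLA-A and HLA-B loci, which is a common simplification. The allele frequencies, the set of "binding" alleles, and all function names are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Set

# Hypothetical allele frequencies for one population (illustrative values only).
HLA_A_FREQS: Dict[str, float] = {
    "A*02:01": 0.50,
    "A*01:01": 0.30,
    "A*03:01": 0.15,
    "A*24:02": 0.05,
}


def top_alleles_by_cumulative_frequency(
    freqs: Dict[str, float], threshold: float = 0.9
) -> List[str]:
    """Return the most frequent alleles, adding alleles in descending
    frequency order until their cumulative frequency reaches `threshold`."""
    selected: List[str] = []
    total = 0.0
    for allele, freq in sorted(freqs.items(), key=lambda kv: kv[1], reverse=True):
        if total >= threshold:
            break
        selected.append(allele)
        total += freq
    return selected


def locus_coverage(freqs: Dict[str, float], binders: Set[str]) -> float:
    """Fraction of individuals carrying at least one allele from `binders`
    at a single locus, assuming Hardy-Weinberg proportions (assumption)."""
    p = sum(freq for allele, freq in freqs.items() if allele in binders)
    return 1.0 - (1.0 - p) ** 2


def combined_coverage(cov_a: float, cov_b: float) -> float:
    """Coverage across two loci, assuming the loci are independent
    (assumption; linkage disequilibrium is ignored)."""
    return 1.0 - (1.0 - cov_a) * (1.0 - cov_b)


if __name__ == "__main__":
    frequent = top_alleles_by_cumulative_frequency(HLA_A_FREQS, threshold=0.9)
    print("Alleles within the 0.9 cumulative frequency cutoff:", frequent)
    # Suppose a given peptide is predicted to bind only A*02:01 (hypothetical).
    print("HLA-A coverage:", locus_coverage(HLA_A_FREQS, {"A*02:01"}))
```

Under these assumptions, a per-country coverage value computed this way for S- or N-derived peptides could then be compared against deaths per million inhabitants, for instance with a rank correlation.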
Data Availability
Data will be available after final publication.