Human Leukocyte Antigen (HLA) Class I Susceptible Alleles Against COVID-19 Increase Both Infection and Severity Rate

Introduction: Differences between countries in the severity rate of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infection may be explained by differences in human leukocyte antigen (HLA) class I molecules, which affect the reactivity of cytotoxic T lymphocytes (CTL). Methods: To clarify the relationship between HLA class I and the severity rate, the binding repertoires of each HLA class I allele to SARS-CoV-2 peptides and the frequencies of HLA-A alleles, HLA-B alleles, and HLA-A/B haplotypes in each country were taken from published data. Results: The frequency of HLA-A1 and the number of deaths per million population (severity rate) in each country showed an exponential approximation correlation, with correlation coefficient R=0.4879. In addition, the correlation between the infected cases per million (infection rate) and the severity rate was linearly approximated, with R=0.7422. The frequency of weak HLA-A alleles, those with a repertoire of under 300, also showed an exponential approximation correlation with the severity rate (R=0.5972) and a linear correlation with the infection rate (R=0.6808). The frequency of weak HLA-B alleles, those with 30 repertoires or less, had no correlation with the severity rate (R=-0.1530). The weak HLA-A/B haplotype has a stronger effect on the severity rate than weak HLA-A alone. Therefore, a simple HLA class I susceptibility index was calculated, and it showed a strong exponential approximation correlation (R=0.7388) with the severity rate. Conclusions: HLA class I susceptible alleles against COVID-19 increase both the infection rate and the severity rate. Weak HLA-A is a major factor in the severity rate, whereas weak HLA-B alone has no correlation with it. However, the weak HLA-A/B haplotype has a stronger effect on the severity rate than weak HLA-A alone.


Introduction
In viral infections, human leukocyte antigen (HLA) class I plays an important role in severity and progression. For severe acute respiratory syndrome coronavirus 1 (SARS-CoV-1), Taiwanese researchers have shown that HLA-B46 is associated with severe symptoms [1]. For human immunodeficiency virus (HIV) epitopes, HLA class I molecules differ in the cytotoxic T lymphocyte (CTL) reactivity they elicit, and HLA types with low reactivity are known to carry a high risk of severe acquired immunodeficiency syndrome (AIDS) [2]. Class I molecules, such as HLA-A, -B, and -C, are involved in the aggravation of viral infections through their role in presenting viral antigens to CTL in cell-mediated immunity: HLA class I binds antigenic peptides (viral epitopes), presents them, and activates CTL. The number of fragmented viral epitopes that can be bound (the binding repertoire) differs greatly depending on the virus and the HLA type, and it is a factor that determines the immunocompetence of CTL. The author speculated that differences in HLA class I molecules may explain the high severity of SARS-CoV-2 infection in Europe and the United States and its low severity in East Asia.

Materials And Methods
It is difficult to determine the rate of serious cases among people infected with SARS-CoV-2 in each country. As a working hypothesis, in a country where infection has spread to some extent, the number of deaths per million population is considered to correlate well with the rate of serious cases (severity rate).
Forty-three countries in the northern hemisphere that ranked within the top 50 of the world's nominal gross domestic product (GDP) in 2018 were targeted, to unify the natural environmental conditions of each country. Using data as of May 14, 2020, the author calculated the number of deaths and infected cases per million population from the latest population statistics of each country. Countries with fewer than 100 infected cases per million population (Vietnam, Taiwan, Nigeria, Thailand, India, China) were excluded, on the judgment that infection had not yet spread sufficiently in those countries.
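The per-million rates and the exclusion rule described above can be sketched as follows. This is a minimal illustration, not the author's procedure: the class name, function names, and the two sample countries are hypothetical, and only the threshold of 100 infected cases per million comes from the text.

```python
from dataclasses import dataclass


@dataclass
class CountryCounts:
    name: str
    population: int  # total population
    cases: int       # cumulative confirmed cases
    deaths: int      # cumulative deaths


def per_million(count: int, population: int) -> float:
    """Scale a raw count to a rate per million population."""
    return count / population * 1_000_000


def eligible(countries, min_cases_per_million=100.0):
    """Drop countries under the infection threshold; return (name, infection rate, severity rate)."""
    rows = []
    for c in countries:
        infection_rate = per_million(c.cases, c.population)
        if infection_rate < min_cases_per_million:
            continue  # infection judged not to have spread far enough
        rows.append((c.name, infection_rate, per_million(c.deaths, c.population)))
    return rows


# Hypothetical figures, for illustration only (not data from the study).
sample = [
    CountryCounts("Country X", 10_000_000, 25_000, 1_200),
    CountryCounts("Country Y", 50_000_000, 3_000, 40),
]
rows = eligible(sample)
for name, infection_rate, severity_rate in rows:
    print(f"{name}: {infection_rate:.1f} cases and {severity_rate:.1f} deaths per million")
```

Here the second sample country falls under the threshold and is dropped, mirroring the exclusion of the six low-incidence countries.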
Comparing the frequencies of HLA class I molecules in Europe, North America, and East Asia, the biggest difference was in the HLA-A1 antigen. The frequency of HLA-A1 in each country was taken from the "Allele Frequency Net Database" (http://www.allelefrequencies.net/hla.asp). The frequency of the HLA-A*01 allele was examined; for many countries, frequencies from multiple studies were published, and the study with the largest and most representative sample was used as the representative value for the country. For the United States, frequencies were listed by race, and the value for Caucasians, the largest population with the largest number of samples, was used as the representative value. For three countries (Canada, Denmark, Egypt), no frequency of the HLA-A*01 allele was available. Excluding them, a total of 34 countries remained (Table 1).
To clarify the relationship between HLA class I and the severity rate, the author downloaded raw data from a theoretical calculation paper on the binding repertoires of each HLA class I allele to SARS-CoV-2 peptides [3]. For the alleles with a low number of binding repertoires in HLA-A, HLA-B, and the HLA-A/B haplotypes, the correlation between allele frequencies and severity rates in each country was investigated.

Results
The correlation between the frequency of HLA-A1 and the number of deaths per million population was moderate and significant under an exponential approximation (R=0.4879, P=0.0034; Figure 1).

FIGURE 1: Correlation between the A*01 allele and severity rate
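The text does not state how the exponential approximation and its R were computed. A common approach, assumed here, is ordinary least squares on the natural logarithm of the deaths per million, with R taken as the Pearson correlation between the allele frequency and that logarithm. The function names and sample values below are hypothetical and are not the study's data.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def fit_exponential(xs, ys):
    """Fit y = a * exp(b * x) by least squares on ln(y); return (a, b, r)."""
    log_ys = [math.log(y) for y in ys]  # requires every y > 0
    n = len(xs)
    mx, my = sum(xs) / n, sum(log_ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (ly - my) for x, ly in zip(xs, log_ys))
    b = sxy / sxx
    a = math.exp(my - b * mx)
    return a, b, pearson_r(xs, log_ys)


# Hypothetical allele frequencies (%) and deaths per million that follow
# an exact exponential, so the fit should recover a = 3.0 and b = 0.1.
freqs = [5.0, 10.0, 15.0, 20.0]
deaths = [3.0 * math.exp(0.1 * x) for x in freqs]
a, b, r = fit_exponential(freqs, deaths)
print(f"a={a:.4f}, b={b:.4f}, r={r:.4f}")
```

Fitting on the log scale requires every country to have a nonzero death count, and it weights relative rather than absolute errors.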
In addition, the correlation between the infected cases per million (infection rate) and the deaths per million (severity rate) was linearly approximated, showing a strong positive correlation with R=0.7422 (Figure 2): the higher the infection rate, the higher the severity rate.

FIGURE 2: Correlation between infected cases and deaths per million population
The number of binding repertoires of HLA-A1 is 183 for HLA-A*01:01 and 126 for HLA-A*01:02, whereas the average over all HLA-A alleles is 498, so the binding repertoire of HLA-A1 is small. The frequencies of HLA-A alleles with a binding repertoire under 300 were summed for each country, and their correlation with deaths per million population was calculated. For nine countries (Sweden, Netherlands, Turkey, Norway, Israel, Singapore, Malaysia, Pakistan, Bangladesh), these HLA-A allele frequencies were also not available. Excluding them, a total of 25 countries remained (Table 2).

TABLE 2: Frequencies of HLA-A alleles on the number of repertoires under 300 of SARS-CoV-2 peptides
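The per-country total behind this table can be sketched as a filtered sum. The repertoire counts for A*01:01 (183), A*01:02 (126), and A*24:02 (329) are the values quoted in the text; the allele frequencies and the function name are hypothetical, chosen only to show the calculation.

```python
WEAK_HLA_A_THRESHOLD = 300  # alleles with fewer binding repertoires count as weak


def weak_allele_frequency(repertoires, frequencies, threshold=WEAK_HLA_A_THRESHOLD):
    """Sum the frequencies (%) of alleles whose binding repertoire is under the threshold."""
    total = 0.0
    for allele, freq in frequencies.items():
        # Alleles without a known repertoire count are not treated as weak.
        if allele in repertoires and repertoires[allele] < threshold:
            total += freq
    return total


# Repertoire counts quoted in the text.
repertoires = {"A*01:01": 183, "A*01:02": 126, "A*24:02": 329}

# Hypothetical allele frequencies (%) for one country, for illustration only.
frequencies = {"A*01:01": 15.0, "A*01:02": 0.5, "A*24:02": 10.0}

weak_total = weak_allele_frequency(repertoires, frequencies)
print(f"Weak HLA-A frequency: {weak_total:.2f}%")
```

The same function would cover the HLA-B analysis by passing a threshold of 31, since that cut-off is stated as 30 repertoires or less.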
The combined frequency of HLA-A alleles with a repertoire under 300 and the deaths per million population showed an exponential approximation correlation, with R=0.5972 (P=0.001619), higher than that for HLA-A1 alone (Figure 3). The severity rate thus rises exponentially with the frequency of alleles that have few repertoires binding to SARS-CoV-2 viral peptides. A correlation was also observed for the frequency of HLA-A alleles with a binding repertoire under 200, but the correlation coefficient was better with the threshold of 300.

FIGURE 3: Correlation between total allele frequencies under 300 repertoires of SARS-CoV-2 peptides and deaths per million population
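The text does not name the test behind the P values quoted with each correlation coefficient. A standard choice, assumed here, is the two-sided t-test on a Pearson coefficient with n - 2 degrees of freedom, where n is the number of countries. The sketch below uses only the standard library and integrates the t density numerically; the function names are illustrative.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)


def two_sided_p(r, n, steps=2000):
    """Two-sided P value for a correlation coefficient r observed on n points."""
    df = n - 2
    t = abs(r) * math.sqrt(df / (1 - r * r))
    # Simpson's rule for P(0 < T < t); steps must be even.
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central = total * h / 3
    return 1 - 2 * central


# (R, number of countries) pairs quoted in the text, with the reported P values.
reported = [(0.4879, 34, 0.0034), (0.5972, 25, 0.001619), (0.7388, 11, 0.009396)]
for r, n, p_reported in reported:
    print(f"R={r}, n={n}: computed P={two_sided_p(r, n):.6f}, reported P={p_reported}")
```

Under this assumption the computed values can be compared directly with the reported ones, which gives a quick consistency check on the stated R, P, and country counts.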
For the frequency of HLA-A alleles with a repertoire under 300 and the infected cases per million population, a linear approximation was used, and the correlation coefficient (R=0.6808) was higher than that for deaths per million population (Figure 4). The infection rate thus correlates linearly with the frequency of HLA-A alleles that have fewer than 300 repertoires binding to SARS-CoV-2 viral peptides.

FIGURE 4: Correlation between total allele frequencies under 300 repertoires of SARS-CoV-2 peptides and infected cases per million population
The number of binding repertoires of HLA-B for SARS-CoV-2 peptides was low, with an average of 209 and a median of 125. There were 17 alleles with 30 repertoires or less, and the allele with the fewest was B*46:01, with three repertoires. The total frequency of these 17 alleles was examined for correlation with the number of deaths per million population (Table 3).

TABLE 3: Frequencies of HLA-B alleles on the number of repertoires of 30 or less SARS-CoV-2 peptides
If the HLA-A allele alone were associated with the severity rate in HLA class I, it could not explain the fact that Africans had a high severity rate in the United States and the United Kingdom. The frequency for the United States in this study is represented by Caucasians, the largest population, for whom the total frequency of HLA-A alleles under 300 was 26.66%; the frequency for Africans was lower, at 21.76%. As a hypothesis, the HLA-A/B haplotype in which both alleles have few binding repertoires is considered to play a more important role in the severity rate. The frequencies of haplotypes combining HLA-A alleles (under 300) and HLA-B alleles (30 or less) in Caucasians and Africans were 0.788% and 7.41%, respectively (Table 4 and Table 5).

TABLE 5: Frequencies of haplotypes of HLA-A alleles (under 300) and HLA-B alleles (30 or less) of Africans in the United States
The frequency of these haplotypes in Africans was 9.4 times that in Caucasians, so these haplotypes may act synergistically and more strongly on severity than HLA-A under 300 alone.
In fact, each weak HLA-A allele has its own susceptibility index, and so would each weak HLA-A/B haplotype. It is assumed that the combined value of those indices correlates best with the severity rate. Here, the total frequency of HLA-A alleles with a repertoire number under 300 is the first index, and the frequency of the weak HLA-A/B haplotype, composed of HLA-A under 300 and HLA-B of 30 or less, is the second index. Assuming that the weak haplotypes act synergistically, the HLA class I susceptibility index was defined as the first index plus the second index multiplied by a coefficient.
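One way such a coefficient could be chosen is to scan candidate values and keep the one that maximizes the correlation between the index and the logarithm of deaths per million. The sketch below assumes the index is the first index plus the coefficient times the second index; the data are synthetic, constructed so that the best coefficient is known in advance, and all names are hypothetical.

```python
import math


def susceptibility_index(weak_a, weak_hap, coefficient):
    """First index (weak HLA-A %) plus coefficient times second index (weak haplotype %)."""
    return weak_a + coefficient * weak_hap


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def best_coefficient(weak_a, weak_hap, deaths, candidates):
    """Return (coefficient, r) with the highest correlation against ln(deaths)."""
    log_deaths = [math.log(d) for d in deaths]
    best = None
    for k in candidates:
        index = [susceptibility_index(a, h, k) for a, h in zip(weak_a, weak_hap)]
        r = pearson_r(index, log_deaths)
        if best is None or r > best[1]:
            best = (k, r)
    return best


# Synthetic data (not from the study): deaths follow an exact exponential
# in an index built with coefficient 1.2, so the scan should return 1.2.
weak_a = [20.0, 25.0, 30.0, 22.0, 28.0]
weak_hap = [1.0, 5.0, 2.0, 8.0, 3.0]
deaths = [math.exp(0.2 * (a + 1.2 * h)) for a, h in zip(weak_a, weak_hap)]
candidates = [i / 10 for i in range(0, 31)]  # 0.0, 0.1, ..., 3.0
k, r = best_coefficient(weak_a, weak_hap, deaths, candidates)
print(f"best coefficient={k}, r={r:.6f}")
```

With a small number of countries, a coefficient tuned this way is fitted to the same data used to report the correlation, so the resulting R is likely to be optimistic.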
The weak haplotype frequencies were obtained for 11 countries (Table 6). Italy had an outstandingly high weak HLA-A/B haplotype frequency of 15.38%. The coefficient of the HLA class I susceptibility index that gave the highest correlation coefficient was 1.2. The HLA class I susceptibility index of each country and the number of deaths per million population showed a strong exponential approximation correlation, with R=0.7388 (P=0.009396), and the exponential regression equation $y = 0.1828e^{0.2057x}$ was obtained (Figure 5). When the estimated number of deaths per million population in the United States was calculated from this regression equation, the estimate for Africans was 1.87 times that for Caucasians. HLA class I was thus involved as one of the causes of the high severity rate of Africans in the United States.
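The comparison between Africans and Caucasians in the United States can be reproduced from the figures given in the text. The sketch assumes the index is the weak HLA-A frequency plus 1.2 times the weak haplotype frequency, a reading of the definition that yields a ratio of about 1.87; the function names are illustrative.

```python
import math

# Regression equation from the text: y = 0.1828 * exp(0.2057 * x)
REGRESSION_A = 0.1828
REGRESSION_B = 0.2057
HAPLOTYPE_COEFFICIENT = 1.2  # coefficient reported as giving the highest correlation


def susceptibility_index(weak_a, weak_hap, coefficient=HAPLOTYPE_COEFFICIENT):
    """Assumed form: weak HLA-A frequency plus coefficient times weak haplotype frequency."""
    return weak_a + coefficient * weak_hap


def predicted_deaths_per_million(index):
    """Deaths per million predicted by the exponential regression equation."""
    return REGRESSION_A * math.exp(REGRESSION_B * index)


# Frequencies (%) quoted in the text for the United States.
caucasian_index = susceptibility_index(26.66, 0.788)
african_index = susceptibility_index(21.76, 7.41)

haplotype_ratio = 7.41 / 0.788
death_ratio = (predicted_deaths_per_million(african_index)
               / predicted_deaths_per_million(caucasian_index))

print(f"Haplotype frequency ratio: {haplotype_ratio:.1f}")
print(f"Index: Caucasian {caucasian_index:.4f}, African {african_index:.4f}")
print(f"Predicted death ratio: {death_ratio:.2f}")
```

Because the regression is exponential, the ratio depends only on the difference between the two indices, not on the constant 0.1828.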

Discussion
Viruses are pathogens that enter cells and proliferate, and CTL, which recognizes fragmented viral epitopes, plays an important role in disease regulation. In HLA class I, HLA-A alleles were the major susceptibility and severity factor as compared to -B in this study. The average number of repertoires in HLA-A against SARS-CoV-2 peptides is 2.5 times higher than that of -B, so the role of CTL activity is correspondingly large, which may be a major reason.
In a report on SARS-CoV-1 in Taiwan [1], HLA-B46:01 was associated with a large number of infected people and with significantly more severely ill patients (ventilator-wearers and fatalities). HLA-B46:01 is also the allele with the lowest number of binding repertoires for SARS-CoV-2 [3], but HLA-B alone was not directly correlated with severity in this study. In the report from Taiwan, HLA-B46:01 accounted for 16 (21.6%) of the 74 alleles in the 37 cases of SARS-CoV-1 infection and suspected infection, whereas the baseline frequency of HLA-B46:01 was 18.2%. Unlike the conclusion of that paper, no significant difference in infection rate was apparent. However, the allele frequency was significantly higher among the patients with severe disease (five of six). Looking at HLA-A in those patients, the frequency of the A24:02 allele, which has a relatively low repertoire number of 329 for SARS-CoV-2 peptides, was 21.6%. HLA-A24:02 was positive in four of the six patients with severe illness, and three of those four had a haplotype with HLA-B46:01. Since the frequency of this haplotype is 1.1%, this HLA-A/B haplotype is considered to be the one most associated with aggravation.
Italy had an outstandingly high weak HLA-A/B haplotype frequency of 15.38%. Italy announced that all-cause deaths in the country during March and April 2020 exceeded the average death toll of the same period over the past five years by around 47,000, and about 19,000 of these deaths, not counted as COVID-19 deaths, may have been caused by COVID-19. In this way, the haplotype with weak HLA-A/B affects the severity rate more strongly than weak HLA-A alone.
Environmental factors, such as the medical environment, living environment, and lifestyle, as well as political measures against infection, also increase the number of infected people. However, although these factors may increase the number of deaths by increasing the number of infected people, they are not factors that increase the severity rate among infected cases. The World Health Organization (WHO) has pointed out that individual factors that raise the severity rate include age of 65 years and older, living in a nursing home or long-term care facility, chronic lung disease, moderate to severe asthma, serious heart conditions, immunocompromised status, severe obesity, diabetes, chronic kidney disease undergoing dialysis, and liver disease. Most of these are attributed to increased ACE2 receptor expression and decreased innate or acquired immunity.
The significant correlation between weak HLA class I and the number of deaths per million population, among the many risk factors of severity, indicates that HLA class I is a major factor in the severity of COVID-19. Although acquired CTL immunity takes several days to develop, SARS-CoV-2 shares many epitopes with other coronaviruses, and, theoretically, there is cross-protective immunity [3]. For this reason, HLA class I would act through the acquired immunity of CTL from an early stage and suppress the onset of infection, which leads to a decrease in the infection rate. In this study, both the weak HLA-A frequency with the infection rate and the infection rate with the severity rate showed a positive linear correlation. As a result, the severity rate increased in an approximately exponential manner with the weak HLA class I frequency.

Limitations
The limitations of this study are as follows: 1) the data cover infection frequency and mortality only until May 14, 2020, which is relatively early in the COVID-19 epidemic; 2) no statistical consideration was given to the effects of national economic, social, cultural, and environmental factors or the level of medical care on the infection rate and severity in each country; 3) in multi-ethnic countries, only the representative racial group was examined, and racial composition was not weighted; 4) no consideration was given to biological risk factors other than HLA; 5) the biological mechanism of HLA against SARS-CoV-2 is not yet fully clear.
Since the data are from the early epidemic, before a treatment method for COVID-19 was established, the effect of treatment is considered to be small. The other limitations are very complex, and the role each plays in the disease is difficult to assess. Despite these limitations, the statistically significant correlations found in this study for HLA class I susceptible alleles against COVID-19 indicate that HLA is a major factor for the infection and severity rate of the disease.

Conclusions
When the number of binding repertoires of an HLA-A allele is small, the cytotoxic T cell activity is considered to be reduced as well. The frequency of weak HLA-A alleles, those with a low number of binding repertoires for SARS-CoV-2 peptides, had an exponential correlation with the severity rate. The reason given for the exponential approximation is that the weak HLA-A frequency showed a positive linear correlation with the infection rate, as did the infection rate with the severity rate. Weak alleles with a low number of binding repertoires in HLA-B alone do not correlate with the severity rate at all, but when they form a haplotype with weak HLA-A, the effect on the severity rate is stronger than that of weak HLA-A alone.

Additional Information Disclosures
Human subjects: All authors have confirmed that this study did not involve human participants or tissue. Animal subjects: All authors have confirmed that this study did not involve animal subjects or tissue.

Conflicts of interest:
In compliance with the ICMJE uniform disclosure form, all authors declare the following: Payment/services info: All authors have declared that no financial support was received from any organization for the submitted work. Financial relationships: All authors have declared that they have no financial relationships at present or within the previous three years with any organizations that might have an interest in the submitted work. Other relationships: All authors have declared that there are no other relationships or activities that could appear to have influenced the submitted work.