Name | Genome | Consortium | Super Population | Population Description | Population Origin | Case Population Size | Control Population Size | Comorbidities | Mean Median Age | Sex | Severity | Sample Source | Method | Bioinformatics | Imputation Details | Limitations |
Asano | GRCh37 | COVID Human Genetic Effort (HGE) | EUR, AFR, AMR, SAS, | Individuals from seven countries | Not specified | 1202 | 331 |
Not specified | 52.9 | all male | Critical | Nasopharyngeal and Clinical characteristics | Genomic DNA extracted for WGS and WES. For WES, libraries were generated with the Twist Human Core Exome Kit, the xGen Exome Research Panel IDT xGen, the Agilent SureSelect V7 kit or the SeqCap EZ MedExome kit from Roche, and the Nextera Flex for Enrichment-Exome kit (Illumina). Massively parallel sequencing was performed on the Illumina HiSeq4000 or NovaSeq6000 system. For WES analysis performed at CNAG Barcelona, Spain, capture was performed with the SeqCap EZ Human Exome Kit v3.0 (Roche Nimbl | GATK best-practice pipeline was used to analyze WES data. Reads were aligned to hg19 with the maximum exact matches algorithm in BWA. PCR duplicates were removed with Picard tools. The GATK base quality score recalibrator was applied to correct sequencing artifacts. Sample genotypes with a coverage < 8X, a genotype quality (GQ) < 20, or a minor read ratio (MRR) < 20% were filtered out. We filtered out variant sites (i) with a call rate <50% in gnomAD genomes and exomes, (ii) a non-PASS filter in the gnomAD database, (iii) falling in low-complexity or decoy regions, (iv) that were multi-allelic with more than four alleles, (v) with more than 20% missing genotypes in our cohort, and (vi) spanning more than 20 nucleotides. Variant effects were predicted with VEP and the Ensembl GRCh37.75 reference database. An enrichment analysis focusing on X chromosome genes without known inborn errors of TLR3- and IRF7-dependent type I IFN immunity and without neutralizing auto-Abs against type I IF | Not specified | Not specified |
Asselta | GRCh37 | None Specified | EUR | Italian | Southern Europe | 7268 | 84450 |
Not specified | Not specified | Not specified | NA | NA | Whole exome sequencing (WES) and genome-wide microarray genotyping data analysis | Expression: GTEx repository and GEO database
Burden analysis: LRT score, MutationTaster, PolyPhen-2 HumDiv, PolyPhen-2 HumVar, and SIFT | Michigan Imputation server
Reference panel: 1000G Phase 3 v5
Phasing: ShapeIT v2.r790
Software: Minimac3
Filters: r2>0.3 | Analysed genetic data for only two candidate genes
No confounding issues addressed
This was an exploratory study that did not include any genetic data from patients infected with SARS-CoV-2, therefore there is a need for experimental data. |
Baldassarri | not specified | GEN-COVID Multicenter Study | EUR | Cases and controls were drawn from the Italian GEN-COVID cohort of 1178, cases were selected according to the following inclusion criteria: i. CPAP/biPAP ventilation (230 subjects); ii. endotracheal intubation (108 subjects). As controls, 300 subjects were selected using the sole criterion of not requiring hospitalization. Exclusion criteria for both cases and controls were i. SARS-CoV-2 infection not confirmed by PCR; ii. non-caucasian ethnicity. The Spanish cohort, composed of male COVID-19 pa | Italy, Spain | 1295 | 341 |
cardiac, endocrine, neurological, neoplastic | 52-68 years | predominantly male | Severe | Nasopharyngeal swab | The HUMARA assay was used to establish allele sizes of the polymorphic triplet in the AR locus performed using a fluorescent PCR followed by capillary electrophoresis on an ABI3130 sequencer. Allele size was established using the Genescan Analysis software. WES data had already been obtained from the previous GEN-COVID study. Serum and plasma total Testosterone, SHBG levels in plasma and serum LH were measured following standard procedures | Variant calling was performed according to the GATK4 best practice guidelines, using BWA for mapping, and ANNOVAR for annotating. A LASSO logistic regression was performed on the poly-amino acid repeats. The data pre-processing was coded in Python, whereas for the logistic regression model the scikit-learn module with the liblinear coordinate descent optimization algorithm was used. | Not Specified | Not Specified |
Cantalupo | GRCh38 | Universities, Hospitals and Institutes in Italy | EUR | GWAS data obtained from hospitalised cohort of the COVID-19 HGI. The first replication study was performed on genetic data from the 23andMe study hospitalised cohort. The second replication study was performed on genetic data from a selected Italian hospitalised cohort. WES data was obtained from a hospitalised cohort in Southern Italy. | Italy | 6406 | 902088 |
not specified | not specified | not specified | Severe | NA | Summary statistics from GWAS meta-analysis of severe COVID-19 cases (COVID-19 HGI) were used to analyse the 3p21.31 region previously associated with severe COVID-19. 52 selected SNPs with suggestive statistical significance and that were eQTLs for CCR5 in lung were assessed. The three SNPs with the highest eQTLs p values in lung were investigated further. To validate the association between rs35951367 and a severe form of COVID-19 disease, we use two independent cohorts of cases and controls. Th | Allele and genotype frequencies were obtained from gnomAD; eQTL analysis was performed using public data from the Genotype-Tissue Expression (GTEx) Portal. CCR5 gene expression was obtained from NCBI Gene Expression Omnibus (GEO) Database and plotted using R2: Genomics Analysis and Visualization Platform. DUET was used to evaluate the effect of missense variants on CCR5 protein. Sequence reads were aligned and mapped using BWA-mem (V0.7.17) and SAMTools (V1.8). Duplicate reads were removed with Picard (V2.18.9). SNVs and small indels were detected using the GATK HaplotypeCaller with functional annotation of variants using ANNOVAR. Off-target variants and SNPs were excluded with allele frequencies greater than 1% in non-Finnish European populations of the 1000 Genomes Project, ExAC (v3) and gnomAD (v2.1.1) databases. To remove possible false positives, variants falling in genomic duplicated regions were also excluded. The set of exonic variants was filtered to remove synonymous SNVs. For t | NA | Not specified |
Chamnanphon | GRCh37 | Universities and Hospitals in Thailand | EAS | Samples were obtained from both biobank samples and cases admitted to King Chulalongkorn University and King Chulalongkorn Memorial Hospital, Bangkok, Thailand between February 2020 and March 2021. Clinical characteristics and data were assessed and cases were classified into four conditions: mild, moderate, severe and critical | Thailand | 212 | 36 |
Diabetes, Dyslipidemia, Chronic kidney disease, Cardiovascular disease, lung and liver disease, cancer, immunocompromised system | 44 | 91 males 157 females | Combination | Nasopharyngeal swab | Genomic DNA was analysed with the Axiom™ Human Genotyping SARS-COV-2 Array (Thermo Fisher Scientific) | Genotype calling from intensity data file was performed with Axiom Analysis Suite (AxAS) version 5.1.1 software using default parameters. Quality control (QC) and PCA was carried out following Ricopili pipeline (Lam et al 2020). SNPs were pruned to minimize LD between SNPs with the criteria of R2 < 0.2, and the number of SNPs in the window for pruning was 200 until there were fewer than 100,000 SNPs. SAIGE was applied for association analysis in GWAS using a logistic mixed-effects model. LD blocks using LDBlockShow were obtained, and genes residing in the blocks which contained SNPs with statistical significance were acquired using the University of California, Santa Cruz (UCSC) Genome Browser. | Genotype imputation was done for chromosomes 1–22 using Michigan Imputation Server and the reference panel used was Genome Asia Pilot (GAsP) | The limitation of this study was the relatively small sample size |
Chen | GRCh37 | Vanderbilt University Medical Center biobank | EUR AFR | European and African American ancestry | Not specified | 10599 | 74638 |
Obesity, COPD, diabetes, CVD, liver and renal disease, asthma, dyslipidemia and hypertension | 52.1 EUR 37.2 AFR | 51.1% male EUR, 45.9% male AFR | NA | NA | Illumina Expanded Multi-Ethnic Genotyping Array (MEGAEX) was used for genotyping. Replication was verified using two independent datasets from UK Biobank (UKB) and non-overlapping Vanderbilt University biobank (BioUV) data. Top findings in the study were analysed in the COVID-19 Host Genetics Initiative (COVID-19 HGI) meta-analysis summary statistics from the July 2, 2020 release. | SNPs with an imputation info score greater than 0.4 and minor allele frequency (MAF) greater than 1% were used for further GWAS and GReX imputation. More stringent cut-offs were used for AFR individuals because of the smaller sample size. SUGEN was used to remove known family relatedness and PRIMUS was used to reconstruct non-directional family networks, with ERSA verifying families with more than 5 members. LD patterns were analysed in HaploView V. 4.2. | Genetic imputation in MEGAEX-genotyped subjects was conducted with minimac4 on the Michigan Imputation Server with a reference panel of Haplotype Reference. | Phenotypes were extracted from electronic health records which can affect classification of cases. Pneumonia cases were based on clinical evidence and not lab based testing |
COVID-19-HGI | GRCh38 | COVID-19 Host Genetics Initiative | MID,S/EAS,AFR,AMR,EU | (1) Critically ill COVID-19 cases defined as patients who were hospitalized due to symptoms associated with laboratory-confirmed SARS-CoV-2 infection and who required respiratory support or whose cause of death was associated with COVID-19, (2) the hospitalized COVID-19 group included patients who were hospitalized due to symptoms associated with laboratory-confirmed SARS-CoV-2 infection, and (3) reported infection cases group included individuals with laboratory-confirmed SARS-CoV-2 infection o | Study dependent | 13641 | 2070709 |
not specified-study dependent | 55.3 years mean | not specified-study dependent | Severe | Nasopharyngeal swab / Whole blood | Case-control meta-analyses in three main categories of COVID-19 disease according to predefined and partially overlapping phenotypic criteria. Each individual study that contributed data to a particular analysis met a minimum threshold of 50 cases for statistical robustness. | Each contributing study genotyped the samples and performed quality controls, data imputation and analysis independently, but following consortium recommendations. GWAS analysis was run using Scalable and Accurate Implementation of GEneralized mixed model (SAIGE) on chromosomes 1-22 and X or PLINK. Study-specific summary statistics were then processed for meta-analysis. Potential false positives, inflation, and deflation were examined for each submitted GWAS. Standard error values as a function of effective sample size were used to find studies which deviated from the expected trend. Summary statistics passing this manual quality control were included in the meta-analysis. Variants with allele frequency of >0.1% and imputation INFO>0.6 were carried forward from each study. Variants and alleles were lifted over to genome build GRCh38, if needed, and harmonized to gnomAD 3.0 genomes by finding matching variants by strand flipping or switching ordering of alleles. If multiple matching v | For genotype imputation, participants were advised to use their own reference panel, existing imputation panels, or the TOPMed or Michigan imputation server when possible. | Because different studies participated, those enriched with severe cases or with antibody-tested controls may disproportionately contribute to genetic discovery despite potentially smaller sample sizes.
The differences in genomic profiling technology, imputation, and sample size across the constituent studies can have dramatic impacts on replication and downstream analyses (particularly fine-mapping where differential missing patterns in the reported results can muddy the signal). |
COVID-19-HGI | GRCh38 | COVID-19 Host Genetics Initiative | MID,S/EAS,AFR,AMR,EU | (1) Critically ill COVID-19 cases defined as patients who were hospitalized due to symptoms associated with laboratory-confirmed SARS-CoV-2 infection and who required respiratory support or whose cause of death was associated with COVID-19, (2) the hospitalized COVID-19 group included patients who were hospitalized due to symptoms associated with laboratory-confirmed SARS-CoV-2 infection, and (3) reported infection cases group included individuals with laboratory-confirmed SARS-CoV-2 infection o | Study dependent | 49562 | 1770206 |
not specified-study dependent | 55.3 years mean | not specified-study dependent | Susceptibility | Nasopharyngeal swab / Whole blood | Case-control meta-analyses in three main categories of COVID-19 disease according to predefined and partially overlapping phenotypic criteria. Each individual study that contributed data to a particular analysis met a minimum threshold of 50 cases for statistical robustness. | Each contributing study genotyped the samples and performed quality controls, data imputation and analysis independently, but following consortium recommendations. GWAS analysis was run using Scalable and Accurate Implementation of GEneralized mixed model (SAIGE) on chromosomes 1-22 and X or PLINK. Study-specific summary statistics were then processed for meta-analysis. Potential false positives, inflation, and deflation were examined for each submitted GWAS. Standard error values as a function of effective sample size were used to find studies which deviated from the expected trend. Summary statistics passing this manual quality control were included in the meta-analysis. Variants with allele frequency of >0.1% and imputation INFO>0.6 were carried forward from each study. Variants and alleles were lifted over to genome build GRCh38, if needed, and harmonized to gnomAD 3.0 genomes by finding matching variants by strand flipping or switching ordering of alleles. If multiple matching v | For genotype imputation, participants were advised to use their own reference panel, existing imputation panels, or the TOPMed or Michigan imputation server when possible. | Because different studies participated, those enriched with severe cases or with antibody-tested controls may disproportionately contribute to genetic discovery despite potentially smaller sample sizes.
The differences in genomic profiling technology, imputation, and sample size across the constituent studies can have dramatic impacts on replication and downstream analyses (particularly fine-mapping where differential missing patterns in the reported results can muddy the signal). |
David | GRCh38 | Genetics Of Mortality In Critical Care (GenOMICC) and the International Severe Acute Respiratory Infection Consortium (ISARIC) Coronavirus Clinical Characterisation Consortium (4C) (ISARIC 4C) | EUR | Cohort participants were critically ill, hospitalized COVID-19 positive patients from 208 UK ICUs: 2109 patients were recruited as part of the GenOMICC project, and an additional 135 cases as part of the ISARIC 4C study. Participants of mixed ancestry were excluded. Ancestry-matched controls without a positive COVID-19 test were obtained from the UK Biobank population study; for validation, 45,875 unrelated individuals of European ancestry from the 100,000 Genomes Project were used as an alternative | United Kingdom | 2244 | 11220 |
19% in GenOMICC and 30% in ISARIC | 57.3 ± 12.1 in GenOMICC, 57.3 ± 2.9 in ISARIC | 30% female GenOMICC, 34% female ISARIC | Critical | Nasopharyngeal swab | DNA extraction, genotyping and quality control was previously described in Pairo-Castineira et al 2020 as part of the GenOMICC consortium. TMPRSS2 variants that are predicted to be loss of function, missense or inframe and indel in the database of population genetic variations gnomAD were extracted and evaluated. The rs12329760 is predicted damaging with a MAF of 0.25 in the human population and the relation between the variant and the critical COVID-19 cohort was assessed. The association was re | The association between the TMPRSS2 rs12329760 variant and COVID-19 severity was assessed using logistic regression. Logistic regression with additive and recessive models was performed in PLINK v1.9, adjusting for sex, age, mean-centred age-squared, top 10 principal components (PCA performed to adjust for population stratification) and deprivation index decile based on UK postcode. Genetic ancestry was inferred using ADMIXTURE and reference individuals from the 1000 Genomes project. Each major ancestry group alternative in the 100,000 Genomes control group was performed with mixed model association tests in SAIGE (v0.39), including age, sex, age-squared, age-sex interaction and the first 20 principal components as covariates. Trans-ethnic meta-analysis of GenOMICC data for different ancestries was performed by METAL using an inverse-variance weighted method and the P-value for heterogeneity was calculated with Cochran’s Q-test for heterogeneity implemented in the same software. Meta-an | Imputation was performed using the TOPMed reference panel | a. Because no cohort of asymptomatic or paucisymptomatic COVID-19 patients was available, the general population was used as the comparison group for severe COVID-19 cases. |
Ellinghaus | GRCh38 | The Severe COVID-19 GWAS group | EUR | Italian and Spanish patients with severe disease defined by respiratory support and hospitalization | Southern Europe | 1610 | 2205 |
49% hypertension, 18.1% diabetes, 10.4% coronary artery disease | 66.7 | male 1096 female 514 | Severe | Nasopharyngeal swab | Genotyping by Illumina Global Screening Array | PLINK 1.9 logistic-regression for imputation uncertainty. PCA was used for GWA tests with adjustments for population stratification, age and sex. A fixed effects meta-analysis was conducted using METAL. Bayesian fine-mapping performed for loci reaching GW significance. | SNP imputation using Michigan Imputation Server and 194,512 haplotypes generated by Trans-Omics for Precision Medicine (TOPMed) | Limited genotype-phenotype elaboration and therefore adjustment for potential sources of bias; limited information on SARS-CoV-2 infection status in control group; exclusion of genotype samples based on ethnicity. |
Fadista | GRCh38 | Not specified | EUR | COVID-19 HGI A2 cohort | Not specified | 4336 | 623902 |
None specified | Not specified | Not specified | Severe | Nasopharyngeal swab / Whole blood | Mendelian randomization (MR) study for IPF causality in COVID-19. Genetic variants associated with IPF susceptibility from previous GWAS were used as instrumental variables on COVID-19 severity from the GWAS meta-analysis by the COVID-19 HGI | Two-sample MR analysis was performed using the random-effects inverse-variance weighted method implemented in the R (version 3.6.1) package MendelianRandomization (version 0.5.0). | An allele frequency threshold of 0.001 and an imputation info score threshold of 0.6 were applied to each study before meta-analysis according to COVID-19 HGI protocol | 1) Variance explained by the use of non-MUC5B IPF genetic instruments, although within the range typical of complex traits
2) Selection bias may play a role in the protective effect found from rs35705950 as (a) a patient group that is heavily enriched for the rs35705950 T undertaking strict self-isolation and/or (b) due to survival bias of the rs35705950 non-IPF risk allele carriers.
3) Increased sample sizes, both from the IPF or COVID-19 GWAS could also have narrowed the confidence interval |
Fallerini | GRCh38 | GEN-COVID Multicenter Study | EUR | A subset of male COVID-19 patients was selected from the Italian GEN-COVID cohort of 1,178 SARS-CoV-2-infected participants (Daga et al., 2021). Cases were selected according to the following inclusion criteria: i. male gender; ii. young age (<60 years); iii endotracheal intubation or CPAP/biPAP ventilation. As controls, participants were selected using the sole criterion of being oligo-asymptomatic not requiring hospitalization. Cases and controls represented the extreme phenotypic presentation | Italy | 79 | 77 |
not specified | <60 years | Male only | Severe | TBD | A nested case control study was performed on a subset of male participants with extreme COVID-19 phenotypes. A LASSO logistic regression analysis, in which only rare variants were considered in a boolean representation, identified the TLR7 gene as important. By selecting for young males, rare (MAF < 1%) TLR7 missense variants predicted to impact on protein function (CADD scores) were discovered, and were present in none of the SARS-CoV-2-infected oligo-asymptomatic male participants. In order to functio | Principal component analysis (PCA) was applied prior to the LASSO logistic regression in order to remove samples that were clear outliers. A 10-fold cross-validation method was applied in order to test the performance of the LASSO logistic regression. To determine the significance of the association between TLR7 variants and COVID severity, Fisher’s Exact Test was used. P values < 0.05 were considered statistically significant. | NA | Not specified |
Freitas | GRCh38 | Universities, Hospitals and Institutes in Portugal | EUR | COVID-19 lab confirmed positive individuals from two hospitals (Santa Maria and Sao Joao hospital) | Portugal | 491 | 0 |
Hypertension (63.1%), diabetes (31.8%) and obesity (23.4%), Vitamin D deficient (63.3%), Vitamin D insufficient (24.4%), disclosure of other diseases included in supplementary data | 69.7 ± 15.8 | 217 female; 266 male | Severe | Nasopharyngeal swab | Several SNPs from previous GWAS studies (European individuals) that play an important biological role in vitamin D metabolism, transport, degradation, and downstream pathways have been identified to affect Vitamin D levels. To understand if an association exists between the polymorphisms in the vitamin D-related genes and the disease severity, four polygenic risk scores (PRSs) were defined. The iPLEX® MassARRAY® system was used to assess the genotype of patients. | A polygenic risk score was estimated for each individual according to their genotype profiles. Statistical tests (Mann–Whitney, Kruskal–Wallis and Spearman rank correlation coefficient) were used to find differences in genetic variants in vitamin D-related genes between COVID-19 patients with different degrees of disease severity. | No imputation performed | Only hospitalized cases assessed |
Gomez | not specified | Hospitals, Universities and Institutes in Spain | EUR | Cohort requiring hospitalisation with 67 patients in need of critical care support, including high-flow oxygen, positive-pressure ventilation (either invasive or non-invasive) or vasoactive drugs. | Asturias, Northern Spain | 204 | 536 |
Hypertension (48%), diabetes (18%), hypercholesterolaemia (34%) | 64.77 | 61% male | Severe | Nasopharyngeal swab | The I/D polymorphism (rs4646994) in intron 16 of the ACE gene was genotyped by PCR followed by agarose gel electrophoresis to visualise the two alleles. For the ACE2 rs2285666 A/G SNP the PCR fragments were digested with the restriction enzyme AluI and separated by electrophoresis on agarose gels. The ACE2 coding exons of 60 male patients (30 severe and 30 nonsevere) were amplified with primers designed from exon-flanking introns. These fragments were sequenced with Sanger BigDye chemistry in a capillary A | The statistical analysis was performed with the R-project free software (www.rproject.org). The logistic regression (linear generalized model, LGM) was used to compare mean values and frequencies between the groups. | NA | a. Small sample size and low number of severe female cases limited the statistical interpretation of the significant and non-significant associations, b. Individuals exposed but asymptomatic were not studied |
Horowitz | GRCh38 | AncestryDNA: Ancestry, UK Biobank: UKB, Geisinger Health System: GHS, Penn Medicine BioBank: PMBB | EUR, AFR, AMR, SAS | Four ancestries (Admixed American, African, European and South Asian) defined as two groups of COVID-19 outcomes: five phenotypes related to disease risk and two phenotypes related to disease severity among COVID-19 cases | Unspecified | 5461 | 661632 |
19% hospitalized
7% severe disease.
Comorbidities included hypertension, Cardiovascular disease, type 2 diabetes, chronic kidney disease, asthma, COPD | Ancestry 52.49, UKB 56.3, GHS 58.2, PMBB 55.65 | >50% Female | Severe | Nasopharyngeal swab / Whole blood | Ancestry: Illumina genotyping array
UKB: Applied Biosystems UK BiLEVE Axiom Array and UK Biobank Axiom Array
PMBB: Illumina Global Screening Array
GHS: Illumina OmniExpress Exome or Global Screening Array | Replicated eight independent associations (r2<0.05) previously reported.
Association analyses: genome-wide Firth logistic regression test implemented in REGENIE.
Results meta-analyzed across studies and ancestries using an inverse variance-weighted fixed-effects meta-analysis. | Ancestry: Haplotype Reference Consortium (HRC) reference panel. Haplotypes: Eagle version 2.4.1
Software: Minimac4 version 1.0.1.
UKB: HRC panel, UK10K and 1000 Genomes Project phase 3 panels.
PMBB: TOPMed reference panel and TOPMed Imputation Server.
GHS: TOPMed reference panel and TOPMed Imputation Server. | Most participants from EUR ancestry |
Horowitz | GRCh38 | Ancestry, UKB, GHS, PMBB | EUR, AFR, AMR, SAS | Four ancestries (Admixed American, African, European and South Asian) defined as two groups of COVID-19 outcomes: five phenotypes related to disease risk and two phenotypes related to disease severity among COVID-19 cases | Unspecified | 11356 | 651047 |
Comorbidities included hypertension, Cardiovascular disease, type 2 diabetes, chronic kidney disease, asthma, COPD. | Ancestry (52.49), UKB (56.3), GHS (58.2), PMBB (55.65) | >50% Female | Susceptibility | Nasopharyngeal swab / Whole blood | Ancestry: Illumina genotyping array
UKB: Applied Biosystems UK BiLEVE Axiom Array and UK Biobank Axiom Array
PMBB: Illumina Global Screening Array GHS: Illumina OmniExpress Exome or Global Screening Array | Replicated eight independent associations (r2<0.05) previously reported. Association analyses: genome-wide Firth logistic regression test implemented in REGENIE. Results meta-analyzed across studies and ancestries using an inverse variance-weighted fixed-effects meta-analysis. | Ancestry: Haplotype Reference Consortium (HRC) reference panel. Haplotypes: Eagle version 2.4.1
Software: Minimac4 version 1.0.1.
UKB: HRC panel, UK10K and 1000 Genomes Project phase 3 panels.
PMBB: TOPMed reference panel and TOPMed Imputation Server.
GHS: TOPMed reference panel and TOPMed Imputation Server. | Most participants from EUR ancestry |
Horowitz_2 | GRCh38 | Regeneron | EUR, AFR, SAS, EAS, | Cases were obtained from four studies (Geisinger Health System (GHS), Penn Medicine BioBank (PMBB), UK Biobank (UKB) and AncestryDNA) from five super population groups and grouped into five case-control comparisons related to the risk of infection and two others related to disease severity among cases with COVID-19. | Study dependent | 52630 | 704016 |
not specified | not specified | not specified | Combination | NA | Both common (minor allele frequency (MAF) > 0.5%, up to 13 million) and rare (MAF < 0.5%, up to 76 million) variants across the seven risk and severity phenotypes were considered | Ancestry-specific GWAS was performed in each study using the genome-wide Firth logistic regression test implemented in REGENIE V2.0.1. Firth’s approach is applied when the P value from the standard logistic regression score test is below 0.05. Directly genotyped variants with an MAF > 1%, <10% missingness, Hardy–Weinberg equilibrium test P > 1 × 10^-15 and LD pruning (1,000 variant windows, 100 variant sliding windows and r2 < 0.9) were included. Covariates for age, age², sex, age-by-sex and the first 10 ancestry-informative PCs were also included. Results were subsequently meta-analyzed across studies and ancestries using an inverse variance-weighted fixed-effects meta-analysis. | AncestryDNA used the Haplotype Reference Consortium (HRC) reference panel and performed imputation with Minimac4 v.1.0.1. GHS and PMBB used the TOPMed reference panel using the TOPMed Imputation Server. | 1. Greater power to identify associations with disease risk than with severity outcomes due to relatively small sample size for the latter, 2. Phenotypic heterogeneity among cases with COVID-19 and controls and associated risk factors due to four separate studies with different collection variables. AncestryDNA was composed of healthier individuals with milder COVID-19 compared to the UKB, GHS and PMBB studies, which collected data in a clinical setting and so were enriched for more severe cases. |
Hu | GRCh37 | Not specified | EUR | COVID-19 positive cases including 292 deaths of participants from the UK Biobank (UKB) | Northern Europe | 1096 | 0 |
Not specified | Not specified | Not specified | Severe | Nasopharyngeal swab | UK Biobank Axiom Array. Imputed SNPs used to perform a GWAS using super variants in statistical genetics to identify potential risk loci contributing to the COVID-19 mortality. | Local ranking and aggregation used to identify super variants using a four step method which included two modes of transmission (recessive and dominant). This method was used in a discovery and validation identification. Logistic regression, replicated 10x for stability, was then used to investigate super variant associations with the death outcomes of COVID-19. Cox regression was used to further validate super variants verified in multiple runs. | Haplotype Reference Consortium and UK10K and 1000 Genomes reference panels | Role of super variants in COVID-19 susceptibility not validated. Comorbidities were not accounted for in the association analysis even though UKB COVID-19 hospitalised patients had comorbidities. Study was restricted by sample size. Environmental influences were not factored in. No other ethnicities were included in this study. |
Hubacek | not specified | Institutes and Universities in the Czech Republic | EUR | 246 symptomatic (without requiring hospitalisation) and 164 asymptomatic COVID-19 positive cases during the first wave (approx. March 2020 – June 2020) of the disease in the Czech Republic. All cases completely recovered with no adverse events. | Czech Republic | 410 | 2559 |
7.8% diabetes, 13.3% hypertensive | 44 ± 15 | 54.7% female | Mild | TBD | DNA was isolated from EDTA-treated blood. The ACE I/D polymorphism was genotyped and PCR products of ~ 490 bp and ~ 200 bp characterise the I and D alleles, respectively. All D/D subjects were re-genotyped with ACE I-specific oligonucleotides to avoid the misgenotyping of some I/D heterozygotes as D/D homozygotes. | Statistical analysis was performed using the www.socscistatistics.com tools | NA | The reported effect of the ACE I/D polymorphism on COVID-19 outcomes varies between papers (Gomez et al 2020 showed the D/D genotype affecting COVID-19 severity outcomes) |
Jelinek | GRCh37 | Universities and hospitals in the UAE | SAS, EAS, AFR, EUR, | Patients with COVID-19 were recruited from multiple recruitment sites across the UAE. Only patients who tested positive for SARS-CoV-2 by RT-PCR were included. The participants were divided into two groups based on the severity of COVID-19, indicated as noncritical (n = 453) or critical (n = 193). Participants were defined as critical COVID-19 cases if they were admitted to the ICU with the use of oxygen supplementation or mechanical ventilation. Region of origin of participants included Middle | Abu Dhabi | 193 | 543 |
Comorbidities were defined as a Yes/No for previous medical diagnosis of diabetes mellitus, hypertension, cardiac disease, lung disease, liver disease, kidney disease, metabolic disorder, and/or an autoimmune disease | 1-85 | 138 female, 508 male | Critical | Whole blood | Genotyping was performed using the Infinium Global Screening Array (Illumina Incorporation, San Diego, California, USA) | QC on the data was performed using the PLINK software (version 1.07) to exclude SNPs with a low minor allele frequency (<0.01), low genotyping rate (<95%), and deviation from Hardy–Weinberg equilibrium (p < 10^-4) significance level. A total of 240 SNPs in the ABO gene were extracted for the association of this study for candidate gene analyses. Statistical analysis was performed using PLINK software (version 1.9), R software (version 3.4), and SPSS software (version 16.0). Bivariate and multivariate logistic regression analyses were used to estimate OR and p-values of the association between blood type and COVID-19 severity phenotypes. Two candidate gene association tests were conducted that included unadjusted analysis and adjustment on the top ten eigenvectors for population stratification, age, and gender. Significance level adopted for all the analyses was p ≤ 0.05. | Genotype data phased and imputed using the 1000 Genomes Project Phase 3 panel | a. Small sample size, b. Selection bias based on presentation to hospital and multiple collection sites, c. Substantial genetic admixture however population stratification was taken into account, d. GWAS array based on Caucasian population used |
Kuo | Not specified | Not specified | EUR | England | Northern and Western Europe | 622 | 322948 |
APOE e4e4 risk allele
Dementia (0.22%); Hypertension (32.42%); Coronary artery disease (8.7%); Type 2 diabetes (5.36%) | 68 ± 8 | 55% (176951) female | Severe | Nasopharyngeal swab | UK Biobank participants genotyped for ApoE (UK Biobank Axiom array) with a positive COVID-19 result were compared to participants who did not test positive over the period 16 March - 26 April 2020. | A logistic regression model was used to compare e4e4 genotypes to e3e3 for COVID-19 positivity status, adjusted for sex, age, genotyping array type and the top five genetic principal components (accounting for possible population admixture). | Not specified | Letter to the editor with little detail on methods and population specifics.
Authors focused on two variants only. |
Latini | GRCh38 | Not specified | EUR | 131 Italian COVID-19 positive patients. | Southern Europe | 131 | 5341 |
not specified | Median age was 63.7 | 82 males, 49 females | TBD | Whole blood | WES using Twist Human Core Exome Kit and sequenced on the Illumina NovaSeq 6000 platform. Four genes (TMPRSS2, PCSK3, DPP4, and BSG) were analysed and allelic frequency compared to the EUR GnomAD reference population. | Illumina BaseSpace pipeline and TGex software used for the variant calling and annotation (30x coverage).
Variants were examined for coverage and Qscore using the Integrative Genome Viewer.
Protein effect prediction: PolyPhen2, Mutation Taster, SIFT, MetaLR_pred, MetaSVM_pred. | not specified | Larger independent cohorts as well as functional studies are required to evaluate the effect of the detected genetic variant.
The number of subjects is too small to stratify them on the basis of clinical characteristics and clinical phenotypes. |
Li | GRCh37 | Universities, Hospitals and the CDC in China | EAS | Severe or critical COVID-19 cases and mild or moderate control cases were obtained from two cohorts (Huoshenshan and Union hospitals in Wuhan, China). For validation of allele frequencies of the significantly associated SNPs 954 COVID-19 unknown controls of Chinese Ancestry and 2504 Chinese Ancestry individuals from the 1000 genomes project (Phase 3, November 2014) were also included. | China | 885 | 546 |
Hypertension, diabetes, coronary artery diseases, chronic hepatitis B, chronic obstructive pulmonary disease, chronic renal diseases, and cancer (statistics not specified) | not specified | not specified | Severe | Nasopharyngeal swab | Genomic DNA was extracted from peripheral whole blood using the QIAamp DNA blood kits (Qiagen). Quality of the isolated genomic DNA was verified using two methods: (1) DNA degradation and contamination on 1% agarose gels; and (2) DNA concentration measured using a Qubit DNA Assay Kit and a Qubit 2.0 Fluorometer (Life Technologies, MA, USA). The Affymetrix Axiom® World Arrays were used for genotyping. | Genotype calling was performed using Axiom Analysis Suite. SNPs were excluded from further analyses if they were not in chr 1–23 or X, had call rates <90% among all subjects in this study, deviated significantly from HWE among all subjects in this study or had minor allele frequencies (MAF) < 1%. A total of 558,642 SNPs were finally retained. To identify the ancestry outliers, PCA was performed using EIGENSOFT (v3). Autosomal SNPs were used for PCA based on the following criteria: call rate >90%, HWE P>0.0001, MAF >10% and LD-pruned r2<0.10. Twenty principal components were estimated for all the cases and controls. PCA was performed on the same SNPs for samples from the 1000 Genomes Project and the COVID-19 unknown controls. GWA tests were performed with SNPTEST software using logistic regression models adjusting for covariates and the top five PCs. eQTL analysis to determine the causative genes was performed using the QTLbase, GTEx v8, Immunopop QTL browser and two independe | Imputation was performed using SHAPEIT (v2) and IMPUTE (v4). A prephasing strategy for the Huoshenshan and Union cohorts was performed by SHAPEIT, using the 1000 Genomes Project data (Phase 3, November, 2014) as the reference based on hg19. IMPUTE4 was used to impute the phased haplotypes constructed by SHAPEIT.
For the imputation of chr X, males were coded as diploid in non-pseudoautosomal regions. SNPs with IMPUTE4 info scores below 0.6 or MAF < 0.01 were excluded. A separate independent imputation anal | Further functional studies required for the roles of the two loci in COVID-19 pathogenesis |
Lu | GRCh37 | Not specified | EUR, SAS, AFR | UK Biobank COVID-19 positive participants, 180 with COVID-19 related mortality | Not specified | 180 | 1141 |
Hypertension, Diabetes, Cholesterol | Not specified | Male 127 Female 53 | Critical | Nasopharyngeal swab | GWAS combined with functional ontology and deleteriousness statistics | PLINK 2.0 quality control. GWAS with Holm-Bonferroni multiple hypothesis correction performed. Two gene lists were created using either a UniProt database list or an in-house list. The in-house list was created using Hidden Markov models (HMM) and protein coding to leverage functional gene ontology with the FATHMM program analysing amino acid deleteriousness and phenotype prediction. To increase power of association the GWAS p-values and the HMM p-values were combined using Fisher's method. | Not specified | Early study with a low amount of data |
Medetalibeyoglu | not specified | University of Istanbul | SAS | Patients admitted to the COVID-19 center of a single university hospital between April and June 2020 were included, with a number of criteria used to determine severe vs mild cases. | Istanbul, Turkey | 284 | 100 |
Hypertension 72 (24.5%), Diabetes mellitus 36 (12.7%), Chronic obstructive pulmonary disease 30 (10.6%), Coronary artery disease 17 (6.0%), Congestive heart failure 5 (1.8%), Solid malignancy 26 (9.2%), Hematological malignancy 7 (2.5%) | 49 | 45.1% Female | Severe | Whole blood | DNA was extracted from the collected blood by using the Genemark isolation kit (Genemark, USA). PCR was used to amplify the MBL2 gene and the restriction enzyme BanI was used to identify the codon 54 polymorphism. All of the patients and controls were examined for the codon 54 A/B (gly54asp: rs1800450) variation in exon 1 of the MBL2 gene. | IBM SPSS version 21.0 was used for statistical analysis. Multivariate binary logistic regression analyses were used to assess the association between different genetic variants of the MBL2 gene and study parameters. The results were adjusted for age and sex. Statistical significance was accepted as p < 0.05 for the results of all analyses. | NA | a. Small population size, statistical inference is limited |
Monticelli | GRCh38 | GEN-COVID Multicentre Study | EUR | Patients from 40 Italian Hospitals, 16 Continuity Assistance Special Unit (USCA) and 8 Departments of Preventive Medicine led by Professor Alessandra Renieri of the University of Siena in Italy | South-Central Europe | 1177 | 0 |
Diabetes, and/or hypertension, and/or obesity | not specified | not specified | Severe | NA | Whole-Exome Sequencing (WES) data derived from the GEN-COVID Multicenter Study were analysed. | All calculations were performed using R. The odds ratio (OR) function from the Epitools package was used to determine odds ratios.
Pymol was used to visualise protein models.
PolyPhen-2 was used to determine missense variant effect prediction. | NA | No controls were used. Participant information not detailed. |
Namkoong | GRCh37 | Japan COVID-19 Task Force | EAS | Japanese COVID-19 hospitalised patients consisting of 990 critical individuals requiring artificial respiratory support or intensive care. 1391 with non severe disease and 12 with unknown effect. Severe cases were older in age and predominantly male gender. Controls included the general Japanese population | Japan | 2393 | 3289 |
Other than age >65, no comorbidities were described for the case or control populations. | 56.0 ± 18.9 | 64.2% male, 35.8% female | Severe | Nasopharyngeal swab / Whole blood | Illumina Infinium Asian Screening Array used for cases and controls and SNP results used for the GWAS analysis. | PCA was applied to remove outliers and non-Japanese individuals. GWAS conducted using logistic regression for each variant using PLINK2 software. Meta-analysis of the Japanese discovery GWAS and the pan-ancestry analysis was conducted using an inverse-variance method assuming a fixed-effects model. Logistic regression and R statistical software used for ABO blood group analysis. | Minimac4 software used for imputation. A Japanese population-specific imputation reference panel combined with 1000 Genomes Project Phase used. QC filters of MAF ≥ 0.1% and imputation score >0.5 applied. Where lead variants were obtained by imputation, accuracy was assessed using WGS data. HLA genotype imputation was performed using DEEP*HLA software (version 1.0). | The reference panels used to impute the data do not seem to have been uploaded to a repository, which would be crucial for other studies looking to include East Asian genetic diversity in panels for populations with high levels of admixture, including people of East Asian origin.
The article has not yet been peer-reviewed as of entry into this database. |
Pairo-Castineira | GRCh37 | GenOMICC | EUR | critically ill patients with COVID-19 in the UK population | Not specified | 1676 | 8375 |
28% significant co-morbidity | 57.3 ± 12.1 | 70% male; 30% female | Critical | Nasopharyngeal swab / Whole blood | Genotyping with Illumina Global Screening Array v.3.0. In some cases genotypes and imputed variants were confirmed with Illumina NovaSeq 6000 WGS. Variants were validated using a GWAS of genetic studies with 100 000 genomes and Generation Scotland datasets. | DRAGEN pipeline used for variant calling. Variants were genotyped with the GATK GenotypeGVCFs tool v.4.1.8.1 and annotated with bcftools v.1.10.2. PLINK 1.9 was used for quality control and association tests. King 2.1 used to remove duplicate individuals with gcta 1.9 used for PCA. Genetic ancestry was inferred using PCA. | TOPMed reference panel with BCFtools 1.9 and QCtools 1.3 used for quality control | The critical sample size was small. Thus, replication of results was sought in the HGI hospitalised COVID-19 vs population analysis, with duplicated samples excluded. |
Pairo-Castineira | GRCh37 | COVID-HGI and 23andMe Inc | EUR | European ancestry hospitalised patients from COVID-19 HGI and those with a broad respiratory phenotype from 23andMe Inc. | Not specified | 3543 | 1157272 |
Not specified | not specified | not specified | Severe | Nasopharyngeal swab / Whole blood | Replication analysis was performed using the loci in GenOMICC individuals.
Replication analyses were performed using HGI build 37, version 2 (July 2020) B2 (hospitalized patients with COVID-19 versus the population) GWAS and those defined with respiratory phenotypes in 23andMe Inc. Summary statistics were used from the full analysis, including all cohorts and GWAS without UK Biobank, to avoid sample overlap. | Meta-analysis of the GenOMICC, HGI and 23andMe datasets was performed using fixed-effect inverse-variance meta-analysis in METAL. | Not specified | Effect sizes are likely to be greater in the GenOMICC study because the cohort is strongly enriched for immediately life-threatening disease.
Further studies needed to narrow down the loci found. |
Pehlivan | not specified | University of Istanbul | SAS | Patients diagnosed with COVID-19 between April and June 2020 who were admitted to the COVID-19 center of a university hospital, and 100 healthy individuals without any known were included as controls. Healthy controls consisted of individuals who were negative for Sars-CoV-2 antibody (Sars-Cov-2 IgM, IgG) and were negative in two PCR results taken with an interval of 48 hours. Patients included those that were diagnosed with COVID-19 who show false negativity with initial examinations before hos | Istanbul, Turkey | 70 | 100 |
not specified | not specified | not specified | Susceptibility | Nasopharyngeal and Clinical characteristics | DNA samples were isolated from blood leukocytes at the time of diagnosis. MBL2-rs1800450, NOS3-rs1799983 and NOS3-intron 4 VNTR gene polymorphisms were analyzed by PCR and/or RFLP. | Logistic regression was used to analyse statistical differences between groups. The adjusted odds ratios (ORs) were calculated with a logistic regression model that adjusted for sex and age and are reported with 95% confidence intervals (CIs). Differences between the patient groups were compared using the chi-square test and the Fisher exact test when required. A p-value of less than 0.05 was accepted as significant. | NA | a. Limited patient population, b. No MBL serum levels were recorded |
Peloso | GRCh37 | Universities and Hospitals in the United States | EUR, AFR, AMR | White, Black (African American) and Hispanic participants were obtained from the VA Million Veteran Program. Participants formed part of four groups 1) COVID-19 positivity as defined by positive COVID-19 test compared with all other MVP participants (POS vs. POP); 2) individuals who were hospitalized for COVID-19 compared with all other MVP participants, including individuals who tested positive for COVID-19 but were not hospitalized (HOS vs. POP); 3)individuals who were hospitalized for COVID-1 | United States | 19168 | 492854 |
none specified | 62.3 | 92.25 | Combination | Whole blood | Genotyping using a customized Affymetrix Axiom Biobank Array. Single variant association was performed between imputed variants and four participant outcomes. | Population-specific principal components (PCs) were computed using EIGENSOFT v.6. The harmonized race/ethnicity and genetic ancestry (HARE) approach was used to assign individuals to three mutually exclusive groups: 1) non-Hispanic White (White), 2) non-Hispanic Black (Black), and 3) Hispanic or Latino (Hispanic). Kinship was inferred using KING v.2.0. For each pair of relatives (kinship coefficient ≥0.0884), one individual was excluded, preferentially retaining those who tested positive for SARS-CoV-2. Logistic regression was applied in PLINK v2 adjusting for age, age², sex, age*sex, and 15 population-specific PCs analysed within each of the HARE-assigned groups. COVID-19-HGI summary statistics (Release 5) for the multi-population and the White-only meta-analyses, excluding MVP and 23andMe data, were used for replication. The association between ABO blood type and COVID-19 was assessed using logistic regression adjusted for age and sex in four COVID outcomes. Variants with population-spe | Imputation was performed to a hybrid imputation panel comprised of the African Genome Resources panel and 1000 Genomes (p3v5). | a. Limited power for genome-wide analyses in severity COVID-19 outcomes, b. Predominance of males in sample cohort |
Roberts | GRCh37 | Ancestry DNA | EUR, AMR | European ancestry (65%)
Admixed African-European ancestry includes 100% African ancestry (6%)
Admixed Amerindian ancestry also includes 100% Amerindian ancestry (11%)
Other (18%) including Admixed East Asian-European ancestry also includes 100% East Asian ancestry | Northern America | 2417 | 14993 |
Asthma, Cardiovascular disease, diabetes, hypertension, autoimmune disease, rare health conditions (proportions specified in supplementary data) | 56 | 862 male; 1555 female | Susceptibility | Nasopharyngeal swab | GWAS; Illumina genotyping array | Linkage disequilibrium (LD)-pruned using PLINK 1.9
COVID-19 Host Genetics Initiative analysis plan version 1 using PLINK 2.0 for GWAS analysis | Haplotype Reference Consortium (HRC) reference panel
Haplotypes : Eagle version 2.4.1
Software: Minimac4 (MAF)>0.01 R2>0.30 | Participants required to fill out a survey therefore cases that were very severe or fatal would not be included
Participants restricted to European ancestry due to small numbers from other ancestral groups
No independent replication cohort to determine if results are reproducible
Did not assess indels |
Roberts | GRCh37 | Ancestry DNA | EUR, AMR | European ancestry (65%)
Admixed African-European ancestry includes 100% African ancestry (6%)
Admixed Amerindian ancestry also includes 100% Amerindian
ancestry (11%)
Other (18%) including Admixed East Asian-European ancestry also includes 100% East Asian ancestry | Northern America | 250 | 1967 |
Asthma, Cardiovascular disease, diabetes, hypertension, autoimmune disease, rare health conditions (proportions specified in supplementary data) | 56 | 105 male, 145 female | Severe | Nasopharyngeal swab | GWAS (Illumina genotyping array) | Linkage disequilibrium (LD)-pruned using PLINK 1.9
COVID-19 Host Genetics Initiative analysis plan version 1 using PLINK 2.0 for GWAS analysis | Haplotype Reference Consortium (HRC) reference panel
Haplotypes : Eagle version 2.4.1
Software: Minimac4 (MAF)>0.01 R2>0.30 | Participants required to fill out a survey therefore cases that were very severe or fatal would not be included
Participants restricted to European ancestry due to small numbers from other ancestral groups
No independent replication cohort to determine if results are reproducible.
Did not assess indels |
Sayin | GRCh38 | Istanbul Training and Research Hospital, Istanbul University | not specified | 200 patients diagnosed with COVID-19 in pandemic clinics of Istanbul University, Faculty of Medicine, between 1 April and 1 June 2020, as well as 100 volunteers without known comorbidities. Gene polymorphism distribution and statistical significance were examined in the following patient groups: (1) severe/mild infection, (2) exitus/ alive during the 28-day follow-up, (3) presence of the need for intensive care/being only inpatient. | Istanbul, Turkey | 200 | 100 |
26% hypertension | 49 median age | 87 female; 113 male | Severe | Nasopharyngeal swab | DNA isolation from peripheral blood leukocytes of COVID-19 patients and healthy controls was performed using the saline precipitation method. PER3 gene polymorphism genotypes were analyzed by the PCR method. | IBM SPSS version 21.0 (IBM Corp. USA) for all statistical analyses. P<0.05 was considered statistically significant. | NA | When divided into clinical subgroups, the statistical analysis may have been biased due to the decreased number of patients, and no significant results were obtained in terms of the 5R/5R genotype. |
Shelton | GRCh37 | 23andMe COVID-19 team | EUR, AFR, AMR | COVID-19 positive individuals from the 23andMe study consisting of 80.3% EUR, 11.3% Latino and 2.7% African American. Control size depends on phenotype analysed. | 93.2% United States, 2.4% United Kingdom and 4.4% various countries around the world | 12972 | 101268 |
COVID-19 positive test: Type 2 Diabetes-5.2%, Fatty liver disease-4.8%, Obesity-37.1%, Hypertension-24.3%. COVID-19 positive test + hospitalisation: Type 2 Diabetes-13.3%, Fatty liver disease-9.7%, Obesity-52.6%, Hypertension-42.8%. | 51 | 63% female, 37% male | Susceptibility | Nasopharyngeal swab | Samples genotyped on either the Illumina HumanHap550 BeadChip, Illumina OmniExpress BeadChip or Illumina Global Screening Array each containing customized SNPs or array. GWAS analysis performed on each phenotype and population group. One susceptibility phenotype for COVID-19 positive vs negative participants and 4 severity phenotypes were analysed (pneumonia, hospitalisation, respiratory support with supplemental oxygen and or ventilation). | Case control analysis used logistic regression and P values computed using the likelihood ratio tests. GWAS was performed on each phenotype and population cohort separately. Trans-ancestry meta-analysis performed with a fixed effects model (inverse variance method). | Haplotype Reference Consortium panel, augmented by the Phase 3 1000 Genomes Project panel for variants not present in the Haplotype Reference Consortium. | Self-reported data from an existing group which is not reflective of the general population
Scarcity of testing could have obscured the true picture of SARS-CoV-2 infections, misclassification of true infections (bias) |
Shelton | GRCh37 | 23andMe COVID-19 team | EUR, AFR, AMR | COVID-19 positive individuals from the 23andMe study consisting of 80.3% EUR, 11.3% Latino and 2.7% African American. | 93.2% United States, 2.4% United Kingdom and 4.4% various countries around the world | 2083 | 797180 |
COVID-19 positive test: Type 2 Diabetes-5.2%, Fatty liver disease-4.8%, Obesity-37.1%, Hypertension-24.3%. COVID-19 positive test + hospitalisation: Type 2 Diabetes-13.3%, Fatty liver disease-9.7%, Obesity-52.6%, Hypertension-42.8%. | 51 | 63% female, 37% male | Severe | Nasopharyngeal swab | Samples genotyped on either the Illumina HumanHap550 BeadChip, Illumina OmniExpress BeadChip or Illumina Global Screening Array each containing customised SNPs or array. GWAS analysis performed on each phenotype and population group. One susceptibility phenotype for COVID-19 positive vs negative participants and 4 severity phenotypes were analysed (pneumonia, hospitalisation, respiratory support with supplemental oxygen and or ventilation). | Case control analysis used logistic regression and P values computed using the likelihood ratio tests. GWAS was performed on each phenotype and population cohort separately. Trans-ancestry meta-analysis performed with a fixed effects model (inverse variance method). | Haplotype Reference Consortium panel, augmented by the Phase 3 1000 Genomes Project panel for variants not present in the Haplotype Reference Consortium. | Self-reported data from an existing group which is not reflective of the general population. Scarcity of testing could have obscured the true picture of SARS-CoV-2 infections, misclassification of true infections (bias) |
Speletas | not specified | Universities and hospitals in Greece | EUR, EAS | Participants were enrolled in the study from March to October 2020. The ethnicity of patients included 152 Greeks, 62 Turks, 16 Ukrainians, 15 Indonesians, 6 Uzbeks, 3 Moldovans, 3 Americans, 2 Cubans and 1 patient from each of the following countries: Albania, Belarus, Bulgaria, Germany, and Kyrgyzstan. Most patients of non-Greek origin were passengers and crew members on a cruise ferry who became infected with COVID-19 in March 2020. | Greece | 264 | none |
35.6% with one or more comorbidities including obesity, hypertension, chronic heart disease, chronic respiratory disease, dyslipidemia, diabetes, hypothyroidism, malignancies, liver/renal/hematological disease | 42.8 ± 18.4 years | 180 male; 84 female | Severe | TBD | DNA was extracted from peripheral blood, with the detection of MBL2 genetic alterations performed by allele-specific polymerase chain reaction, followed by restriction fragment length polymorphism (PCR-RFLP) analysis | Multivariate analysis was performed in the form of binary logistic regression for all parameters, with a statistical significance of p > 0.2 in univariate analysis. For all the analyses, a 5% significance level was set. Analysis was carried out with SPSS (version 25.0). | NA | a. No association of the known MBL genetic deficiency with COVID-19 severity was found, only of the B allele (rs1800450) itself. This could indicate that the results may be affected by the small sample size, the diverse composition of the cohort analysed, and confounding factors such as smoking, diet and oxidative stress levels that affect MBL serum levels |
Upadhyai | GRCh38 | University Institutes in India | EUR | European individuals from the AncestryDNA COVID-19 host genetic study | European ancestry | 1492 | 197 |
Not specified but according to AncestryDNA data (Roberts et al., 2020) | not specified | not specified | Severe | NA | GWAS was performed using the original AncestryDNA COVID-19 genotyping dataset (EGA Accession no. EGAD00010002012) with 675,370 SNVs to identify genetic variants that show significant frequency variation between asymptomatic versus severely infected COVID-19 patients. | PCA was performed using PLINK v1.9 with only individuals of European ancestry being used for further analysis. For QC, data with high levels of missingness (>20%) were filtered out using PLINK v1.9 with a MAF threshold of 0.01. Standard case-control-based association analyses were performed in PLINK v1.9 with a multiple-testing corrected p-value < 0.001 being considered significant. Significant SNVs (multiple-testing corrected p-value <0.001) were annotated using the SNPnexus web-based server for GRCh38. | not specified | a. Limited availability of data from COVID-19 patients, b. For the SARS-CoV-2 infected cohort, patients were categorized into various COVID-19 outcomes based on a self-reported questionnaire, which may have increased the likelihood of miscategorization, c. The unavailability of genomic data for COVID-19 patients in the ICU or those who may have succumbed might cause some discrepancies in the final outcomes of the study, and d. GWAS findings may not be exclusively ascribed to SARS-CoV-2 infections |
van Blokland | GRCh38 | Various research groups and Universities | EUR, AMR | Individuals from the Generation Scotland, Helix cohort (Helix DNA Discovery project) in the United States, Lifelines COVID-19 cohort (Lifelines population cohort and the Lifelines NEXT birth cohort in the Northern part of the Netherlands) and the Netherlands twin register (NTR)
D1-Predicted self reported COVID-19, B2-Hospitalised COVID-19, C1-covid vs. lab/self-reported negative, C2- covid vs. population | Northern, Western Europe and USA | 1865 | 29174 |
Not specified | 47.5 | 26% male | Severe | Nasopharyngeal swab / Whole blood | Menni COVID-19 prediction model to find COVID-19 cases. GWAS performed on predicted COVID-19 case-controls. Top 20 SNPs from COVID-19 HGI D1 cohort were replicated in the C1, C2 and B1 analysis to compare predicted COVID-19 and other cohorts. Data generated with various methods from the four cohorts as follows: HumanCytoSNP Infinium (Global Screening Assay (GSA)/GSA MultiEthnic Disease Version); Perlegen-Affymetrix; Affymetrix (6.0/Axiom); Illumina (Human Quad Bead 660/Omni 1M GSA/OmniExpres | Cohort data in SAIGE format was processed in WDL workflows made available at https://github.com/covid19-hg/META_ANALYSIS.
Inverse variance weighting of effects was used to account for strand-differences and allele flips in individual studies. All build 37 statistics were upgraded to 38 build and allele harmonization was performed using gnomAD 3.0 genomes before beginning the meta-analysis. | Generation Scotland- phasing using Shapeit v2.r873 and duohmm and imputation using the HRC.r1-1 panel. Helix-1000 Genomes Phase 3 data for imputation. Lifelines-Haplotype Reference Consortium (HRC) panel v1.1 at the Sanger imputation server. NTR- data was phased using Eagle and then imputed to 1000 Genomes and Topmed using Minimac | 1. Predictive COVID-19 training data might not be fully representative of the whole spectrum of COVID-19, 2. Symptoms may overlap with other diseases, predicted cases may be falsely identified 3. Prevalence of COVID-19 might be different among different populations and cohorts |
van der Made | GRCh37 | Radboud University Medical Center, Nijmegen, Netherlands | EUR, AFR | 2 pairs of brothers with no previous chronic disease presented in hospital with COVID-19 associated respiratory insufficiency. Pair 1: Caucasian ancestry, required ICU and mechanical ventilation with one death. Pair 2: African ancestry, requiring ICU and mechanical ventilation. | North Western Europe | 4 | 0 |
none | 26 years median age | Male | Critical | Nasopharyngeal swab | Rapid whole-exome sequencing was performed. DNA samples were processed using the Human Core Exome Kit and extended RefSeq targets (Twist Biosciences). Libraries were prepared according to the manufacturers’ protocols. All DNA samples were sheared using a Covaris R230 ultrasonicator (Covaris), followed by 2 × 150–base pair paired-end sequencing on a Novaseq 6000 instrument (Illumina). | Downstream processing was performed using an automated data analysis pipeline that included Burrows-Wheeler Aligner mapping, Genome Analysis Toolkit variant calling, and custom-made annotation. Exome analysis of the affected brothers' families was also performed to check segregation of all rare filtered variants in the respective index patients. | - | The case series precludes drawing conclusions regarding causality between the rare loss-of-function TLR7 variants and the pathogenesis of severe COVID-19. The functional experiments with IFN-γ measurements lacked statistical significance, possibly due to the limited number of replications and controls included in this study. |
van Moorsel | GRCh38 | Universities and hospital divisions in Netherlands, UK and Denmark | EUR | The discovery cohort from the ILD biobank and data registry of the St Antonius Hospital Nieuwegein, the Netherlands, included adult patients hospitalized due to COVID-19 at St Antonius Hospital between March 19, 2020 and May 5, 2020. 83 participants designated as White and 25 as non-White | Netherlands | 108 | 611 |
7% Diabetes, 15% Asthma/COPD, 1% Interstitial lung disease, 1% Pulmonary hypertension | 66 | 69% male | Severe | Nasopharyngeal and Clinical characteristics | DNA was extracted using a Chemagic 360 from whole blood and samples were genotyped for MUC5B rs35705950 with a pre-designed TaqMan SNP genotyping assay and the QuantStudio 5 Real-Time PCR system. Validation cohorts included 436 UKB cases and 356799 UKB controls. For replication, summary data from the severe COVID-19 GWAS group was obtained. This included 835 cases and 1255 controls from Italy and 775 cases and 950 controls from Spain. Genotype counts for SNP rs35705950 were obtained from the r | SPSS 24 was used for statistical analysis. Due to ethnic differences in the prevalence of the MUC5B rs35705950 alleles, genetic analyses were stratified by ethnicity and only statistically analyzed in white subjects. Differences between the allele and genotype frequencies were calculated with the Pearson’s goodness-of-fit chi-square test, together with the OR and 95% CI. Binary logistic regression was used to test for association between MUC5B rs35705950 and COVID-19 with age and sex as confounding variables. A value of p < 0.05 was considered statistically significant. Meta-analyses were performed using the allele contrast and dominant model in the web tool META-Genyo. The fixed-effect estimate method with inverse variance was used. | Genetic data from the “v3” release of UKBB was used which contained the full set of Haplotype Reference Consortium (HRC) and 1000 Genomes imputed variants. For the Italian cohort imputation was performed via TOPMed reference panel | a. A limitation of the study is the focus on white European populations. Minor allele frequencies for MUC5B rs35705950 are known to differ between populations, b. Small sample size of the Dutch cohort, yielding a significant result but with a wide confidence interval. |
Vietzen | not specified | Center for Virology, Medical University of Vienna and the Department of Medicine IV, Kaiser Franz Josef Hospital, Vienna, Austria | EUR | SARS-CoV-2 positive cases obtained from the Center for Virology, Medical University of Vienna between 17 February and 17 April 2020. A total of 92/361 (25.5%) patients showed only minor symptoms and stayed in home quarantine (“nonhospitalized”), 190/361 (52.6%) patients were hospitalized with severe COVID-19 symptoms but never required intensive care (“hospitalized non-ICU”), and 79/361 (21.9%) patients were severely affected and needed intensive care. | Austria | 361 | 260 |
Obesity, hypertension, COPD and CAD | 69 years (median) | 45% female | Severe | Nasopharyngeal swab | DNA extraction was performed using the NucliSens EasyMag extractor (BioMérieux). DNA was eluted in 50 µl of nuclease-free H2O. HLA-E*0101/0103 genotypes were determined by a TaqMan assay and KLRC2wt/del variants were determined by touchdown PCR. As internal controls, genomic DNA obtained from the HeLa, HEK-293T, and K562 cell lines (all ATCC, Manassas, VA, USA) was used. Randomly chosen amplicons from all KLRC2 and HLA-E variants were routinely selected, sequenced on a 3130 genetic analyzer (Applied Biosy | The distribution of the patients’ gender, comorbidities, and genetic variants was compared by χ2 test. Patient age was assessed by ANOVA and Dunn post test. Correlation of the genetic variants and comorbidities was assessed using χ2 test. For multivariable analysis, a general main effects loglinear model with genetic variables, gender, and age groups (<60, 60–70, 70–80, >80 years) was used to identify combined genetic variables associated with the risk for severe SARS-CoV-2 infection in patients who were hospitalized or admitted to an ICU. P values <0.05 were considered significant. Statistical analyses were performed using IBM SPSS Statistics 24. | NA | not specified |
Wang | GRCh38 | Universities, Hospitals and Institutes in China | EAS | COVID-19 hospitalised patients from Shenzhen Third People’s Hospital, China. Of the recruited patients, 25 (7.5%), 12 (3.6%), 225 (67.8%), 53 (16.0%), and 17 (5.1%) patients were defined as asymptomatic, mild, moderate, severe, and critically ill, respectively. | Chinese | 332 | 966 |
>50% of patients with one comorbidity, not specified. Severe COVID category had 58.8% of patients with a comorbidity vs. mild at 45.1% | not specified | 135 male; 149 female | Severe | Nasopharyngeal swab | Deep whole genome sequencing (46x) was used to maximise statistical power due to small sample size with the DNBSEQ platform. Loss of function, rare and common variants were analysed. Loss of function and rare variants were assessed in related individuals and common variants were also analysed in the study cohort. Both single variant and gene-based GWAS were performed. Joint-calling of the genetic variants of the unrelated COVID-19 patients (n = 284) and the publicly available Chinese genome | Variation detection and genotyping were performed using the GATK joint genotyping framework. Sentieon Genomics software was used to perform genome alignment and variant detection. The analysis pipeline was built according to the Broad Institute best practices workflows with variant calibration and filtration using GATK and variant prediction with the Variant Effect Predictor software. PLINK and KING were used for kinship analysis, and the Genesis R package was used for PCA and for genotype–phenotype association tests using the default parameters. Genome-wide significance was set at 5e–8 for the single variant association test, suggestive significance at 1e–5, and significance for the gene-based association test at 1e–6. | not specified | a. Sample size of 332 is only just sufficient to identify genome-wide significant genetic variants with MAF greater than 0.2 and odds ratio greater than 1.8 given type I error rate 0.05. b. Patients were recruited from hospital, so information on asymptomatic individuals was limited in comparison to the severe cases. |
Wulandari | not specified | Universities based in Indonesia and the UK | EAS | Patients with moderate and severe COVID-19 (n = 62, 65.3%) were hospitalised in Dr Soetomo General Academic Hospital, Surabaya, Indonesia, whilst 33 patients (34.7%) with asymptomatic or mild symptoms were treated in Indrapura KOGABWILHAN II Hospital, Surabaya, Indonesia. | Indonesia | 95 | none |
Diabetes (n = 21), CVD (n = 25), Liver disease (n = 13), Kidney disease (n = 5), Lung disease (n = 3) | 44.7 ± 1.3 | 60 male; 35 female | Susceptibility | Nasopharyngeal swab | DNA extraction was performed using the QIAamp® Blood DNA Midi kit and DNA concentrations were determined using a microvolume spectrophotometer. The TMPRSS2 polymorphism was detected using a TaqMan SNP genotyping assay. Genotyping was performed by RT-PCR with VIC and FAM fluorescent reporters for allelic discrimination. | Statistical analyses were performed using the IBM SPSS Statistics Software ver. 23 (IBM Corp.) or GraphPad Prism ver. 8 (GraphPad Software, LLC). A chi-squared test was used to examine Hardy–Weinberg equilibrium and to determine the association between categorical variables in the cross-tabulation data. ANOVA with post hoc multiple comparisons was used to analyse numerical data. A P value less than 0.05 was considered to be statistically significant. | NA | a. Small sample size with larger studies required for validation, b. A Ct value was used to determine viral load, which can only provide an estimate, and c. The effect of the variant on protein function needs to be assessed. |
Zhang | GRCh37 | COVID Human Genetic Effort (HGE) | not specified | Various nationalities from Asia, Europe, Latin America, and the Middle East. Subjects enrolled in clinical trials across France, French Guiana and Italy. Patients with life-threatening COVID-19 pneumonia requiring ICU admission (Death in 13.9% associated with COVID-19). Control population were asymptomatic or developed mild disease. | Not specified | 659 | 534 |
Not specified | 51.8 ± 15.9 | 25.5% female; 74.5% male | Critical | Nasopharyngeal swab / Whole blood | Whole exome and whole genome sequencing of subjects and controls using Illumina NovaSeq6000 system. Variants analysed from 13 gene loci known to affect type I IFN pathways. Predicted loss of function (LOF) mutations further assessed in vitro for expression and functionality. | GATK was used to analyse WES. Reads were aligned with the Burrows–Wheeler Aligner software and Picard was used for QC. Variants were curated using Integrative Genomics Viewer (IGV) and confirmed to affect the main functional protein isoform. HMZDelFinder and CANOES algorithms were used to detect deletions. Logistic regression with the likelihood ratio test was used to compare cases and controls with loss of function variants, with PCA used to account for ethnic heterogeneity in PLINK 1.9 software. | NA | Variants were classified according to genotype. Of the 24 LOF variants indicated, supplementary data provided RSID numbers for 6 variants only with the HGVS allocations missing for others. |
Zhu | GRCh38 | Universities, Hospitals and Institutes in China | EAS | COVID-19 respiratory disease and hospitalization cases (Wuhan Union Hospital) between January 15 and April 4, 2020 | Chinese | 466 | 0 |
288 individuals with at least one comorbidity including hypertension (N = 180, 38.63%), diabetes (N = 95, 20.38%), and coronary heart disease (N = 63, 13.52%) | 23–97 years (20–39 (8.5%), 40–59 (31.1%), 60–79 (51.1%), and 80–99 (9.2%)) | 237 female; 229 male | Severe | TBD | Samples sequenced for genotyping using the DNBSEQ platform at a mean sequencing depth of 17.8x. GWAS analysis was performed for all laboratory traits to discover significant associations. The COVID-19 HGI round 5 meta-analysis results were used to study susceptibility and severity. One- and two-sample Mendelian randomization analyses were performed to determine causal effects between laboratory traits and disease status. Gene set enrichment analysis was also performed. | For genotyping, (i) samples with a call rate of <0.99, (ii) closely related individuals identified by identity-by-descent (IBD >0.1) calculated in KING, and (iii) outliers identified by principal component analysis based on three-sigma rules were excluded from further analysis. Standard quality control criteria for genetic variants were applied by removing those with a SNP call rate <0.99, minor allele frequency (MAF) < 0.01, and Hardy-Weinberg equilibrium p value < 1E-06. PLINK v2.0 was used to perform single-variant GWAS analyses using a linear regression model for the quantitative laboratory features under the assumption of additive allelic effects of the SNP dosage. The laboratory traits were adjusted for age, sex, and the top six principal components (PCs) of genetic ancestry, and the resulting residuals were normalized using a Z-score transformation. The number of PCs was chosen by using EIGENSTRAT software. A genome-wide significance threshold of 5E-08 and a study-wide significance threshold of 6.41E-10 (=5E-08/78 | Imputation was performed with Beagle v4.0, taking genotype likelihoods (GL) as input, with the EAS population of the 1000 Genomes Project (1KGP) as the reference panel. | 1. Small sample size and small genetic effect sizes resulted in no genome-wide signals associated with severe status, 2. Only one valid SNP was associated with two genetic traits that cause disease, even though these traits are known to be polygenic, 3. The genetic mechanisms that mediate COVID-19 traits require deeper investigation and were only briefly explored in this study, and 4. Further study into transcriptome- and proteome-wide association should be included to uncover functiona |