Population

Name | Genome | Consortium | Super Population | Population Description | Population Origin | Case Population Size | Control Population Size | Comorbidities | Mean/Median Age | Sex | Severity | Sample Source | Method | Bioinformatics | Imputation Details | Limitations
Asano GRCh37 COVID Human Genetic Effort (HGE) EUR, AFR, AMR, SAS, Individuals from seven countries Not specified 1202 331 Not specified 52.9 all male Critical Nasopharyngeal and Clinical characteristics Genomic DNA extracted for WGS and WES. For WES, libraries were generated with the Twist Human Core Exome Kit, the xGen Exome Research Panel IDT xGen, the Agilent SureSelect V7 kit or the SeqCap EZ MedExome kit from Roche, and the Nextera Flex for Enrichment-Exome kit (Illumina). Massively parallel sequencing was performed on the Illumina HiSeq4000 or NovaSeq6000 system. For WES analysis performed at CNAG Barcelona, Spain, capture was performed with the SeqCap EZ Human Exome Kit v3.0 (Roche Nimbl GATK best-practice pipeline was used to analyze WES data. Reads were aligned to hg19 with the maximum exact matches algorithm in the BWA. PCR duplicates were removed with Picard tools. The GATK base quality score recalibrator was applied to correct sequencing artifacts. Sample genotypes with a coverage < 8X, a genotype quality (GQ) < 20, or a minor read ratio (MRR) < 20% were filtered out. We filtered out variant sites (i) with a call rate <50% in gnomAD genomes and exomes, (ii) a non-PASS filter in the gnomAD database, (iii) falling in low-complexity or decoy regions, (iv) that were multi-allelic with more than four alleles, (v) with more than 20% missing genotypes in our cohort, and (vi) spanning more than 20 nucleotides. Variant effects were predicted with VEP and the Ensembl GRCh37.75 reference database. An enrichment analysis focusing on X chromosome genes without known inborn errors of TLR3- and IRF7-dependent type I IFN immunity and without neutralizing auto-Abs against type I IF Not specified Not specified
Asselta GRCh37 None Specified EUR Italian Southern Europe 7268 84450 Not specified Not Specified Not specified NA NA Whole exome sequencing (WES) and genome-wide microarray genotyping data analysis Expression: GTex repository and GEO database Burden analysis: LRT score, MutationTaster, PolyPhen-2 HumDiv, PolyPhen-2 HumVar, and SIFT Michigan Imputation server Reference panel: 1000G Phase 3 v5 Phasing: ShapeIT v2.r790 Software: Minimac3 Filters: r2>0.3 Analysed genetic data for only two candidate genes No confounding issues addressed This was an exploratory study that did not include any genetic data from patients infected with SARS-CoV-2, therefore there is a need for experimental data.
Baldassarri not specified GEN-COVID Multicenter Study EUR Cases and controls were drawn from the Italian GEN-COVID cohort of 1178, cases were selected according to the following inclusion criteria: i. CPAP/biPAP ventilation (230 subjects); ii. endotracheal intubation (108 subjects). As controls, 300 subjects were selected using the sole criterion of not requiring hospitalization. Exclusion criteria for both cases and controls were i. SARS-CoV-2 infection not confirmed by PCR; ii. non-Caucasian ethnicity. The Spanish cohort, composed of male COVID- 19 pa Italy, Spain 1295 341 cardiac, endocrine, neurological, neoplastic 52-68 years predominantly male Severe Nasopharyngeal swab The HUMARA assay was used to establish allele sizes of the polymorphic triplet in the AR locus, performed using fluorescent PCR followed by capillary electrophoresis on an ABI3130 sequencer. Allele size was established using the Genescan Analysis software. WES data had already been obtained from the previous GEN-COVID study. Serum and plasma total Testosterone, SHBG levels in plasma and serum LH were measured following standard procedures Variant calling was performed according to the GATK4 best practice guidelines, using BWA for mapping, and ANNOVAR for annotating. A LASSO logistic regression was performed on the poly-amino acid repeats. The data pre-processing was coded in Python, whereas for the logistic regression model the scikit-learn module with the liblinear coordinate descent optimization algorithm was used. Not Specified Not Specified
Cantalupo GRCh38 Universities, Hospitals and Institutes in Italy EUR GWAS data obtained from hospitalised cohort of the COVID-19 HGI. The first replication study was performed on genetic data from the 23andMe study hospitalised cohort. The second replication study was performed on genetic data from a selected Italian hospitalised cohort. WES data was obtained from a hospitalised cohort in Southern Italy. Italy 6406 902088 not specified not specified not specified Severe NA Summary statistics from GWAS meta-analysis of severe COVID-19 cases (COVID-19 HGI) were used to analyse the 3p21.31 region previously associated with severe COVID-19. 52 selected SNPs with suggestive statistical significance and that were eQTLs for CCR5 in lung were assessed. The three SNPs with the highest eQTL p values in lung were investigated further. To validate the association between rs35951367 and a severe form of COVID-19 disease, we used two independent cohorts of cases and controls. Th Allele and genotype frequencies were obtained from gnomAD; eQTL analysis was performed by using public data from Genotype-Tissue Expression (GTEx) Portal. CCR5 gene expression was obtained from NCBI Gene Expression Omnibus (GEO) Database and plotted using R2: Genomics Analysis and Visualization Platform. DUET was used to evaluate the effect of missense variants on CCR5 protein. Sequence reads were aligned and mapped using BWA-mem (V0.7.17) and SAMTools (V1.8). Duplicate reads were removed with Picard (V2.18.9). SNVs and small indels were detected using the GATK HaplotypeCaller with functional annotation of variants using ANNOVAR. Off-target variants and SNPs were excluded with allele frequencies greater than 1% in non-Finnish European populations of the 1000 Genomes Project, ExAC (v3) and gnomAD (v2.1.1) databases. To remove possible false positives, variants falling in genomic duplicated regions were also excluded. The set of exonic variants was filtered to remove synonymous SNVs. For t NA Not specified
Chamnanphon GRCh37 Universities and Hospitals in Thailand EAS Samples were obtained from both biobank samples and cases admitted to King Chulalongkorn University and King Chulalongkorn Memorial Hospital, Bangkok, Thailand between February 2020 and March 2021. Clinical characteristics and data were assessed and cases were classified into four conditions: mild, moderate, severe and critical Thailand 212 36 Diabetes, Dyslipidemia, Chronic kidney disease, Cardiovascular disease, lung and liver disease, cancer, immunocompromised system 44 91 males 157 females Combination Nasopharyngeal swab Genomic DNA was analysed with the Axiom™ Human Genotyping SARS-COV-2 Array (Thermo Fisher Scientific) Genotype calling from intensity data file was performed with Axiom Analysis Suite (AxAS) version 5.1.1 software using default parameters. Quality control (QC) and PCA was carried out following Ricopili pipeline (Lam et al 2020). SNPs were pruned to minimize LD between SNPs with the criteria of R2 < 0.2, and the number of SNPs in the window for pruning was 200 until there were fewer than 100,000 SNPs. SAIGE was applied for association analysis in GWAS using a logistic mixed-effects model. LD blocks using LDBlockShow were obtained, and genes residing in the blocks which contained SNPs with statistical significance were acquired using the University of California, Santa Cruz (UCSC) Genome Browser. Genotype imputation was done for chromosomes 1–22 using Michigan Imputation Server and the reference panel used was Genome Asia Pilot (GAsP) The limitation of this study was the relatively small sample size
Chen GRCh37 Vanderbilt University Medical Center biobank EUR AFR European and African American ancestry Not specified 10599 74638 Obesity, COPD, diabetes, CVD, liver and renal disease, asthma, dyslipidemia and hypertension 52.1 EUR 37.2 AFR 51.1% male EUR, 45.9% male AFR NA NA Illumina Expanded Multi-Ethnic Genotyping Array (MEGAEX) was used for genotyping. Replication was verified using two independent datasets from UK Biobank (UKB) and non-overlapping Vanderbilt University biobank (BioUV) data. Top findings in the study were analysed in the COVID-19 Host Genetics Initiative (COVID-19 HGI) meta-analysis summary statistics from the July 2, 2020 release. SNPs with an imputation info score greater than 0.4 and minor allele frequency (MAF) greater than 1% were used for further GWAS and GReX imputation. More stringent cut-offs were used for AFR individuals because of smaller sample size. SUGEN was used to remove known family relatedness and PRIMUS was used to reconstruct non-directional family networks, with ERSA verifying families with more than 5 members. LD patterns analysed on HaploView V. 4.2. Genetic imputation in MEGAEX-genotyped subjects was conducted with minimac4 on the Michigan Imputation Server12 with a reference panel of Haplotype Reference. Phenotypes were extracted from electronic health records, which can affect classification of cases. Pneumonia cases were based on clinical evidence and not lab-based testing
COVID-19-HGI GRCh38 COVID-19 Host Genetics Initiative MID,S/EAS,AFR,AMR,EU (1) Critically ill COVID-19 cases defined as patients who were hospitalized due to symptoms associated with laboratory-confirmed SARS-CoV-2 infection and who required respiratory support or whose cause of death was associated with COVID-19, (2) the hospitalized COVID-19 group included patients who were hospitalized due to symptoms associated with laboratory-confirmed SARS-CoV-2 infection, and (3) reported infection cases group included individuals with laboratory-confirmed SARS-CoV-2 infection o Study dependent 13641 2070709 not specified-study dependent 55.3 years mean not specified-study dependent Severe Nasopharyngeal swab / Whole blood Case-control meta-analyses in three main categories of COVID-19 disease according to predefined and partially overlapping phenotypic criteria. Each individual study that contributed data to a particular analysis met a minimum threshold of 50 cases for statistical robustness. Each contributing study genotyped the samples and performed quality controls, data imputation and analysis independently, but following consortium recommendations. GWAS analysis was run using Scalable and Accurate Implementation of GEneralized mixed model (SAIGE) 51 on chromosomes 1-22 and X or PLINK. Study-specific summary statistics were then processed for meta-analysis. Potential false positives, inflation, and deflation were examined for each submitted GWAS. Standard error values as a function of effective sample size were used to find studies which deviated from the expected trend. Summary statistics passing this manual quality control were included in the meta-analysis. Variants with allele frequency of >0.1% and imputation INFO>0.6 were carried forward from each study. Variants and alleles were lifted over to genome build GRCh38, if needed, and harmonized to gnomAD 3.0 genomes by finding matching variants by strand flipping or switching ordering of alleles. 
If multiple matching v For genotype imputation, participants were advised to use their own reference panel, existing imputation panels, or the TOPMed or Michigan imputation server when possible. Because different studies participated, those enriched with severe cases or with antibody-tested controls may disproportionately contribute to genetic discovery despite potentially smaller sample sizes. The differences in genomic profiling technology, imputation, and sample size across the constituent studies can have dramatic impacts on replication and downstream analyses (particularly fine-mapping, where differential missing patterns in the reported results can muddy the signal).
COVID-19-HGI GRCh38 COVID-19 Host Genetics Initiative MID,S/EAS,AFR,AMR,EU (1) Critically ill COVID-19 cases defined as patients who were hospitalized due to symptoms associated with laboratory-confirmed SARS-CoV-2 infection and who required respiratory support or whose cause of death was associated with COVID-19, (2) the hospitalized COVID-19 group included patients who were hospitalized due to symptoms associated with laboratory-confirmed SARS-CoV-2 infection, and (3) reported infection cases group included individuals with laboratory-confirmed SARS-CoV-2 infection o Study dependent 49562 1770206 not specified- study dependent 55.3 years mean not specified-study dependent Susceptibility Nasopharyngeal swab / Whole blood Case-control meta-analyses in three main categories of COVID-19 disease according to predefined and partially overlapping phenotypic criteria. Each individual study that contributed data to a particular analysis met a minimum threshold of 50 cases for statistical robustness. Each contributing study genotyped the samples and performed quality controls, data imputation and analysis independently, but following consortium recommendations. GWAS analysis was run using Scalable and Accurate Implementation of GEneralized mixed model (SAIGE) 51 on chromosomes 1-22 and X or PLINK. Study-specific summary statistics were then processed for meta-analysis. Potential false positives, inflation, and deflation were examined for each submitted GWAS. Standard error values as a function of effective sample size were used to find studies which deviated from the expected trend. Summary statistics passing this manual quality control were included in the meta-analysis. Variants with allele frequency of >0.1% and imputation INFO>0.6 were carried forward from each study. Variants and alleles were lifted over to genome build GRCh38, if needed, and harmonized to gnomAD 3.0 genomes by finding matching variants by strand flipping or switching ordering of alleles. 
If multiple matching v For genotype imputation, participants were advised to use their own reference panel, existing imputation panels, or the TOPMed or Michigan imputation server when possible. Because different studies participated, those enriched with severe cases or with antibody-tested controls may disproportionately contribute to genetic discovery despite potentially smaller sample sizes. The differences in genomic profiling technology, imputation, and sample size across the constituent studies can have dramatic impacts on replication and downstream analyses (particularly fine-mapping, where differential missing patterns in the reported results can muddy the signal).
David GRCh38 Genetics Of Mortality In Critical Care (GenOMICC) and the International Severe Acute Respiratory Infection Consortium (ISARIC) Coronavirus Clinical Characterisation Consortium (4C) (ISARIC 4C) EUR Cohort participants were critically ill, hospitalized COVID-19 positive patients from 208 UK ICUs: 2109 patients were recruited as part of the GenOMICC project, and an additional 135 cases as part of ISARIC 4C study. Participants of mixed-ancestry were excluded. Ancestry-matched controls without a positive COVID-19 test were obtained from the UK Biobank population study and, for validation, 45,875 unrelated individuals of European ancestry from the 100,000 Genomes Project were used as an alternative United Kingdom 2244 11220 19% in GenOMICC and 30% in ISARIC 57.3 ± 12.1 in GenOMICC, 57.3 ± 2.9 in ISARIC 30% female GenOMICC, 34% female ISARIC Critical Nasopharyngeal swab DNA extraction, genotyping and quality control was previously described in Pairo-Castineira et al 2020 as part of the GenOMICC consortium. TMPRSS2 variants that are predicted to be loss of function, missense or inframe and indel in the database of population genetic variations gnomAD were extracted and evaluated. The rs12329760 variant is predicted to be damaging, with a MAF of 0.25 in the human population, and the relation between the variant and the critical COVID-19 cohort was assessed. The association was re The association between the TMPRSS2 rs12329760 variant and COVID-19 severity was assessed using logistic regression. Logistic regression with additive and recessive models was performed in PLINKv1.9, adjusting for sex, age, mean-centred age-squared, top 10 principal components (PCA performed to adjust for population stratification) and deprivation index decile based on UK postcode. Genetic ancestry was inferred using ADMIXTURE and reference individuals from the 1000 Genomes project. 
Each major ancestry group alternative in the 100,000 Genomes control group was performed with mixed model association tests in SAIGE (v0.39), including age, sex, age-squared, age-sex interaction and the first 20 principal components as covariates. Trans-ethnic meta-analysis of GenOMICC data for different ancestries was performed by METAL using an inverse-variance weighted method and the P-value for heterogeneity was calculated with Cochran’s Q-test for heterogeneity implemented in the same software. Meta-an Imputation was performed using the TOPMed reference panel a. The lack of access to a cohort of asymptomatic/pauci-symptomatic COVID-19 patients meant that the general population was used as the comparison group for severe COVID-19 cases.
Ellinghaus GRCh38 The Severe COVID-19 GWAS group EUR Italian and Spanish patients with severe disease defined by respiratory support and hospitalization Southern Europe 1610 2205 49% hypertension, 18.1% diabetes, 10.4% coronary artery disease 66.7 male 1096 female 514 Severe Nasopharyngeal swab Genotyping by Illumina Global Screening Array PLINK 1.9 logistic-regression for imputation uncertainty. PCA was used for GWA tests with adjustments for population stratification, age and sex. A fixed effects meta-analysis was conducted using METAL. Bayesian fine-mapping performed for loci reaching GW significance. SNP imputation using Michigan Imputation Server and 194,512 haplotypes generated by Trans-Omics for Precision Medicine (TOPMed) Limited genotype-phenotype elaboration and therefore adjustment for potential sources of bias; limited information on SARS-CoV-2 infection status in control group; exclusion of genotype samples based on ethnicity.
Fadista GRCh38 Not specified EUR COVID-19 HGI A2 cohort Not specified 4336 623902 None specified Not specified Not specified Severe Nasopharyngeal swab / Whole blood Mendelian randomization (MR) study for IPF causality in COVID-19. Genetic variants associated with IPF susceptibility from previous GWAS were used as instrumental variables on COVID-19 severity from the GWAS meta-analysis by the COVID-19 HGI Two-sample MR analysis was performed using the random-effects inverse-variance weighted method implemented in the R (version 3.6.1) package MendelianRandomization (version 0.5.0). An allele frequency threshold of 0.001 and an imputation info score threshold of 0.6 were applied to each study before meta-analysis according to COVID-19 HGI protocol 1) Variance explained by the use of non-MUC5B IPF genetic instruments, although within the range typical of complex traits 2) Selection bias may play a role in the protective effect found from rs35705950 as (a) a patient group that is heavily enriched for the rs35705950 T undertaking strict self-isolation and/or (b) due to survival bias of the rs35705950 non-IPF risk allele carriers. 3) Increased sample sizes, both from the IPF or COVID-19 GWAS could also have narrowed the confidence interval
Fallerini GRCh38 GEN-COVID Multicenter Study EUR A subset of male COVID-19 patients was selected from the Italian GEN-COVID cohort of 1,178 SARS-CoV-2-infected participants (Daga et al., 2021). Cases were selected according to the following inclusion criteria: i. male gender; ii. young age (<60 years); iii. endotracheal intubation or CPAP/biPAP ventilation. As controls, participants were selected using the sole criterion of being oligo-asymptomatic not requiring hospitalization. Cases and controls represented the extreme phenotypic presentation Italy 79 77 not specified <60 years Male only Severe TBD A nested case control study was performed on a subset of male participants with extreme COVID-19 phenotypes. A LASSO logistic regression analysis, in which only rare variants were considered in a boolean representation, identified the TLR7 gene as important. By selecting for young males, rare (MAF < 1%) TLR7 missense variants predicted to impact on protein function (CADD scores) were discovered, and were found in none of the SARS-CoV-2-infected oligo-asymptomatic male participants. In order to functio The principal components analysis (PCA) was applied prior to the LASSO logistic regression in order to remove samples that were clear outliers. A 10-fold cross-validation method was applied in order to test the performances of the LASSO logistic regression. To determine the significance of the association between TLR7 variants and COVID severity, the Fisher’s Exact Test was used. p Values < 0.05 were considered statistically significant. NA Not specified
Freitas GRCh38 Universities, Hospitals and Institutes in Portugal EUR COVID-19 lab confirmed positive individuals from two hospitals (Santa Maria and Sao Joao hospital) Portugal 491 0 Hypertension (63.1%), diabetes (31.8%) and obesity (23.4%), Vitamin D deficient (63.3%), Vitamin D insufficient (24.4%), disclosure of other diseases included in supplementary data 69.7 ±15.8 217 female; 266 male Severe Nasopharyngeal swab Several SNPs from previous GWAS studies (European individuals) that play an important biological role in vitamin D metabolism, transport, degradation, and downstream pathways have been identified to affect Vitamin D levels. To understand if an association exists between the polymorphisms in the vitamin D-related genes and the disease severity, four polygenic risk scores (PRSs) were defined. The iPLEX® MassARRAY® system was used to genotype patients. A polygenic risk score was estimated for each individual according to their genotype profiles. Statistical tests (Mann–Whitney, Kruskal–Wallis and Spearman rank correlation coefficient) were used to find differences in genetic variants in vitamin D-related genes between COVID-19 patients with different degrees of disease severity. No imputation performed Only hospitalized cases assessed
Gomez not specified Hospitals, Universities and Institutes in Spain EUR Cohort requiring hospitalisation with 67 patients in need of critical care support, including high-flow oxygen, positive-pressure ventilation (either invasive or non-invasive) or vasoactive drugs. Asturias, Northern Spain 204 536 Hypertension (48%), diabetes (18%), hypercholesterolaemia (34%) 64.77 61% male Severe Nasopharyngeal swab The I/D polymorphism (rs4646994) in intron 16 of the ACE gene was genotyped by PCR followed by agarose gel electrophoresis to visualise the two alleles. For the ACE2 rs2285666 A/G SNP the PCR fragments were digested with the restriction enzyme AluI and electrophoresis on agarose gels. The ACE2 coding exons of 60 male patients (30 severe and 30 nonsevere) were amplified with primers designated from exon flanking introns. These fragments were sequenced with Sanger BigDye chemistry in a capillary A The statistical analysis was performed with the R-project free software (www.rproject.org). The logistic regression (linear generalized model, LGM) was used to compare mean values and frequencies between the groups. NA a. Small sample size and low number of severe female cases limited the statistical interpretation of the significant and non-significant associations, b. Individuals exposed but asymptomatic were not studied
Horowitz GRCh38 AncestryDNA: Ancestry, UK Biobank: UKB, Geisinger Health System: GHS, Penn Medicine BioBank: PMBB EUR, AFR, AMR, SAS Four ancestries (Admixed American, African, European and South Asian) defined as two groups of COVID-19 outcomes: five phenotypes related to disease risk and two phenotypes related to disease severity among COVID-19 cases Unspecified 5461 661632 19% hospitalized, 7% severe disease. Comorbidities included hypertension, Cardiovascular disease, type 2 diabetes, chronic kidney disease, asthma, COPD Ancestry 52.49, UKB 56.3, GHS 58.2, PMBB 55.65 >50% Female Severe Nasopharyngeal swab / Whole blood Ancestry: Illumina genotyping array UKB: Applied Biosystems UK BiLEVE Axiom Array and UK Biobank Axiom Array PMBB: Illumina Global Screening Array GHS: Illumina OmniExpress Exome or Global Screening Array Replicated eight independent associations (r2<0.05) previously reported. Association analyses: genome-wide Firth logistic regression test implemented in REGENIE. Results meta-analyzed across studies and ancestries using an inverse variance-weighted fixed-effects meta-analysis. Ancestry: Haplotype Reference Consortium (HRC) reference panel. Haplotypes: Eagle version 2.4.1 Software: Minimac4 version 1.0.1. UKB: HRC panel, UK10K and 1000 Genomes Project phase 3 panels. PMBB: TOPMed reference panel and TOPMed Imputation Server. GHS: TOPMed reference panel and TOPMed Imputation Server. Most participants from EUR ancestry
Horowitz GRCh38 Ancestry, UKB, GHS, PMBB EUR, AFR, AMR, SAS Four ancestries (Admixed American, African, European and South Asian) defined as two groups of COVID-19 outcomes: five phenotypes related to disease risk and two phenotypes related to disease severity among COVID-19 cases Unspecified 11356 651047 Comorbidities included hypertension, Cardiovascular disease, type 2 diabetes, chronic kidney disease, asthma, COPD. Ancestry (52.49), UKB (56.3), GHS (58.2), PMBB (55.65) >50% Female Susceptibility Nasopharyngeal swab / Whole blood Ancestry: Illumina genotyping array UKB: Applied Biosystems UK BiLEVE Axiom Array and UK Biobank Axiom Array PMBB: Illumina Global Screening Array GHS: Illumina OmniExpress Exome or Global Screening Array Replicated eight independent associations (r2<0.05) previously reported. Association analyses: genome-wide Firth logistic regression test implemented in REGENIE. Results meta-analyzed across studies and ancestries using an inverse variance-weighted fixed-effects meta-analysis. Ancestry: Haplotype Reference Consortium (HRC) reference panel. Haplotypes: Eagle version 2.4.1 Software: Minimac4 version 1.0.1. UKB: HRC panel, UK10K and 1000 Genomes Project phase 3 panels. PMBB: TOPMed reference panel and TOPMed Imputation Server. GHS: TOPMed reference panel and TOPMed Imputation Server. Most participants from EUR ancestry
Horowitz_2 GRCh38 Regeneron EUR, AFR, SAS, EAS, Cases were obtained from four studies (Geisinger Health System (GHS), Penn Medicine BioBank (PMBB), UK Biobank (UKB) and AncestryDNA) from five super population groups and grouped into five case-control comparisons related to the risk of infection and two others related to disease severity among cases with COVID-19. Study dependent 52630 704016 not specified not specified not specified Combination NA Both common (minor allele frequency (MAF) > 0.5%, up to 13 million) and rare (MAF < 0.5%, up to 76 million) variants across the seven risk and severity phenotypes were considered Ancestry-specific GWAS was performed in each study using the genome-wide Firth logistic regression test implemented in REGENIE V2.0.1. Firth’s approach is applied when the P value from the standard logistic regression score test is below 0.05. Directly genotyped variants with an MAF > 1%, <10% missingness, Hardy–Weinberg equilibrium test P > 1 × 10⁻¹⁵ and LD pruning (1,000 variant windows, 100 variant sliding windows and r² < 0.9) were included. Covariates for age, age², sex, age-by-sex and the first 10 ancestry-informative PCs were also included. Results were subsequently meta-analyzed across studies and ancestries using an inverse variance-weighted fixed-effects meta-analysis. AncestryDNA used the Haplotype Reference Consortium (HRC) reference panel and performed imputation with Minimac4 v.1.0.1. GHS and PMBB used the TOPMed reference panel using the TOPMed Imputation Server. 1. Greater power to identify associations with disease risk than with severity outcomes due to relatively small sample size for the latter, 2. Phenotypic heterogeneity among cases with COVID-19 and controls and associated risk factors due to four separate studies with different collection variables. 
AncestryDNA was composed of healthier individuals with milder COVID-19 compared to the UKB, GHS and PMBB studies, which collected data in a clinical setting and so were enriched for more severe cases.
Hu GRCh37 Not specified EUR COVID-19 positive cases including 292 deaths of participants from the UK Biobank (UKB) Northern Europe 1096 0 Not specified Not specified Not specified Severe Nasopharyngeal swab UK Biobank Axiom Array. Imputed SNPs used to perform a GWAS using super variants in statistical genetics to identify potential risk loci contributing to the COVID-19 mortality. Local ranking and aggregation used to identify super variants using a four step method which included two modes of transmission (recessive and dominant). This method was used in a discovery and validation identification. Logistic regression, replicated 10x for stability, was then used to investigate super variant associations with the death outcomes of COVID-19. Cox regression was used to further validate super variants verified in multiple runs. Haplotype Reference Consortium and UK10K and 1000 Genomes reference panels Role of super variants in COVID-19 susceptibility not validated. Comorbidities were not accounted for in the association analysis even though UKB COVID-19 hospitalised patients had comorbidities. Study was restricted by sample size. Environmental influences not factored in. No other ethnicities were included in this study.
Hubacek not specified Institutes and Universities in the Czech Republic EUR 246 symptomatic (without requiring hospitalisation) and 164 asymptomatic COVID-19 positive cases during the first wave (approx. March 2020 – June 2020) of the disease in the Czech Republic. All cases completely recovered with no adverse events. Czech Republic 410 2559 7.8% diabetes, 13.3% hypertensive 44 ± 15 54.7% female Mild TBD DNA was isolated from EDTA-treated blood. The ACE I/D polymorphism was genotyped and PCR products of ~ 490 bp and ~ 200 bp characterise the I and D alleles, respectively. All D/D subjects were re-genotyped with ACE I-specific oligonucleotides to avoid the misgenotyping of some I/D heterozygotes as D/D homozygotes. Statistical analysis was performed using the www.socscistatistics.com tools NA The reported effect of the ACE I/D polymorphism on COVID-19 outcomes varies between papers (Gomez et al 2020 showed the D/D genotype affecting COVID-19 severity outcomes)
Jelinek GRCh37 Universities and hospitals in the UAE SAS, EAS, AFR, EUR, Patients with COVID-19 were recruited from multiple recruitment sites across the UAE. Only patients who tested positive for SARS-CoV-2 by RT-PCR were included. The participants were divided into two groups based on the severity of COVID-19, indicated as noncritical (n = 453) or critical (n = 193). Participants were defined as critical COVID-19 cases if they were admitted to the ICU with the use of oxygen supplementation or mechanical ventilation. Region of origin of participants included Middle Abu Dhabi 193 543 Comorbidities were defined as a Yes/No for previous medical diagnosis of diabetes mellitus, hypertension, cardiac disease, lung disease, liver disease, kidney disease, metabolic disorder, and/or an autoimmune disease 1-85 138 female, 508 male Critical Whole blood Genotyping was performed using the Infinium Global Screening Array (Illumina Incorporation, San Diego, California, USA) QC on the data was performed using the PLINK software (version 1.07) to exclude SNPs with a low minor allele frequency (<0.01), low genotyping rate (<95%), and deviation from Hardy–Weinberg equilibrium at the p < 10⁻⁴ significance level. A total of 240 SNPs in the ABO gene were extracted for the association of this study for candidate gene analyses. Statistical analysis was performed using PLINK software (version 1.9), R software (version 3.4), and SPSS software (version 16.0). Bivariate and multivariate logistic regression analyses were used to estimate OR and p-values of the association between blood type and COVID-19 severity phenotypes. Two candidate gene association tests were conducted that included unadjusted analysis and adjustment on the top ten eigenvectors for population stratification, age, and gender. Significance level adopted for all the analyses was p ≤ 0.05. Genotype data phased and imputed using the Phase 3 1000 Genomes Project panel a. Small sample size, b. 
Selection bias based on presentation to hospital and multiple collection sites, c. Substantial genetic admixture however population stratification was taken into account, d. GWAS array based on Caucasian population used
Kuo Not specified Not specified EUR England Northern and Western Europe 622 322948 APOE e4e4 risk allele Dementia (0.22%);Hypertension (32.42%);Coronary artery disease (8.7%);Type 2 diabetes (5.36%) 68 ± 8 55% (176951) female Severe Nasopharyngeal swab UK Biobank participants, genotyped for ApoE (UK Biobank axiom array) with a positive COVID-19 result were compared to participants who did not test positive over the period 16 March - 26 April 2020. A logistic regression model was used to compare e4e4 genotypes to e3e3 for COVID-19 positivity status, adjusted for sex and age. Genotyping array type and the top five genetic principal components (accounting for possible population admixture). Not specified Letter to the editor with little detail on methods and population specifics. Authors focused on two variants only.
Latini GRCh38 Not specified EUR 131 Italian COVID-19 positive patients. Southern Europe 131 5341 not specified Median age was 63.7 82 males, 49 females TBD Whole blood WES using Twist Human Core Exome Kit and sequenced on the Illumina NovaSeq 6000 platform. Four genes (TMPRSS2, PCSK3, DPP4, and BSG) were analysed and allelic frequency compared to the EUR GnomAD reference population. Illumina BaseSpace pipeline and TGex software used for the variant calling and annotation (30x coverage). Variants were examined for coverage and Qscore using the Integrative Genome Viewer. Protein effect prediction: PolyPhen2, Mutation Taster, SIFT, MetaLR_pred, MetaSVM_pred. not specified Require larger independent cohorts as well as functional studies to evaluate the effect of the detected genetic variant. The number of subjects is too small to stratify them on the basis of clinical characteristics and clinical phenotypes.
Li GRCh37 Universities, Hospitals and the CDC in China EAS Severe or critical COVID-19 cases and mild or moderate control cases were obtained from two cohorts (Huoshenshan and Union hospitals in Wuhan, China). For validation of allele frequencies of the significantly associated SNPs 954 COVID-19 unknown controls of Chinese Ancestry and 2504 Chinese Ancestry individuals from the 1000 genomes project (Phase 3, November 2014) were also included. China 885 546 Hypertension, diabetes, coronary artery diseases, chronic hepatitis B, chronic obstructive pulmonary disease, chronic renal diseases, and cancer (statistics not specified) not specified not specified Severe Nasopharyngeal swab Genomic DNA was extracted from peripheral whole blood using the QIAamp DNA blood kits (Qiagen). Quality of the isolated genomic DNA was verified using two methods: (1) DNA degradation and contamination on 1% agarose gels; and (2) DNA concentration measured using a Qubit DNA Assay Kit and a Qubit 2.0 Fluorometer (Life Technologies, MA, USA). The Affymetrix Axiom® World Arrays was used for genotyping. Genotype callings were performed using Axiom Analysis Suite. SNPs were excluded from further analyses if they were not in chr 1–23 or X, had call rates <90% among all subjects in this study, deviated significantly from HWE among all subjects in this study or had minor allele frequencies (MAF) < 1%. A total of 558,642 SNPs were finally retained. To identify the ancestry outliers, PCA was performed using EIGENSOFT (v3). Autosomal SNPs were used for PCA based on the following criteria: call rate >90%, HW P>0.0001, MAF >10% and LD-pruned r2<0.10. Twenty principal components were estimated for all the cases and controls. PCA analyses was performed on the same SNPs for samples from the 1000 Genomes Project and the COVID-19 unknown controls. GWA tests were performed on SNPTEST software using logistic regression models adjusting for covariates and the top five PCAs. 
eQTL analysis to determine the causative genes were performed using the QTLbase, GTEx v8, Immunopop QTL browser and two independe Imputation was performed using SHAPEIT (v2) and IMPUTE (v4). A prephasing strategy for the Huoshenshan and Union cohorts was performed by SHAPEIT, using the 1K Project data (Phase 3, November, 2014) as the reference based on hg19. IMPUTE4 was used to impute the phased haplotypes constructed by SHAPEIT. For the imputation of chr X, males were coded as diploid in non-pseudoautosomal regions. SNPs with IMPUTE4 info scores below 0.6 or MAF < 0.01 were excluded. A separate independent imputation anal Further functional studies required for the roles of the two loci in COVID-19 pathogenesis
Lu GRCh37 Not specified EUR, SAS, AFR UK Biobank COVID-19 positive participants, 180 with COVID-19 related mortality Not specified 180 1141 Hypertension, Diabetes, Cholesterol Not specified Male 127 Female 53 Critical Nasopharyngeal swab GWAS combined with functional ontology and deleteriousness statistics PLINK 2.0 quality control. GWAS with Holm-Bonferroni multiple hypothesis correction performed. Two gene lists were created using either a UniProt database list or an in-house list. The in-house list was created using Hidden Markov models (HMM) and protein coding to leverage functional gene ontology with the FATHMM program analysing amino acid deleteriousness and phenotype prediction. To increase power of association, the GWAS p-values and the HMM p-values were combined using Fisher's method. Not specified Early study with a small amount of data
Medetalibeyoglu not specified University of Istanbul SAS The patients were included between April–June 2020 that admitted to the COVID-19 center of a single university hospital with a number of criteria used to determine severe vs mild cases. Istanbul, Turkey 284 100 Hypertension 72 (24.5%), Diabetes mellitus 36 (12.7%), Chronic obstructive pulmonary disease 30 (10.6%), Coronary artery disease 17 (6.0%), Congestive heart failure 5 (1.8%), Solid malignancy 26 (9.2%), Hematological malignancy 7 (2.5%) 49 45.1% Female Severe Whole blood DNA was extracted from the collected blood by using the Genemark isolation kit (Genemark, USA). PCR was used to amplify the MBL2 gene and a restriction enzyme BanI was used to identify the codon 54 polymorphism. All of the patients and controls were examined for the codon 54 A/B (gly54asp: rs1800450) variation in exon 1 of the MBL2 gene. IBM SPSS version 21.0 was used for statistical analysis. Multivariate binary logistic regression analyses to find out the association between different genetic variants of the MBL2 gene with study parameters. The results were adjusted for age and sex. Statistical significance was accepted as p < 0.05 for the results of all analyses. NA a. Small population size, statistical inference is limited
Monticelli GRCh38 GEN-COVID Multicentre Study EUR Patients from 40 Italian Hospitals, 16 Continuity Assistance Special Unit (USCA) and 8 Departments of Preventive Medicine led by Professor Alessandra Renieri of the University of Siena in Italy South-Central Europe 1177 0 Diabetes, and/or hypertension, and/or obesity not specified not specified Severe NA Whole-Exome Sequencing (WES) data derived from the GEN-COVID Multicenter Study were analysed. All calculations were performed using R. The odds ratio (OR) function from the Epitools package was used to determine odds ratios. Pymol was used to visualise protein models. PolyPhen-2 was used to determine missense variant effect prediction. NA No controls were used. Participant information not detailed.
Namkoong GRCh37 Japan COVID-19 Task Force EAS Japanese COVID-19 hospitalised patients consisting of 990 critical individuals requiring artificial respiratory support or intensive care, 1391 with non-severe disease and 12 with unknown effect. Severe cases were older and predominantly male. Controls included the general Japanese population Japan 2393 3289 Other than age >65, no comorbidities were described for the case or control populations. 56.0 ± 18.9 64.2% male, 35.8% female Severe Nasopharyngeal swab / Whole blood Illumina Infinium Asian Screening Array used for cases and controls and SNP results used for the GWAS analysis. PCA was applied to remove outliers and non-Japanese individuals. GWAS conducted using logistic regression for each variant using PLINK2 software. Meta-analysis of the Japanese discovery GWAS and the pan-ancestry analysis was conducted using an inverse-variance method assuming a fixed-effects model. Logistic regression and R statistical software used for ABO blood group analysis. Minimac4 software used for imputation. A Japanese population-specific imputation reference panel combined with 1000 Genomes Project Phase used. QC filters of MAF ≥ 0.1% and imputation score >0.5 applied. Where lead variants were obtained by imputation, accuracy was assessed using WGS data. HLA genotype imputation was performed using DEEP*HLA software (version 1.0). The reference panels used to impute the data do not seem to be uploaded to a repository, which would be crucial for other studies looking to include East Asian genetic diversity into panels for populations with high levels of admixture including people of East Asian origin. The article has not yet been peer-reviewed as of entry into this database.
Pairo-Castineira GRCh37 GenOMICC EUR critically ill patients with COVID-19 in the UK population Not specified 1676 8375 28% significant co-morbidity 57.3 ± 12.1 70% male; 30% female Critical Nasopharyngeal swab / Whole blood Genotyping with Illumina Global Screening Array v.3.0. In some cases genotypes and imputed variants were confirmed with Illumina NovaSeq 6000 WGS. Variants were validated using a GWAS of genetic studies with 100 000 genomes and Generation Scotland datasets. DRAGEN pipeline used for variant calling. Variants were genotyped with the GATK GenotypeGVCFs tool v.4.1.8.150 and annotated with bcftools v.1.10.2. PLINK 1.9 was used for quality control and association tests. King 2.1 used to remove duplicate individuals with gcta 1.9 used for PCA. Genetic ancestry was inferred using PCA. TOPMed reference panel with BCFtools 1.9. and QCtools 1.3 used for quality control The critical sample size was small. Thus, replication of results was sought in the HGI hospitalised COVID-19 vs population analysis, with duplicated samples excluded.
Pairo-Castineira GRCh37 COVID-HGI and 23andMe Inc EUR European ancestry hospitalised patients from COVID-19 HGI and those with a broad respiratory phenotype from 23andMe Inc. Not specified 3543 1157272 Not specified not specified not specified Severe Nasopharyngeal swab / Whole blood Replication analysis was performed using the loci in GenOMICC individuals. Replication analyses were performed using HGI build 37, version 2 ( July 2020) B2 (hospitalized patients with COVID-19 versus the population) GWAS and those defined with respiratory phenotypes in 23andMe Inc. Summary statistics were used from the full analysis, including all cohorts and GWAS without UK Biobank, to avoid sample overlap. Meta-analysis of the GenOMICC, HGI and 23andMe datasets was performed using fixed-effect inverse-variance meta-analysis in METAL. Not specified Effect sizes are likely to be greater in the GenOMICC study because the cohort is strongly enriched for immediately life-threatening disease. Further studies needed to narrow down the loci found.
Pehlivan not specified University of Istanbul SAS Patients diagnosed with COVID-19 between April and June 2020 who were admitted to the COVID-19 center of a university hospital, and 100 healthy individuals without any known were included as controls. Healthy controls consisted of individuals who were negative for Sars-CoV-2 antibody (Sars-Cov-2 IgM, IgG) and were negative in two PCR results taken with an interval of 48 hours. Patients included those that were diagnosed with COVID-19 who show false negativity with initial examinations before hos Istanbul, Turkey 70 100 not specified not specified not specified Susceptibility Nasopharyngeal and Clinical characteristics PCR and/or RFLP was used to isolate DNA samples from blood leukocytes at the time of diagnosis. MBL2-rs1800450, NOS3-rs1799983 and NOS3-intron 4 VNTR gene polymorphisms were analyzed. Logistic regression was used to analyse statistical differences analysis between groups. The adjusted odds ratios (ORs) were calculated with a logistic regression model that checked for sex and age and are reported with 95% confidence intervals (CIs). Differences between the patients’ group were compared using the chi-square test and the Fisher exact test when required. A p-value of less than 0.05 was accepted as significant. NA a. Limited patient population, b. No MBL serum levels were recorded
Peloso GRCh37 Universities and Hospitals in the United States EUR, AFR, AMR White, Black (African American) and Hispanic participants were obtained from the VA Million Veteran Program. Participants formed part of four groups 1) COVID-19 positivity as defined by positive COVID-19 test compared with all other MVP participants (POS vs. POP); 2) individuals who were hospitalized for COVID-19 compared with all other MVP participants, including individuals who tested positive for COVID-19 but were not hospitalized (HOS vs. POP); 3) individuals who were hospitalized for COVID-1 United States 19168 492854 none specified 62.3 92.25 Combination Whole blood Genotyping using a customized Affymetrix Axiom Biobank Array. Single variant association was performed between imputed variants and four participant outcomes. Population-specific principal components (PCs) were computed using EIGENSOFT v.6. The harmonized race/ethnicity and genetic ancestry (HARE) approach was used to assign individuals to three mutually exclusive groups: 1) non-Hispanic White (White), 2) non-Hispanic Black (Black), and 3) Hispanic or Latino (Hispanic). Kinship was inferred using KING v.2.0. For each pair of relatives (kinship coefficient ≥0.0884), one individual was excluded, preferentially retaining those who tested positive for SARS-CoV-2. Logistic regression was applied in PLINK v2 adjusting for age, age², sex, age*sex, and 15 population-specific PCs analysed within each of the HARE-assigned groups. COVID-19-HGI summary statistics (Release 5) for the multi-population and the White-only meta-analyses, excluding MVP and 23andMe data, were used for replication. The association between ABO blood type and COVID-19 was assessed using logistic regression adjusted for age and sex in four COVID outcomes. Variants with population-spe Imputation was performed to a hybrid imputation panel comprised of the African Genome Resources panel and 1000 Genomes (p3v5). a. Limited power for genome-wide analyses of COVID-19 severity outcomes, b. Predominance of males in sample cohort
Roberts GRCh37 Ancestry DNA EUR, AMR European ancestry- (65%) Admixed African-European ancestry includes 100% African ancestry-(6%) Admixed Amerindian ancestry also includes 100% Amerindian ancestry-(11%) Other including Admixed East Asian-European ancestry also includes 100% East Asian ancestry- (18%) Northern America 2417 14993 Asthma, Cardiovascular disease, diabetes, hypertension, autoimmune disease, rare health conditions (proportions specified in supplementary data) 56 862 male; 1555 female Susceptibility Nasopharyngeal swab GWAS; Illumina genotyping array Linkage disequilibrium (LD)-pruned using PLINK 1.9 COVID-19 Host Genetics Initiative analysis plan version 1 using PLINK 2.0 for GWAS analysis Haplotype Reference Consortium (HRC) reference panel Haplotypes : Eagle version 2.4.1 Software: Minimac4 (MAF)>0.01 R2>0.30 Participants required to fill out a survey therefore cases that were very severe or fatal would not be included Participants restricted to European ancestry due to small numbers from other ancestral groups No independent replication cohort to determine if results are reproducible Did not assess indels
Roberts GRCh37 Ancestry DNA EUR, AMR European ancestry (65%) Admixed African-European ancestry includes 100% African ancestry (6%) Admixed Amerindian ancestry also includes 100% Amerindian ancestry (11%) Other (18%) including Admixed East Asian-European ancestry also includes 100% East Asian ancestry Northern America 250 1967 Asthma, Cardiovascular disease, diabetes, hypertension, autoimmune disease, rare health conditions (proportions specified in supplementary data) 56 105 male, 145 female Severe Nasopharyngeal swab GWAS (Illumina genotyping array) Linkage disequilibrium (LD)-pruned using PLINK 1.9 COVID-19 Host Genetics Initiative analysis plan version 1 using PLINK 2.0 for GWAS analysis Haplotype Reference Consortium (HRC) reference panel Haplotypes : Eagle version 2.4.1 Software: Minimac4 (MAF)>0.01 R2>0.30 Participants required to fill out a survey therefore cases that were very severe or fatal would not be included Participants restricted to European ancestry due to small numbers from other ancestral groups No independent replication cohort to determine if results are reproducible. Did not assess indels
Sayin GRCh38 Istanbul Training and Research Hospital, Istanbul University not specified 200 patients diagnosed with COVID-19 in pandemic clinics of Istanbul University, Faculty of Medicine, between 1 April and 1 June 2020, as well as 100 volunteers without known comorbidities. Gene polymorphism distribution and statistical significance were examined in the following patient groups: (1) severe/mild infection, (2) exitus/ alive during the 28-day follow-up, (3) presence of the need for intensive care/being only inpatient. Istanbul, Turkey 200 100 26% hypertension 49 median age 87 female; 113 male Severe Nasopharyngeal swab DNA isolation from peripheral blood leukocytes of COVID-19 patients and healthy controls was performed using the saline precipitation method. PER3 gene polymorphism genotypes were analyzed by the PCR method. IBM SPSS version 21.0 (IBM Corp. USA) for all statistical analyses. P<0.05 was considered statistically significant. NA When divided into clinical subgroups, the statistical analysis may have been biased due to the decreased number of patients, and no significant results were obtained in terms of the 5R/5R genotype.
Shelton GRCh37 23andMe COVID-19 team EUR, AFR, AMR COVID-19 positive individuals from the 23andMe study consisting of 80.3% EUR, 11.3% Latino and 2.7% African American. Control size depends on phenotype analysed. 93.2% United States, 2.4% United Kingdom and 4.4% various countries around the world 12972 101268 COVID-19 positive test: Type 2 Diabetes- 5.2%, Fatty liver disease-4.8%, Obesity-37.1%, Hypertension-24.3%. COVID-19 positive test + hospitalisation: Type 2 Diabetes-13.3%, Fatty liver disease-9.7%, Obesity-52.6%, Hypertension-42.8%. 51 63% female, 37% male Susceptibility Nasopharyngeal swab Samples genotyped on either the Illumina HumanHap550 BeadChip, Illumina OmniExpress BeadChip or Illumina Global Screening Array each containing customized SNPs or array. GWAS analysis performed on each phenotype and population group. One susceptibility phenotype for COVID-19 positive vs negative participants and 4 severity phenotypes were analysed (pneumonia, hospitalisation, respiratory support with supplemental oxygen and/or ventilation). Case control analysis used logistic regression and P values computed using the likelihood ratio tests. GWAS was performed on each phenotype and population cohort separately. Trans-ancestry meta-analysis performed with a fixed effects model (inverse variance method). Haplotype Reference Consortium panel, augmented by the Phase 3 1000 Genomes Project panel for variants not present in the Haplotype Reference Consortium. Self-reported data from an existing group which is not reflective of the general population Scarcity of testing could have obscured the true picture of SARS-CoV-2 infections, misclassification of true infections (bias)
Shelton GRCh37 23andMe COVID-19 team EUR, AFR, AMR COVID-19 positive individuals from the 23andMe study consisting of 80.3% EUR, 11.3% Latino and 2.7% African American. 93.2% United States, 2.4% United Kingdom and 4.4% various countries around the world 2083 797180 COVID-19 positive test: Type 2 Diabetes- 5.2%, Fatty liver disease-4.8%, Obesity-37.1%, Hypertension-24.3%. COVID-19 positive test + hospitalisation: Type 2 Diabetes-13.3%, Fatty liver disease-9.7%, Obesity-52.6%, Hypertension-42.8%. 51 63% female, 37% male Severe Nasopharyngeal swab Samples genotyped on either the Illumina HumanHap550 BeadChip, Illumina OmniExpress BeadChip or Illumina Global Screening Array each containing customised SNPs or array. GWAS analysis performed on each phenotype and population group. One susceptibility phenotype for COVID-19 positive vs negative participants and 4 severity phenotypes were analysed (pneumonia, hospitalisation, respiratory support with supplemental oxygen and/or ventilation). Case control analysis used logistic regression and P values computed using the likelihood ratio tests. GWAS was performed on each phenotype and population cohort separately. Trans-ancestry meta-analysis performed with a fixed effects model (inverse variance method). Haplotype Reference Consortium panel, augmented by the Phase 3 1000 Genomes Project panel for variants not present in the Haplotype Reference Consortium. Self-reported data from an existing group which is not reflective of the general population Scarcity of testing could have obscured the true picture of SARS-CoV-2 infections, misclassification of true infections (bias)
Speletas not specified Universities and hospitals in Greece EUR, EAS Participants were enrolled in the study from March to October 2020. The ethnicity of patients included 152 Greeks, 62 Turks, 16 Ukrainians, 15 Indonesians, 6 Uzbeks, 3 Moldovans, 3 Americans, 2 Cubans and 1 patient from each of the following countries: Albania, Belarus, Bulgaria, Germany, and Kyrgyzstan. Most patients of non-Greek origin were passengers and crew members on a cruise ferry who became infected with COVID-19 in March 2020. Greece 264 none 35.6% with one or more comorbidities including obesity, hypertension, chronic heart disease, chronic respiratory disease, dyslipidemia, diabetes, hypothyroidism, malignancies, liver/renal/hematological disease 42.8 ± 18.4 years 180 male; 84 female Severe TBD DNA was extracted from peripheral blood, with the detection of MBL2 genetic alterations performed by allele-specific polymerase chain reaction, followed by restriction fragment length polymorphism (PCR-RFLP) analysis Multivariate analysis was performed in the form of binary logistic regression for all parameters, with a statistical significance of p > 0.2 in univariate analysis. For all the analyses, a 5% significance level was set. Analysis was carried out with SPSS (version 25.0). NA a. No association of the known MBL genetic deficiency to COVID-19 severity but only of the B allele (rs1800450) itself. This could indicate that the results may be affected by the small sample size, diverse composition of the cohort analysed, confounding factors such as smoking, diet and oxidative stress levels that affect MBL serum levels
Upadhyai GRCh38 University Institutes in India EUR European individuals from the AncestryDNA COVID-19 host genetic study European ancestry 1492 197 Not specified but according to AncestryDNA data (Roberts et al., 2020) not specified not specified Severe NA GWAS was performed using the original AncestryDNA COVID-19 genotyping dataset (EGA Accession no. EGAD00010002012) with 675,370 SNVs to identify genetic variants that show significant frequency variation between asymptomatic versus severely infected COVID-19 patients. PCA analysis was performed using PLINK v1.9 with only individuals of European ancestry being used for further analysis. For QC, data with high levels of missingness (>20%) were filtered out using PLINK v1.9 with a MAF threshold of 0.01. Standard case-control-based association analyses were performed in PLINK v1.9 with a multiple-testing corrected p-value < 0.001 being considered significant. Significant SNVs (multiple-testing corrected p-value <0.001) were annotated using SNPnexus web-based server for GRCh38. not specified a. Limited availability of data from COVID-19 patients, b. For the SARS-CoV-2 infected cohort, patients were categorized into various COVID-19 outcomes based on a self-reported questionnaire which may have affected the likelihood of miscategorization, c. the unavailability of genomic data for COVID-19 patients in the ICU or those who may have succumbed might cause some discrepancies in the final outcomes of the study and, d. GWAS findings may not be exclusively ascribed to SARS-CoV-2 infections
van Blokland GRCh38 Various research groups and Universities EUR, AMR Individuals from the Generation Scotland, Helix cohort (Helix DNA Discovery project) in the United States, Lifelines COVID-19 cohort (Lifelines population cohort and the Lifelines NEXT birth cohort in the Northern part of the Netherlands) and the Netherlands twin register (NTR) D1-Predicted self reported COVID-19, B2-Hospitalised COVID-19, C1-covid vs. lab/self-reported negative, C2- covid vs. population Northern, Western Europe and USA 1865 29174 Not specified 47.5 26% male Severe Nasopharyngeal swab / Whole blood Menni COVID-19 prediction model to find COVID-19 cases. GWAS performed on predicted COVID-19 case-controls. Top 20 SNPs from COVID-19 HGI D1 cohort were replicated in the C1, C2 and B1 analysis to compare predicted COVID-19 and other cohorts. Data generated with various methods from the four cohorts as follows: HumanCytoSNP Infinium (Global Screening Assay (GSA)/GSA MultiEthnic Disease Version); Perlegen-Affymetrix; Affymetrix (6.0/Axiom); Illumina (Human Quad Bead 660/Omni 1M GSA/OmniExpres Cohort data in SAIGE format was processed in WDL workflows made available at https://github.com/covid19-hg/META_ANALYSIS. Inverse variance weighting of effects was used to account for strand-differences and allele flips in individual studies. All build 37 statistics were upgraded to 38 build and allele harmonization was performed using gnomAD 3.0 genomes before beginning the meta-analysis. Generation Scotland- phasing using Shapeit v2.r873 and duohmm and imputation using the HRC.r1-1 panel. Helix-1000 Genomes Phase 3 data for imputation. Lifelines-Haplotype Reference Consortium (HRC) panel v1.1 at the Sanger imputation server. NTR- data was phased using Eagle and then imputed to 1000 Genomes and Topmed using Minimac 1. Predictive COVID-19 training data might not be fully representative of the whole spectrum of COVID-19, 2. 
Symptoms may overlap with other diseases, predicted cases may be falsely identified 3. Prevalence of COVID-19 might be different among different populations and cohorts
van der Made GRCh37 Radboud University Medical Center, Nijmegen, Netherlands EUR, AFR 2 pairs of brothers with no previous chronic disease presented in hospital with COVID-19 associated respiratory insufficiency. Pair 1 -Caucasian ancestry, required ICU and mechanical ventilation with one death. Pair 2- African ancestry requiring ICU and mechanical ventilation. North Western Europe 4 0 none 26 years median age Male Critical Nasopharyngeal swab Rapid whole-exome sequencing was performed. DNA samples were processed using the Human Core Exome Kit and extended RefSeq targets (Twist Biosciences). Libraries were prepared according to the manufacturers’ protocols. All DNA samples were sheared using a Covaris R230 ultrasonicator (Covaris), subsequently followed by 2 × 150-base pair paired-end sequencing on a Novaseq 6000 instrument (Illumina). Downstream processing was performed using an automated data analysis pipeline that included Burrows-Wheeler Aligner mapping, Genome Analysis Toolkit variant calling, and custom-made annotation. Exome analysis of the affected brothers' families was also performed to check segregation of all rare filtered variants in the respective index patients. - The case series precludes drawing conclusions regarding causality between the rare loss-of-function TLR7 variants and the pathogenesis of severe COVID-19. The functional experiments with IFN-γ measurements lacked statistical significance, possibly due to the limited number of replications and controls included in this study.
van Moorsel GRCh38 Universities and hospital divisions in Netherlands, UK and Denmark EUR The discovery cohort from the ILD biobank and data registry of the St Antonius Hospital Nieuwegein, the Netherlands, included adult patients hospitalized due to COVID-19 at St Antonius Hospital between March 19, 2020 and May 5, 2020. 83 participants designated as White and 25 as non-White Netherlands 108 611 7% Diabetes, 15% Asthma/COPD, 1% Interstitial lung disease, 1% Pulmonary hypertension 66 69% male Severe Nasopharyngeal and Clinical characteristics DNA was extracted using a Chemagic 360 from whole blood and samples were genotyped for MUC5B rs35705950 with a pre-designed taqman SNP genotyping assay and the QuantStudio R 5 Real-Time PCR system. Validation cohorts included 436 UKB cases and 356799 UKB controls. For replication, summary data from the severe COVID-19 GWAS group was obtained. This included 835 cases and 1255 controls from Italy and 775 cases and 950 controls from Spain. Genotype counts for SNP rs35705950 were obtained from the r SPSS 24 was used for statistical analysis. Due to ethnic differences in the prevalence of the MUC5B rs35705950 alleles, genetic analyses were stratified by ethnicity and only statistically analyzed in white subjects. Differences between the allele and genotype frequencies were calculated with the Pearson’s goodness-of-fit chi-square test, together with the OR and 95% CI. Binary logistic regression was used to test for MUC5B rs35705950 association and COVID-19 with age and sex as confounding variables. A value of p < 0.05 was considered statistically significant. Metaanalyses were performed using the allele contrast and dominant model in the web tool META-Genyo. The fixed-effect estimate method, inverse variance was used. Genetic data from the “v3” release of UKBB was used which contained the full set of Haplotype Reference Consortium (HRC) and 1000 Genomes imputed variants. 
For the Italian cohort imputation was performed via TOPMed reference panel a. A limitation of the study is the focus on white European populations. Minor allele frequencies for MUC5B rs35705950 are known to differ between populations, b. Small sample size of the Dutch cohort, yielding a significant result but with a wide confidence interval.
Vietzen not specified Center for Virology, Medical University of Vienna and the Department of Medicine IV, Kaiser Franz Josef Hospital, Vienna, Austria EUR SARS-COV-2 positive cases obtained from the Center of Virology, Medical University of Vienna between the 17 February and 17 April 2020. A total of 92/361 (25.5%) patients showed only minor symptoms and stayed in home quarantine (“nonhospitalized”), 190/361 (52.6%) patients were hospitalized with severe COVID-19 symptoms but never required intensive care (“hospitalized non-ICU”), and 79/361 (21.9%) patients were severely affected and needed intensive care. Austria 361 260 Obesity, hypertension, COPD and CAD 69 years (median) 45% female Severe Nasopharyngeal swab DNA extraction was performed using the NucliSens EasyMag extractor (BioMérieux). DNA was eluted in 50 µl of nuclease-free H2O. HLA-E*0101/0103 genotypes were determined by a Taqman assay and KLRC2wt/del variants were determined by touchdown PCR. As internal controls, genomic DNA obtained from the HeLa, HEK-293T, and K562 (all ATCC, Manassas, VA, USA) were used. Randomly chosen amplicons from all KLRC2 and HLA-E variants were routinely selected, sequenced on a 3130 genetic analyzer (Applied Biosy The distribution of the patient’s gender, comorbidities, and genetic variants was compared by χ² test. Patient age was assessed by ANOVA and Dunn post test. Correlation of the genetic variants and comorbidities was assessed using χ² test. For multivariable analysis, a general main effects loglinear model with genetic variables, gender, and age groups (<60, 60–70, 70–80, >80 years) was used to identify combined genetic variables associated with the risk for severe SARS-CoV-2 infections, who were hospitalized or hospitalized in an ICU. P values <0.05 were considered significant. Statistical analyses were performed using IBM SPSS Statistics 24. NA not specified
Wang GRCh38 Universities, Hospitals and Institutes in China EAS COVID-19 hospitalised patients from Shenzhen Third People’s Hospital, China. Of the recruited patients, 25 (7.5%), 12 (3.6%), 225 (67.8%), 53 (16.0%), and 17 (5.1%) patients were defined as asymptomatic, mild, moderate, severe, and critically ill, respectively. Chinese 332 966 >50% of patients with one comorbidity, not specified. Severe COVID category had 58.8% of patients with a comorbidity vs. mild at 45.1% not specified 135 male; 149 female Severe Nasopharyngeal swab "Deep whole genome sequencing (46x) on the DNBSEQ platform was used to maximise statistical power given the small sample size. Loss of function, rare and common variants were analysed. Loss of function and rare variants were assessed in related individuals and common variants were also analysed in the study cohort. Both single variant and gene-based GWAS were performed. Joint-calling of the genetic variants of the unrelated COVID-19 patients (n = 284) and the publicly available Chinese genome "Variation detection and genotyping were performed using the GATK joint genotyping framework. Sentieon Genomics software was used to perform genome alignment and variant detection. The analysis pipeline was built according to the Broad Institute best practices workflows with variant calibration and filtration using GATK and variant prediction with Variant Effect Predictor software. PLINK and KING were used for kinship analysis with the Genesis R package for PCA used for genotype–phenotype association tests using the default parameters. Genome-wide significance was set at 5e–8 for the single variant association test, suggestive significance at 1e–5, and significance for the gene-based association test at 1e–6." not specified a. Sample size of 332 is only just sufficient to identify genome-wide significant genetic variants with MAF greater than 0.2 and odds ratio greater than 1.8 given type I error rate 0.05. b. 
Patients recruited from hospital provided limited information on asymptomatic individuals in comparison to the severe cases.
Wulandari not specified Universities based in Indonesia and the UK EAS Patients with moderate and severe COVID-19 (n = 62, 65.3%) were hospitalised in Dr Soetomo General Academic Hospital, Surabaya, Indonesia, whilst 33 patients (34.7%) with asymptomatic or mild symptoms were treated in Indrapura KOGABWILHAN II Hospital, Surabaya, Indonesia. Indonesia 95 none Diabetes (n=21), CVD (25), Liver disease (13), Kidney disease (5), Lung disease (3) 44.7 +/- 1.3 60 male 35 female Susceptibility Nasopharyngeal swab DNA extraction was performed using the QIAamp® Blood DNA Midi kit and DNA concentrations were determined using a microvolume spectrophotometer. The TMPRSS2 polymorphism was detected using a TaqMan SNP genotyping assay. Genotyping was performed by RT-PCR with VIC and FAM fluorescent reporters to indicate allelic discrimination. Statistical analyses were performed using the IBM SPSS Statistics Software ver. 23 (IBM Corp.) or GraphPad Prism ver. 8 (GraphPad Software, LLC). A chi-squared test was used to examine Hardy–Weinberg equilibrium and to determine the association between categorical variables in the cross-tabulation data. ANOVA with post hoc multiple comparisons was used to analyse numerical data. A P value less than 0.05 was considered to be statistically significant. NA a. Small sample size with larger studies required for validation, b. A Ct value was used to determine viral load which can only provide an estimate, and c. The effect of the variant on protein function needs to be assessed.
Zhang GRCh37 COVID Human Genetic Effort (HGE) not specified Various nationalities from Asia, Europe, Latin America, and the Middle East. Subjects enrolled in clinical trials across France, French Guiana and Italy. Patients with life-threatening COVID-19 pneumonia requiring ICU admission (Death in 13.9% associated with COVID-19). Controls were asymptomatic or developed mild disease. Not specified 659 534 Not specified 51.8 ± 15.9 25.5% female; 74.5% male Critical Nasopharyngeal swab / Whole blood Whole exome and whole genome sequencing of subjects and controls using Illumina NovaSeq6000 system. Variants analysed from 13 gene loci known to affect type I IFN pathways. Predicted loss of function (LOF) mutations further assessed in vitro for expression and functionality. GATK was used to analyse WES data. Read alignment analysed with Burrows–Wheeler Aligner software and Picard for QC. Variants curated using Integrative Genomics Viewer (IGV) and confirmed to affect the main functional protein isoform. HMZDelFinder and CANOES algorithms used to detect deletions. Logistic regression with the likelihood ratio test used to compare cases and controls with loss of function variants with PCA used to account for ethnic heterogeneity in PLINK 1.9 software. NA Variants were classified according to genotype. Of the 24 LOF variants indicated, supplementary data provided RSID numbers for 6 variants only with the HGVS allocations missing for others.
Zhu GRCh38 Universities, Hospitals and Institutes in China EAS COVID-19 respiratory disease and hospitalization cases (Wuhan Union Hospital) between January 15 and April 4, 2020 Chinese 466 0 288 individuals with at least one comorbidity including hypertension (N = 180, 38.63%), diabetes (N = 95, 20.38%), and coronary heart disease (N = 63, 13.52%) 23-97 years (20–39 (8.5%), 40–59 (31.1%), 60–79 (51.1%), and 80–99 (9.2%)) 237 female; 229 male Severe TBD Samples sequenced for genotyping using the DNBSEQ platform at a mean sequencing depth of 17.8x. GWAS analysis was performed for all laboratory traits to discover significant associations. The COVID-19 HGI round 5 meta-analysis results were used to study susceptibility and severity. One- and two-sample Mendelian randomization analyses were performed to determine causal effects between laboratory traits and disease status. Gene set enrichment analysis was also performed. For genotyping, (i) samples with a call rate of <0.99, (ii) closely related individuals identified by identity-by-descent (IBD >0.1) calculated in KING, and (iii) outliers identified by principal component analysis based on three-sigma rules were excluded from further analysis. Standard quality control criteria for genetic variants were applied by removing those with a SNP call rate <0.99, minor allele frequency (MAF) < 0.01, and Hardy-Weinberg equilibrium p value < 1E-06. PLINK v2.0 was used to perform single-variant GWAS analyses using a linear regression model for the quantitative laboratory features under the assumption of additive allelic effects of the SNP dosage. Age, sex, and the top six principal components (PCs) of genetic ancestry were normalized and the resulting residuals applied using a Z-score normal transformation. The number of PCs was chosen by using EIGENSTRAT software. 
A genome-wide significance threshold of 5E-08 and a study-wide significance threshold of 6.41E-10 (=5E-08/78 Imputation was performed with Beagle v4.0, taking genotype likelihoods (GL) as input and using the EAS population of the 1000 Genomes Project (1KGP) as the reference panel. 1. Small sample sizes and small genetic effect sizes resulted in no genome-wide signals associated with severe status, 2. Only one valid SNP was associated with two genetic traits that cause disease, even though these traits are known to be polygenic, 3. The genetic mechanisms that mediate COVID-19 traits require deeper investigation and were only briefly explored in this study, and 4. Further study into transcriptome- and proteome-wide association should be included to uncover functiona