Feature | UK Biobank | NPBBD-Korea | PRECISE | BBJ | All of Us |
---|---|---|---|---|---|
Variant calling | GATK; DRAGEN (FPGA-accelerated) | GATK | GATK | GATK | DRAGEN; GATK; DeepVariant (deep learning-based precision) |
Multi-sample VCF | GATK (GenotypeGVCFs); DRAGEN (Iterative gVCF Genotyper, for scalability); GraphTyper | GATK (GenotypeGVCFs) | GATK (GenotypeGVCFs) | GATK (GenotypeGVCFs); GraphTyper | Genomic Variant Store (GATK-based); GLnexus |
Data representation & storage | BAM/CRAM; sparse VCF | BAM; gVCF | BAM/CRAM; sparse VCF | BAM/CRAM; dense VCF | BAM/CRAM; sparse VCF (Hail matrix, VDS) |
Computing environment | Cloud-based RAP with DNAnexus and AWS | KISTI National Supercomputing Center (https://www.ksc.re.kr/eng/index/main) | RAPTOR (Research Assets Provisioning and Tracking Online Repository) | Local HPC for server-based analysis | Cloud-based workbench (Google Cloud Platform for large-scale analysis) |
Data management system | “Category-field”-based data structure | DRC and RDR-CDR system | - | - | GIMS |
Data access system | Tier system, paid for all tiers | Tier system, free for all tiers | Tier system, free for all tiers | Tier system, free for all tiers | Tier system, free for all tiers |