Fig. 3 | Genomics & Informatics

Fig. 3

From: A prediction of mutations in infectious viruses using artificial intelligence

Data composition for accurate mutation prediction in the receptor-binding motif (RBM) of SARS-CoV-2. A Model performance when trained on a dataset created using a random state from clades collected from December 23, 2019, to May 22, 2023. B Pre-training for accurate mutation prediction using the pandemic wave periods, the mutations found in the RBM, and the regional information of the collected samples. Models were trained to predict wave 3 from wild-type waves 1 and 2; wave 4 from wild-type waves 1, 2, and 3; and wave 4 (Delta) from wild-type waves 1 and 2. C Frequency of mutation locations that may occur in the next wave, as predicted by the XGBoost model (waves 1, 2 → wave 3: ①; waves 1, 2, 3 → wave 4: ②; waves 1, 2 → wave 4: ③). D For current Omicron variants, including wave 5, mutations in specific RBM regions become fixed. Red indicates a higher frequency of mutations at that position, while white indicates a lower frequency (23H: n = 5830, 23I: n = 7452, 24A: n = 86,318, 24B: n = 9922)
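
The wave-based data composition in panel B and the per-position frequencies shown in panels C and D can be illustrated with a minimal sketch. It is not the authors' pipeline and does not reproduce the XGBoost model: the function names `split_by_wave` and `mutation_frequency`, the record layout, and the toy sequences are all hypothetical, and the reference string is made up rather than a real RBM sequence. The sketch only shows the idea of training on earlier waves, holding out a later wave, and counting how often each position differs from the reference.

```python
from collections import Counter


def split_by_wave(samples, train_waves, test_wave):
    """Temporal split: earlier waves for training, a later wave held out."""
    train = [s for s in samples if s["wave"] in train_waves]
    test = [s for s in samples if s["wave"] == test_wave]
    return train, test


def mutation_frequency(sequences, reference):
    """Fraction of sequences that differ from the reference at each position."""
    counts = Counter()
    for seq in sequences:
        for pos, (ref_aa, aa) in enumerate(zip(reference, seq)):
            if aa != ref_aa:
                counts[pos] += 1
    n = len(sequences)
    return [counts[pos] / n if n else 0.0 for pos in range(len(reference))]


# Hypothetical toy data: a made-up 8-residue reference and five samples.
REFERENCE = "NSNNLDSK"
SAMPLES = [
    {"wave": 1, "seq": "NSNNLDSK"},
    {"wave": 2, "seq": "NSKNLDSK"},
    {"wave": 2, "seq": "NSNNLDSK"},
    {"wave": 3, "seq": "NSKNRDSK"},
    {"wave": 3, "seq": "NSKNLDSK"},
]

# Corresponds to the "waves 1, 2 -> wave 3" setting in the caption.
train, test = split_by_wave(SAMPLES, train_waves={1, 2}, test_wave=3)
train_freq = mutation_frequency([s["seq"] for s in train], REFERENCE)
test_freq = mutation_frequency([s["seq"] for s in test], REFERENCE)
```

Splitting by wave rather than at random keeps later-wave sequences out of the training set, so the held-out wave behaves like a genuinely unseen future period. The resulting frequency lists are the kind of per-position values a heatmap such as panel D would colour from white (low) to red (high).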
