
Table 5 Code mapping evaluation of GPT models, comparing codes as extracted vs mapped by the LLM

From: Towards automated phenotype definition extraction using large language models

| Model | Metric | Average % | Minimum % | Maximum % |
| --- | --- | --- | --- | --- |
| GPT 4 | Extracted codes overlap | 50.94 | 20.00 | 89.00 |
| GPT 4 | Mapped codes overlap | 72.89 | 28.98 | 97.00 |
| GPT 3.5 | Extracted codes overlap | 27.51 | 10.00 | 85.20 |
| GPT 3.5 | Mapped codes overlap | 58.15 | 19.87 | 62.20 |
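The table reports overlap percentages without defining the metric. As a rough illustration only, the sketch below assumes that overlap means the share of codes in a reference phenotype definition that also appear in the LLM's code set, and that the average, minimum, and maximum are taken over per-definition values. The function names `overlap_percent` and `summarize`, the case and whitespace normalization, and the sample codes are all hypothetical and do not come from the paper.

```python
def overlap_percent(reference_codes, llm_codes):
    """Percentage of reference codes that also appear in the LLM's code set.

    Assumed definition for illustration; the paper may compute overlap differently.
    """
    # Normalize case and surrounding whitespace so "e11.9 " matches "E11.9".
    reference = {code.strip().upper() for code in reference_codes}
    produced = {code.strip().upper() for code in llm_codes}
    if not reference:
        return 0.0
    return 100.0 * len(reference & produced) / len(reference)


def summarize(per_definition_overlaps):
    """Average, minimum, and maximum over per-definition overlap percentages."""
    values = list(per_definition_overlaps)
    return {
        "average": sum(values) / len(values),
        "minimum": min(values),
        "maximum": max(values),
    }


# Made-up example: two phenotype definitions with illustrative code lists.
scores = [
    overlap_percent(["E11.9", "I10", "J45"], ["e11.9", "I10"]),
    overlap_percent(["I10", "I11"], ["I10", "I11", "I12"]),
]
summary = summarize(scores)
```

Under this assumed definition, extra codes produced by the LLM do not lower the score. A symmetric measure such as Jaccard similarity would penalize them and would give different numbers.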