Benchmark for Evaluation of Danish Clinical Word Embeddings

Martin Sundahl Laursen*, Jannik Skyttegaard Pedersen*, Pernille Vinholt, Rasmus Søgaard Hansen, Thiusius R. Savarimuthu

*Corresponding author for this work

Research output: Contribution to journal › Journal article › Research › peer-review


Abstract

In natural language processing, benchmarks are used to track progress and identify useful models. Currently, no benchmark for Danish clinical word embeddings exists. This paper describes the development of such a benchmark. It consists of ten datasets: eight intrinsic and two extrinsic. We also use the benchmark to evaluate word embeddings trained on text from the clinical domain, the general practitioner domain, and the general domain. All the intrinsic tasks of the benchmark are publicly available.
Translated title of the contribution: Benchmark til Evaluering af Danske Kliniske Ordvektorer
Original language: English
Journal: Northern European Journal of Language Technology
Volume: 9
Issue number: 1
Number of pages: 15
ISSN: 2000-1533
Publication status: Published - 21 Feb 2023
