SUPR
New Gold Standard Dataset resource for Swedish NER
Dnr: NAISS 2023/5-424
Type: NAISS Medium Compute
Principal Investigator: Dana Dannélls
Affiliation: Göteborgs universitet
Start Date: 2023-10-30
End Date: 2024-11-01
Primary Classification: 10208: Language Technology (Computational Linguistics)

Abstract

We aim to study how various data augmentation methods, individually and in combination, perform on Swedish. The experiments will use the Swe-NERC Version 1 dataset. If the results are successful, we will create an updated gold standard dataset resource for Språkbanken Text, Swe-NERC Version 2. The dataset's primary use is training machine learning models for named entity recognition.