Digitizing a national longitudinal collection of student essays
Dnr: NAISS 2026/4-110
Type: NAISS Small
Principal Investigator: Nils Kirsten
Affiliation: Uppsala universitet
Start Date: 2026-01-22
End Date: 2026-04-01
Primary Classification: 50301: Pedagogy

Abstract

A national student essay archive holds a unique, nationally representative dataset of student writing in Sweden, spanning more than 20 consecutive years. However, the dataset has not been usable for quantitative analysis, because transcribing and assessing a sufficiently large sample of essays with adequate reliability requires extensive work. The goal of this pilot project is to investigate whether the essays can feasibly be digitized using NLP techniques or machine learning models. The project will train and test a model for handwritten text recognition (HTR), adapted to texts handwritten by Swedish 15-year-old students. The results will make it possible to estimate the reliability and efficiency of digitizing a large-scale corpus of nationally representative student essays.