SUPR
HTR model for Historical data
Dnr:

NAISS 2024/22-1420

Type:

NAISS Small Compute

Principal Investigator:

Dana Dannélls

Affiliation:

Göteborgs universitet

Start Date:

2024-11-05

End Date:

2025-11-01

Primary Classification:

10208: Language Technology (Computational Linguistics)

Webpage:

Allocation

Abstract

Handwritten Text Recognition (HTR) processes images of entire lines or words. It requires a dedicated model to decode data that vary widely with handwriting style, the year a text was recorded, and document type (Aguilar & Jolivet, 2023). This project addresses the following objectives:
- Apply HTR to extract handwritten text by training a model to learn handwriting styles, layouts and related information.
- Output the recognized text together with its positional information (the row and column in which the text is located in the form), i.e., in a machine-readable format that mirrors the layout of the original form.
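
As an illustration of the second objective, the sketch below shows one possible way to place recognized text segments into the rows and columns of a form. It is not the project's method. The `Segment` type, the `group_rows` and `to_grid` helpers, the pixel coordinates, the `row_tol` tolerance and the `column_edges` boundaries are all hypothetical. The sketch assumes an HTR step has already produced a text string and a bounding-box centre for each segment.

```python
import bisect
from dataclasses import dataclass


@dataclass
class Segment:
    """One recognized piece of text (hypothetical structure)."""
    text: str
    x: float  # horizontal centre of the bounding box, in pixels
    y: float  # vertical centre of the bounding box, in pixels


def group_rows(segments, row_tol):
    """Group segments into rows: a segment joins the current row when its
    vertical centre lies within row_tol of the previous segment's centre."""
    rows = []
    for seg in sorted(segments, key=lambda s: s.y):
        if rows and abs(seg.y - rows[-1][-1].y) <= row_tol:
            rows[-1].append(seg)
        else:
            rows.append([seg])
    return rows


def to_grid(segments, column_edges, row_tol=10.0):
    """Return a list of rows, each a list of cell strings.

    column_edges is a sorted list of x positions separating the form's
    columns (assumed to be known, e.g. from the form template).
    """
    n_cols = len(column_edges) + 1
    grid = []
    for row in group_rows(segments, row_tol):
        cells = [[] for _ in range(n_cols)]
        for seg in sorted(row, key=lambda s: s.x):
            col = bisect.bisect_right(column_edges, seg.x)
            cells[col].append(seg.text)
        grid.append([" ".join(cell) for cell in cells])
    return grid


# Made-up example data: two form rows, two columns split at x = 200.
example = [
    Segment("Anna", 50, 102),
    Segment("1875", 300, 100),
    Segment("Karl", 55, 140),
    Segment("Berg", 120, 141),
    Segment("1880", 310, 143),
]
for row in to_grid(example, column_edges=[200]):
    print(row)
```

Grouping by a fixed vertical tolerance is the simplest choice and would likely fail on skewed or densely written pages; a layout model that detects table cells directly would be a more robust alternative.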