SUPR
Efficient text modelling and recognition at scale
Dnr: NAISS 2023/22-1160
Type: NAISS Small Compute
Principal Investigator: Ekta Vats
Affiliation: Uppsala universitet
Start Date: 2023-11-01
End Date: 2024-11-01
Primary Classification: 10207: Computer Vision and Robotics (Autonomous Systems)

Abstract

This project focuses on handwritten text modelling and recognition, and investigates the integration of language models into the handwritten text recognition (HTR) pipeline. The investigation will be exploratory: different language model architectures will be tested and their performance assessed with relevant metrics such as word error rate (WER), character error rate (CER), and accuracy. The project builds on previous work on AttentionHTR. Different language models, including skip-grams and Candidate Fusion, among others, will be investigated and integrated at various layers of the AttentionHTR architecture in order to find the best-performing combination. Error analysis will be performed to assess the strengths and weaknesses of the integrated model. To supplement the training data, Denoising Diffusion Probabilistic Models (DDPMs) will also be explored for generating realistic synthetic images of handwritten text.
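
For reference, WER and CER are conventionally computed as the edit (Levenshtein) distance between the recognised text and the ground truth, divided by the length of the ground truth in words or characters. The following is a minimal Python sketch of that convention, not the project's evaluation code; the function names edit_distance, cer, and wer are illustrative assumptions, and tokenising words on whitespace is a simplification.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or lists of tokens)."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to the empty hypothesis
        for j, h in enumerate(hyp, start=1):
            substitution_cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            ))
        prev = curr
    return prev[-1]


def cer(reference, hypothesis):
    """Character error rate: character-level edits divided by reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


def wer(reference, hypothesis):
    """Word error rate: word-level edits divided by the number of reference words."""
    ref_words = reference.split()  # assumption: whitespace tokenisation
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


if __name__ == "__main__":
    print(cer("handwriting", "handwritting"))
    print(wer("the quick brown fox", "the quick brow fox"))
```

Because insertions count as errors, both rates can exceed 1.0 when the hypothesis is much longer than the reference.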