The goal of the RegTek project is to conduct a pilot study evaluating the accuracy and usability of a retrieval-augmented generation (RAG) system for AI regulatory impact assessments. Such an evaluation may lead to wider acceptance of and trust in language models, and to more comprehensive and fair regulation. The work includes exploring how metrology, specifically the Rasch measurement method, can be used to measure the abilities and performance of AI language models, such as their accuracy and precision.
We will use the resources to build a generic pipeline for evaluating RAG systems using open-source LLMs.
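
To illustrate the Rasch approach mentioned above, the following is a minimal sketch rather than the project's actual pipeline. It assumes the dichotomous Rasch model, in which the probability of a correct response depends only on the difference between a respondent's ability and an item's difficulty. Treating a language model as the respondent and each test question as an item is an assumption made here for illustration. The function names, item difficulties, and responses are hypothetical.

```python
import math


def rasch_probability(ability: float, difficulty: float) -> float:
    """Probability of a correct response under the dichotomous Rasch model."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))


def estimate_ability(responses, difficulties, iterations=50):
    """Maximum-likelihood ability estimate (in logits) via Newton-Raphson.

    Item difficulties are assumed to be known and held fixed.
    """
    raw_score = sum(responses)
    if raw_score == 0 or raw_score == len(responses):
        # The likelihood has no finite maximum for these response patterns.
        raise ValueError("ability is not finite for all-correct or all-wrong responses")

    theta = 0.0
    for _ in range(iterations):
        probs = [rasch_probability(theta, b) for b in difficulties]
        gradient = sum(x - p for x, p in zip(responses, probs))
        information = sum(p * (1.0 - p) for p in probs)
        step = gradient / information
        theta += step
        if abs(step) < 1e-8:
            break
    return theta


# Hypothetical example: three questions of increasing difficulty (in logits),
# with a model that answers the two easier ones correctly (1) and the hardest
# one incorrectly (0).
item_difficulties = [-1.0, 0.0, 1.0]
model_responses = [1, 1, 0]
model_ability = estimate_ability(model_responses, item_difficulties)
print(f"Estimated ability: {model_ability:.3f} logits")
```

In this sketch, ability and difficulty share one logit scale, so a model's estimated ability can be compared directly with the difficulty of individual questions. In practice, item difficulties would also have to be estimated from response data rather than assumed.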