NAISS
SUPR
NAISS Projects
SUPR
Large Language Models for Mathematical Discovery
Dnr:

NAISS 2025/22-303

Type:

NAISS Small Compute

Principal Investigator:

Solrun Halla Einarsdottir

Affiliation:

Chalmers tekniska högskola

Start Date:

2025-03-06

End Date:

2026-01-01

Primary Classification:

10210: Artificial Intelligence

Webpage:

Allocation

Abstract

The goals of this project are: 1) To create a prototype system for integrating a Large Language Model with a proof-assistant, for the purpose of generating new interesting mathematical conjectures. 2) To perfom experiments that compare the results achievable using purely neural conjecturing, purely symbolic conjecturing, and several neuro-symbolic combination approaches. Generating useful and novel lemmas is a challenging task for LLMs, as well as for previous symbolic approaches. LLMs are remarkably good at learning patterns from their training data and generating output that fits a similar pattern for a given query context. Therefore, they can potentially be trained to generate lemmas similar to those previously seen for mathematical definitions analogous to those given, if exposed to the right kind of training data. A weakness of neural models such as LLMs is that they may be prone to generating repetitive or redundant lemmas and fail to discover more novel and useful lemmas. Another flaw that must be addressed when using LLMs in this context is the fact that there are no correctness guarantees on the LLM’s output, so the generated lemmas may simply be false. Both of these challenges have been encountered in previous work on neural conjecturing. Unlike neural methods, symbolic methods can be designed and programmed to generate only true statements and avoid repetition and redundancy. However, symbolic methods will only generate lemmas that fit a predefined specification from within a specified search space, and tend to scale poorly to a larger search space. To address these shortcomings, we propose a neuro-symbolic lemma conjecturing tool with the following implementation: An LLM is trained to generate lemma templates that describe the shape of a lemma rather than generating complete lemmas. Then symbolic methods are used to fill in the details. In this way, we leverage the best of both neural and symbolic methods, using the LLM to capture the intuition and suggest appropriate patterns and symbolic methods to ensure correctness and novelty. As far as we are aware, this is the first work focusing on neuro-symbolic conjecturing of novel lemmas. With the great success LLMs have displayed in generative tasks, it is crucial to examine the potential ways to use them in combination with reliable symbolic methods for optimal results and efficiency.