Causal Foundation Models for Tabular Data
Dnr: NAISS 2024/22-984
Type: NAISS Small Compute
Principal Investigator: Ruibo Tu
Affiliation: Kungliga Tekniska högskolan
Start Date: 2024-08-06
End Date: 2025-09-01
Primary Classification: 10201: Computer Sciences

Abstract

Understanding and reasoning with tabular data to provide meaningful insights and support decision-making is a common and challenging task across industry domains such as business intelligence, data science, and data engineering. Foundation models such as GPT-4 and ChatGPT have shown remarkable performance, especially on natural language tasks. However, most foundation models are trained on image and text data, and there is a notable research gap in applying foundation models to tabular data, despite promising applications in both industry and academia. This project aims to explore the ability of foundation models to understand and reason with tabular data, and to support decision-making by leveraging current causal machine learning methods and large language models. We plan to train causal-aware foundation models on tabular data, enabling them to reason with causal graphs and predict the consequences of actions and interventions for decision-making. Moreover, we will investigate the interaction between causal tabular foundation models and available pre-trained language foundation models, which can provide valuable knowledge and common sense for understanding, reasoning, and decision-making with tabular data. During our initial exploration, we found a significant lack of benchmarks for understanding foundation models in the tabular domain. We have therefore expanded the scope of the project to include benchmarking foundation models on their ability to capture causal knowledge.