SUPR
A Domain-Specific Compilation Framework for FFT Code Generation on Heterogeneous Hardware Using MLIR and Machine Learning Techniques
Dnr:

NAISS 2024/22-1313

Type:

NAISS Small Compute

Principal Investigator:

Yifei He

Affiliation:

Kungliga Tekniska högskolan

Start Date:

2024-11-01

End Date:

2025-11-01

Primary Classification:

10201: Computer Sciences

Abstract

Fast Fourier Transform (FFT) libraries are essential components of any High-Performance Computing (HPC) software stack, underpinning applications that range from Partial Differential Equation (PDE) solvers to spectral analysis of signals and deep learning. Designing the next generation of HPC FFT libraries calls for modern compiler infrastructures such as Multi-Level Intermediate Representation (MLIR) and LLVM, and integrating machine learning techniques into compiler optimizations can improve both optimization decisions and the resulting performance. In this project, we propose to develop a next-generation portable FFT library built on an MLIR-based domain-specific compilation framework that generates high-performance FFT code for several hardware backends, including x86 CPUs, ARM CPUs, and GPUs. Our approach uses the MLIR and LLVM stack to build a multi-level code generation pipeline based on progressive lowering. At the top level, the Linalg dialect serves as a high-level abstraction for FFT and provides reusable tensor-based abstractions. At this level the library will apply formula rewriting for FFT decomposition together with cache-friendly optimizations. Low-level code generation will go through the MLIR vector and GPU dialects and finally emit LLVM IR, so that the generated code executes efficiently on each target platform. In place of traditional compiler heuristics, we will use machine learning methods for auto-tuning and design space exploration over FFT code generation plans, exploring alternative code generation strategies and selecting the best-performing configuration for each target. By combining these compilation techniques with modern hardware features, the proposed FFT library aims to deliver both high performance and portability, addressing the growing demands of HPC applications.
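To make the idea of FFT decomposition and of a space of code generation plans more concrete, the following is a minimal sketch in plain Python. It is not the project's MLIR pipeline. It assumes the standard Cooley-Tukey factorization of a length n = n1 * n2 transform, and the names `naive_dft`, `cooley_tukey`, and `ordered_factorizations` are hypothetical helpers introduced only for this illustration. Each ordered factorization of n plays the role of one candidate plan that an auto-tuner could evaluate.

```python
import cmath


def naive_dft(x):
    """Reference O(n^2) DFT, used as the base case and as a correctness oracle."""
    n = len(x)
    return [
        sum(x[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
        for k in range(n)
    ]


def cooley_tukey(x, plan):
    """DFT of x, decomposed according to `plan`.

    `plan` is a list of factors whose product is len(x). The first factor n1
    splits the transform into n1 sub-transforms of length n2 = n / n1, which
    are decomposed recursively with the remaining factors.
    """
    n = len(x)
    if len(plan) <= 1:
        return naive_dft(x)
    n1, rest = plan[0], plan[1:]
    n2 = n // n1
    # Step 1: n1 sub-DFTs of length n2 over the strided inputs x[j1::n1].
    inner = [cooley_tukey(x[j1::n1], rest) for j1 in range(n1)]
    out = [0j] * n
    for k2 in range(n2):
        # Step 2: multiply by the twiddle factors exp(-2*pi*i*j1*k2/n).
        column = [
            inner[j1][k2] * cmath.exp(-2j * cmath.pi * j1 * k2 / n)
            for j1 in range(n1)
        ]
        # Step 3: a length-n1 DFT across the sub-transform outputs.
        column_dft = naive_dft(column)
        for k1 in range(n1):
            out[k1 * n2 + k2] = column_dft[k1]
    return out


def ordered_factorizations(n):
    """Yield every ordered list of factors greater than 1 whose product is n."""
    if n == 1:
        yield []
        return
    for f in range(2, n + 1):
        if n % f == 0:
            for tail in ordered_factorizations(n // f):
                yield [f] + tail


if __name__ == "__main__":
    signal = [complex(i % 5, -i) for i in range(12)]
    reference = naive_dft(signal)
    for plan in ordered_factorizations(len(signal)):
        result = cooley_tukey(signal, plan)
        error = max(abs(a - b) for a, b in zip(result, reference))
        print(plan, "max abs error:", error)
```

Every plan computes the same transform, so plans differ only in cost. In this sketch the plan space is enumerated exhaustively; a tuner of the kind described above would instead have to choose among such candidates using measurements or a learned model, since the number of plans grows quickly with n.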