New memory allocation method for transformers
Dnr: NAISS 2024/6-189
Type: NAISS Medium Storage
Principal Investigator: Yuan Yao
Affiliation: Uppsala universitet
Start Date: 2024-06-28
End Date: 2025-01-01
Primary Classification: 10206: Computer Engineering
Webpage:

Allocation

Abstract

In recent years, transformer models have become the cornerstone of advances in natural language processing, machine translation, and many other domains of artificial intelligence. These models rely heavily on self-attention mechanisms that demand substantial computational resources, particularly memory. Traditional memory allocation methods, designed for less complex neural network architectures, are increasingly inadequate for the large-scale requirements of transformers. This inadequacy stems primarily from the fact that the memory consumed by self-attention grows quadratically with the input sequence length, which severely limits the scalability of transformers and their applicability to longer sequences. A new memory allocation method is therefore critically needed to overcome these limitations.
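To make the quadratic-growth claim concrete, the short sketch below estimates the memory occupied by the self-attention score matrices of a single transformer layer at several sequence lengths. The head count, batch size, and 16-bit precision are illustrative assumptions, not figures taken from this proposal.

# Illustrative only: size of the (batch, heads, L, L) attention score
# tensor for one transformer layer, which grows quadratically in L.
# Defaults (12 heads, batch size 1, 2-byte activations) are assumptions.

def attention_score_memory_bytes(seq_len, num_heads=12, batch_size=1, bytes_per_elem=2):
    # One score per (query position, key position) pair, per head, per batch item.
    return batch_size * num_heads * seq_len * seq_len * bytes_per_elem

for L in (512, 2048, 8192, 32768):
    gib = attention_score_memory_bytes(L) / 2**30
    print(f"L = {L:6d}: ~{gib:7.2f} GiB per layer")

Under these assumptions the score matrices grow from a few megabytes at L = 512 to roughly 24 GiB per layer at L = 32768: quadrupling the sequence length multiplies this term by sixteen, which is why longer contexts quickly exhaust memory budgets sized for earlier architectures.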