Efficient ViT
Dnr:

NAISS 2026/4-576

Type:

NAISS Small

Principal Investigator:

Mehdi Babaeivavdare

Affiliation:

Chalmers tekniska högskola

Start Date:

2026-03-23

End Date:

2027-04-01

Primary Classification:

20208: Computer Vision and Learning Systems (Computer Science aspects in 10207)

Abstract

I am interested in conducting a research project focused on Vision Transformers (ViTs) within the field of image analysis and computer vision. The primary goal of this project is to explore efficient architectures and optimization techniques that improve the performance and scalability of ViT models, particularly in resource-constrained environments. While Vision Transformers have demonstrated strong performance compared to traditional convolutional neural networks, they often require significant computational power and large datasets. In this project, I aim to investigate methods for enhancing efficiency, such as model compression, knowledge distillation, token reduction strategies, and hybrid architectures that combine convolutional and transformer-based approaches. Additionally, I plan to use the available computational resources effectively, including GPUs and optimized deep learning frameworks, to ensure a practical and scalable implementation. The project will involve experimenting with benchmark image datasets and evaluating performance in terms of accuracy, computational cost, and memory usage. By focusing on efficiency without significantly compromising performance, this work seeks to make Vision Transformers more accessible for real-world applications, especially on edge devices and in distributed systems. Ultimately, this research aligns with the broader goal of developing scalable and resource-aware intelligent vision systems.
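To make the token reduction direction mentioned above concrete, the following is a minimal NumPy sketch of score-based token pruning, one common family of such strategies: tokens with low importance scores (e.g. average attention received) are dropped before later transformer layers, cutting the quadratic attention cost. The function name, the scoring heuristic, and the keep ratio are illustrative assumptions, not the project's actual method.

```python
import numpy as np

def prune_tokens(tokens, scores, keep_ratio=0.5):
    """Keep only the highest-scoring tokens (illustrative token reduction).

    tokens: (N, D) array of token embeddings
    scores: (N,) importance scores, e.g. mean attention each token receives
    keep_ratio: fraction of tokens to retain (assumed hyperparameter)
    """
    n_keep = max(1, int(round(tokens.shape[0] * keep_ratio)))
    idx = np.argsort(scores)[::-1][:n_keep]  # indices of top-scoring tokens
    idx = np.sort(idx)                       # preserve original token order
    return tokens[idx], idx

# Toy example: 8 tokens with 4-dimensional embeddings
rng = np.random.default_rng(0)
tokens = rng.standard_normal((8, 4))
scores = rng.random(8)
kept, idx = prune_tokens(tokens, scores, keep_ratio=0.5)
print(kept.shape)  # (4, 4): half the tokens remain
```

Halving the token count roughly quarters the cost of subsequent self-attention layers, which is the source of the efficiency gains such methods target; the accuracy/cost trade-off would then be measured on benchmark image datasets as the abstract describes.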