Importance-Aware Dataset Partitioning for Data-Parallel Training of Deep Neural Networks
Dnr:

NAISS 2023/22-703

Type:

NAISS Small Compute

Principal Investigator:

Sina Sheikholeslami

Affiliation:

Kungliga Tekniska högskolan

Start Date:

2023-06-26

End Date:

2024-06-01

Primary Classification:

20206: Computer Systems

Webpage:

Abstract

It is known that not all examples within a training dataset are of "equal importance": the model finds some of them harder or easier to learn than others. In other words, different examples contribute differently to the training process and to the performance of the trained model. Prior works have used notions of "importance" or "hardness", e.g., to find a subset of the ImageNet dataset that trains better models than the full dataset, or in active learning to choose the best examples to label. In this project, we want to study how importance-based partitioning of dataset examples across workers, using several heuristics, affects the performance of data-parallel training of deep neural networks. The essential question is: can we speed up data-parallel training by partitioning datasets in ways other than naive random partitioning? If so, this would lead to faster training and experimentation times, and to better and more efficient utilization of GPU clusters, both on-premises and in the cloud.
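As a rough illustration of the kind of heuristic we have in mind (a hypothetical sketch, not a method or result of this project), the snippet below partitions example indices across workers by sorting examples by an importance score and dealing them out round-robin, so that each worker receives a similar mix of easy and hard examples instead of a purely random shard. The function name `importance_partition`, the scores, and the worker count are all placeholders.

```python
import random


def importance_partition(scores, num_workers, seed=0):
    """Partition example indices across workers so each worker gets a
    similar mix of easy and hard examples.

    scores: per-example importance scores (e.g., per-example loss from a
            proxy model); higher means harder. Hypothetical interface.
    Returns a list of index lists, one per worker.
    """
    rng = random.Random(seed)
    # Sort indices by importance, then deal them out round-robin so every
    # worker sees the full range of hardness (a simple stratified heuristic).
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    shards = [[] for _ in range(num_workers)]
    for rank, idx in enumerate(order):
        shards[rank % num_workers].append(idx)
    # Shuffle within each shard so batches are not ordered by hardness.
    for shard in shards:
        rng.shuffle(shard)
    return shards


if __name__ == "__main__":
    # Toy example: 10 examples with synthetic importance scores.
    scores = [0.1, 0.9, 0.3, 0.7, 0.2, 0.8, 0.4, 0.6, 0.5, 0.05]
    for worker, shard in enumerate(importance_partition(scores, num_workers=2)):
        print(f"worker {worker}: {sorted(shard)}")
```

Naive random partitioning corresponds to shuffling all indices and splitting them into equal chunks; the round-robin-over-sorted-order scheme above is one of several possible importance-aware alternatives whose effect on data-parallel training speed and model quality is what we propose to measure.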