The inference phase of machine learning demands maximisation of throughput when deployed as a service. Because neural networks contain varying numbers of layers with different weights, an efficient way of scheduling these layers as parallel pipelines is required. We will profile and benchmark ML inference performance in a Multi-Instance GPU (MIG) environment to extract insights for possible pipeline scheduling on GPU servers.
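
As a starting point, the following is a minimal sketch of the kind of per-slice throughput measurement we have in mind, assuming PyTorch with ResNet-50 as a stand-in workload. The MIG slice under test would be selected by exporting `CUDA_VISIBLE_DEVICES` with a MIG UUID (as listed by `nvidia-smi -L`) before launching the script; the model choice, batch size, and iteration counts are illustrative assumptions, not fixed parts of the plan.

```python
import time
import torch
import torchvision.models as models

# The MIG slice is chosen outside the process, before launch, e.g.:
#   CUDA_VISIBLE_DEVICES=MIG-<uuid> python profile_mig.py
# where the MIG UUID comes from `nvidia-smi -L`. (Hypothetical script name.)

def measure_throughput(model, batch_size=32, iters=100, warmup=10):
    """Measure inference throughput (samples/sec) on the visible GPU slice."""
    device = torch.device("cuda")
    model = model.eval().to(device)
    x = torch.randn(batch_size, 3, 224, 224, device=device)

    with torch.no_grad():
        for _ in range(warmup):       # warm up kernels and caches
            model(x)
        torch.cuda.synchronize()      # exclude warmup from timing
        start = time.perf_counter()
        for _ in range(iters):
            model(x)
        torch.cuda.synchronize()      # wait for all queued kernels to finish
        elapsed = time.perf_counter() - start

    return batch_size * iters / elapsed

if __name__ == "__main__":
    resnet = models.resnet50()        # illustrative workload only
    print(f"throughput: {measure_throughput(resnet):.1f} samples/sec")
```

Repeating this measurement across MIG instance sizes (e.g. 1g.5gb up to 7g.40gb on an A100) would yield the per-slice throughput profile needed to decide how to place pipeline stages.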