This resource is only intended for research on AI/ML or research using AI/ML methods.
The Alvis cluster is a national NAISS resource dedicated to Artificial Intelligence and
Machine Learning research.
Note: Significant generation of training data is expected to be done elsewhere.
The system is built around graphics processing unit (GPU)
accelerator cards. The first phase of the resource has 160 NVIDIA T4, 44
V100, and 4 A100 GPUs. The second phase is based on 340 NVIDIA A40 and 336
A100 GPUs.
Tetralith is a general computational resource hosted by NSC at Linköping University.
Tetralith servers have two Intel Xeon Gold 6130 processors, providing 32 cores per server. 1844 of the servers are equipped with 96 GiB of primary memory and 64 servers with 384 GiB. All servers are interconnected with a 100 Gbit/s Intel Omni-Path network, which is also used to connect the existing storage. Each server has a local SSD for ephemeral storage (approx. 200 GiB per thin node, 900 GiB per fat node). An IBM Spectrum Scale system provides the centre storage. 170 of the Tetralith nodes are equipped with one NVIDIA Tesla T4 GPU each, as well as a high-performance NVMe SSD scratch disk of 2 TB.
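The per-server figures above can be added up to get a feel for the overall size of the system. The following is a minimal sketch that only uses the numbers quoted in the description; the function name and the dictionary keys are illustrative, not part of any NSC tooling.

```python
# Figures taken from the Tetralith description above.
CORES_PER_SERVER = 32                    # two Intel Xeon Gold 6130 per server
THIN_SERVERS, THIN_MEM_GIB = 1844, 96    # "thin" nodes
FAT_SERVERS, FAT_MEM_GIB = 64, 384       # "fat" nodes


def tetralith_totals():
    """Aggregate server, core and memory counts from the quoted figures."""
    servers = THIN_SERVERS + FAT_SERVERS
    cores = servers * CORES_PER_SERVER
    memory_gib = THIN_SERVERS * THIN_MEM_GIB + FAT_SERVERS * FAT_MEM_GIB
    return {"servers": servers, "cores": cores, "memory_gib": memory_gib}


if __name__ == "__main__":
    for name, value in tetralith_totals().items():
        print(f"{name}: {value}")
```

These totals are derived purely from the text and do not account for nodes that may be reserved or out of service at any given time.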
Dardel is a Cray EX system from Hewlett Packard Enterprise, based on AMD EPYC processors with an accompanying Lustre storage system.
The nodes are interconnected using Slingshot HPC Ethernet.
GPU nodes on Dardel are expected to become generally available on 2023-01-01, but there is a risk of delays due to the server maintenance needed to accommodate the GPUs.
Note that these are AMD GPUs, not NVIDIA GPUs, so software that relies on CUDA will need some of its code converted to HIP.
For guidance on the conversion, see https://www.lumi-supercomputer.eu/preparing-codes-for-lumi-converting-cuda-applications-to-hip/
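To give a rough idea of what the CUDA-to-HIP conversion involves: much of it is a mechanical renaming of CUDA runtime calls to their HIP counterparts. The sketch below is a toy illustration of that renaming in Python, covering only a handful of well-known names. The function `toy_hipify` and its small mapping table are hypothetical and written for this example; for real code, AMD's HIPIFY tools (`hipify-perl`, `hipify-clang`) are the appropriate choice, and some code will still need manual changes afterwards.

```python
import re

# A few well-known CUDA runtime names and their HIP equivalents.
# This table is deliberately tiny; real tools cover far more of the API.
CUDA_TO_HIP = {
    "cuda_runtime.h": "hip/hip_runtime.h",
    "cudaMalloc": "hipMalloc",
    "cudaMemcpy": "hipMemcpy",
    "cudaMemcpyHostToDevice": "hipMemcpyHostToDevice",
    "cudaMemcpyDeviceToHost": "hipMemcpyDeviceToHost",
    "cudaFree": "hipFree",
    "cudaDeviceSynchronize": "hipDeviceSynchronize",
    "cudaError_t": "hipError_t",
    "cudaSuccess": "hipSuccess",
}

# Match whole identifiers only, so names that merely contain a CUDA
# identifier (for example a user-defined wrapper) are left untouched.
_PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(name) for name in CUDA_TO_HIP) + r")\b"
)


def toy_hipify(source: str) -> str:
    """Rename the CUDA identifiers listed in CUDA_TO_HIP (illustration only)."""
    return _PATTERN.sub(lambda match: CUDA_TO_HIP[match.group(0)], source)


if __name__ == "__main__":
    cuda_snippet = (
        "#include <cuda_runtime.h>\n"
        "cudaMalloc(&d_x, n * sizeof(float));\n"
        "cudaMemcpy(d_x, x, n * sizeof(float), cudaMemcpyHostToDevice);\n"
        "cudaFree(d_x);\n"
    )
    print(toy_hipify(cuda_snippet))
```

Kernel launch syntax, library calls (cuBLAS, cuFFT and similar) and any inline PTX are outside the scope of this sketch and typically need more attention than a plain rename.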
Dardel-GPU is the accelerated partition of the Cray EX system from Hewlett Packard Enterprise, based on AMD Instinct MI250X GPUs. It has an accompanying Lustre storage system.
The nodes are interconnected using Slingshot HPC Ethernet.