
PLUTON CLUSTER

Page last modified on July 31, 2023.

Description

Pluton is a heterogeneous cluster intended for running High Performance Computing (HPC) applications. It was co-funded by the Ministry of Economy and Competitiveness of Spain and by the EU through the European Regional Development Fund (project UNLC10-1E-728).

The cluster is managed by the Computer Architecture Group and is currently hosted at CITIC, a research centre of the University of A Coruña (Spain).

Hardware Configuration

Since its initial deployment in June 2013, the cluster has received several minor hardware upgrades supported by subsequent research projects. As of July 2023, the cluster consists of:

+ A head node (or front-end node), which serves as the access point where users log in to interact with the cluster. The head node of Pluton can be accessed from the Internet through SSH at pluton.dec.udc.es. The hardware of this node was upgraded in September 2019, and it currently provides up to 12 TiB of global NAS-like storage space for users. It is also connected to the computing nodes through Gigabit Ethernet and InfiniBand FDR networks.

+ 30 computing nodes, where all the computation is actually performed. These nodes provide all the computational resources (CPUs, GPUs, memory, disks) required for running the applications, with an aggregate computing capacity of up to 720 physical cores (1440 logical threads), 4.4 TiB of memory and 17 NVIDIA Tesla GPUs. All computing nodes are interconnected via Gigabit Ethernet and InfiniBand FDR (56 Gbps) networks.
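
As an illustration of how these GPU resources look from a computing node, the minimal CUDA sketch below simply lists the Tesla devices visible on the node where it runs. It is an illustrative example only (the file name and output format are arbitrary), and it could be built with the CUDA Toolkit described in the Software Environment section below:

    /* gpu_query.cu: minimal sketch that lists the GPUs visible on a node. */
    #include <stdio.h>
    #include <cuda_runtime.h>

    int main(void) {
        int count = 0;
        cudaError_t err = cudaGetDeviceCount(&count);
        if (err != cudaSuccess) {
            /* No GPU (or no driver) available on this node */
            fprintf(stderr, "cudaGetDeviceCount: %s\n", cudaGetErrorString(err));
            return 1;
        }
        for (int i = 0; i < count; ++i) {
            cudaDeviceProp prop;
            cudaGetDeviceProperties(&prop, i);
            printf("GPU %d: %s, %.1f GiB, compute capability %d.%d\n",
                   i, prop.name,
                   prop.totalGlobalMem / (1024.0 * 1024.0 * 1024.0),
                   prop.major, prop.minor);
        }
        return 0;
    }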

Software Environment

The cluster runs the Rocks 7 distribution, based upon CentOS 7 (v7.9.2009), a free, community-supported GNU/Linux OS that is binary compatible with Red Hat Enterprise Linux (RHEL). Pluton relies on the Slurm Workload Manager v19.05.2 as its job scheduler and on Lmod v8.1.18 for module management. Additionally, the OpenHPC repository is used to manage and install some of the available software.

Other relevant software available includes:

+ Video driver v460.106/v525.85 for NVIDIA Tesla GPUs
+ NVIDIA CUDA Toolkit (v9.2/v10.1/v10.2/v11.0/v11.8)
+ Intel compilers and libraries (Parallel Studio XE 2019 and 2017)
+ GNU compilers (v7.3.0/v8.3.0/v9.3.0/v11.2.1)
+ MPI libraries (MPICH/MVAPICH2/Open MPI/Intel MPI)
+ Intel OpenCL support (SDK 2019 and CPU Runtime 18.1)
+ Linux containers with udocker (v1.3.1) and Apptainer/Singularity (v1.2.4)
+ Python (v2.x/v3.x)
+ Java Development Kit (JDK 6/7/8/11/13/14/15/17)
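
As a hedged example of how the MPI libraries and the CUDA Toolkit from the list above fit together, the short CUDA sketch below makes each MPI rank report its host name and the number of GPUs it can see, a common way to check that a parallel job landed on the intended nodes. It is only a sketch: the file name is arbitrary and the exact compiler wrappers or modules to use on Pluton are not assumed here (it would typically be built with an MPI compiler wrapper and linked against the CUDA runtime library).

    /* mpi_gpu_hello.cu: each MPI rank reports its host and visible GPUs. */
    #include <stdio.h>
    #include <mpi.h>
    #include <cuda_runtime.h>

    int main(int argc, char **argv) {
        MPI_Init(&argc, &argv);

        int rank = 0, size = 0;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        char host[MPI_MAX_PROCESSOR_NAME];
        int len = 0;
        MPI_Get_processor_name(host, &len);

        int ngpus = 0;
        if (cudaGetDeviceCount(&ngpus) != cudaSuccess)
            ngpus = 0;  /* node without GPUs (or no driver) */

        printf("Rank %d of %d on %s sees %d CUDA device(s)\n",
               rank, size, host, ngpus);

        MPI_Finalize();
        return 0;
    }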

User Guide

The user guide aims to provide the minimum information needed by a new user of the system. It describes the cluster and its hardware/software configuration in detail, explains the file systems available to users, and provides basic examples of how to run different types of applications using Slurm. The guide assumes that users are familiar with the most common utilities of GNU/Linux platforms.

Here you can download the latest version of the user guide. Do not hesitate to contact the administrator if you have any questions (see Contact).