HPC in Astrophysics and Astronomy

Facilitator: Mladen Ivkovic

Speakers: Sarah Johnston, Abouzied Nasar, Will Roper, Zhen Xiang, Romeel Dave, Britton Smith, Kyle Oman, Dmitry Nikolaenko


Description

HPC is an indispensable discipline in astrophysics and cosmology, with many users running simulations, solving intricate networks of nonlinear equations, post-processing and analysing data, and synthesising images, spectra, and mock observations on a daily basis. Despite its crucial importance to the field, the presentation and discussion of our HPC work often falls short at astronomy and astrophysics conferences, which tend to focus on physics outputs and outcomes. This session serves as a platform to exchange and highlight all the underlying HPC work, both current and in progress, as well as challenges in the astrophysics and astronomy communities.

Session Part 1

Making Gravity SWIFTer: GPU offloads for gravity in the SWIFT cosmology code • Sarah Johnston

To fully utilise modern heterogeneous HPC systems and improve power efficiency, large astronomy codes must become GPU compatible. SWIFT is a versatile open-source cosmology code which is used to model a variety of astrophysical scenarios. It utilises task-based parallelism, dividing the workload into independent tasks managed by a scheduler for efficient CPU utilisation, and is highly optimised for use on memory-intensive, CPU-only clusters. A significant portion of SWIFT’s runtime is dedicated to gravity calculations. However, the repetitive and non-interdependent nature of these interactions makes them ideal candidates for GPU acceleration. In this work, we build on the existing SWIFT code by replacing CPU-based gravity with new GPU kernels, while minimising changes to the rest of the code. The GPU-accelerated kernels achieve high accuracy, with <1% deviation from the CPU results. However, we are limited by a memory transfer bottleneck in moving data between the CPU and GPU. To mitigate this, we have employed a novel ‘task-bundling’ system which allows us to group tasks in the scheduler to give higher occupancy on the GPU. This provides more work, reducing the overhead from the CPU-GPU memory transfers. To further exploit the GPU, we explore a redistribution of the gravity calculations so that more interactions are carried out using direct particle-particle summations rather than multipole-based approximations, which gives more accurate results and provides more work for the GPU. We also investigate kernel optimisations such as the use of multiple streams and tiled memory access.
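For readers unfamiliar with the term, a direct particle-particle summation can be sketched in a few lines of plain Python. This is only an illustration of the O(N^2) technique the abstract refers to, not SWIFT's GPU kernel: the function name `direct_accelerations`, the unit system (G = 1) and the simple Plummer-style softening are all assumptions made for the example.

```python
import math


def direct_accelerations(positions, masses, G=1.0, softening=1e-3):
    """Gravitational acceleration on every particle by direct summation.

    Illustrative only: a plain O(N^2) double loop with Plummer-style
    softening (an assumption, chosen for simplicity).
    """
    n = len(positions)
    acc = [[0.0, 0.0, 0.0] for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue  # a particle exerts no force on itself
            # Separation vector pointing from particle i to particle j.
            dx = [positions[j][k] - positions[i][k] for k in range(3)]
            r2 = sum(d * d for d in dx) + softening ** 2
            inv_r3 = 1.0 / (r2 * math.sqrt(r2))
            for k in range(3):
                acc[i][k] += G * masses[j] * dx[k] * inv_r3
    return acc


# Two unit masses one length unit apart attract each other equally.
pair = direct_accelerations([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]], [1.0, 1.0])
```

Every pair interaction in the inner loop is independent of the others, which is the property that makes this kind of calculation a natural fit for GPU threads.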

Exascale SWIFT hydrodynamics, are we there yet? • Abouzied Nasar

With the recent heavy lean towards heterogeneous solutions for supercomputing, the astrophysics solver SWIFT is being adapted to leverage GPUs and accelerate its computationally intense workloads, so that it retains its world-leading position as one of the most scalable and efficient publicly available solvers in the UK’s exascale era of supercomputing. SWIFT requires a plethora of solvers for different physics in order to create digital twins of the Universe and capture other smaller scale cosmological phenomena. The two most computationally demanding solvers are SWIFT’s hydrodynamics and gravity solvers. We therefore focus on these solvers initially, with a view to extending the hard-won benefits of GPU acceleration to other, less computationally demanding physics down the line. This talk focuses on the current state of GPU acceleration for the hydrodynamics solver within SWIFT’s task-parallel framework. We demonstrate significant speed-ups in comparison to CPU-only baselines, rivalling other world-leading GPU-accelerated astrophysics solvers: GPU-accelerated SWIFT achieves 22 million particle time-step updates per GPU per wall-clock second, in comparison to 8 million for the CPU code executed on a full 72-core CPU. We also present near-perfect computational scaling for the GPU-accelerated solver and greater than 30% energy consumption savings in comparison to the CPU-only baselines.
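The throughput figures quoted above can be turned into a speed-up with a short calculation. The sketch below uses only the two rates stated in the abstract; the helper names are hypothetical, and the per-core figure assumes the CPU rate divides evenly over all 72 cores, which is a simplification for illustration.

```python
def speedup(gpu_rate, cpu_rate):
    """Ratio of GPU to CPU throughput (updates per wall-clock second)."""
    return gpu_rate / cpu_rate


def core_equivalents(gpu_rate, cpu_rate, cpu_cores):
    """CPU cores needed to match one GPU, assuming an even per-core split."""
    return gpu_rate / (cpu_rate / cpu_cores)


GPU_RATE = 22e6  # particle time-step updates per GPU per second (from the abstract)
CPU_RATE = 8e6   # same metric for the full 72-core CPU (from the abstract)

ratio = speedup(GPU_RATE, CPU_RATE)
cores = core_equivalents(GPU_RATE, CPU_RATE, 72)
```

With these inputs the ratio is 2.75, so one GPU does the work of roughly 198 of the CPU's cores under the even-split assumption.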

Swift-Zoom: Efficient and scalable zoom simulations • Will Roper

Zoom simulations are becoming an ever more important tool for investigating the Universe. Whether pushing volumes and resolution to the extreme to probe the rarest regions of the Universe, or sampling highly dimensional parameter spaces with numerous individual simulations, cosmological simulations are becoming increasingly costly, with many studies forced to compromise due to this expense. Zoom simulations alleviate this problem by focusing computational effort on the regions of interest for a particular problem, while degrading the resolution elsewhere, a simple proposition that makes the intractable tractable. However, these simulations are inherently unbalanced, and few codes (if any) have fully optimised their computations for this configuration. In this talk, I will present how we have optimised SWIFT for exactly this use case.

Simulating the Epoch of Reionization: Physical Complexity and Computational Challenges • Zhen Xiang

The Epoch of Reionization is the period when the first galaxies ionized the intergalactic medium. Simulating this process is challenging because it involves the interaction between radiation and gas across a wide range of scales. In this talk, I will give a simple overview of these challenges and show how they appear in practice in cosmological simulations. Using examples from my work with Kiara-RT, a new radiation-hydrodynamics simulation suite, I will focus on the thermal evolution of gas and discuss how numerical choices affect the results. I will conclude with some open questions and future directions.

Session Part 2

HPC Challenges in Next-Generation Galaxy Formation Simulations • Romeel Dave

I will give a brief overview of the current landscape in galaxy formation simulations, with a focus on our upcoming Kiara simulation suite. The increasingly physics-rich and multi-scale modelling of a wide range of physical processes presents new challenges for HPC efficiency and scalability. I will highlight some of the performance bottlenecks associated with Kiara, and identify possible ways forward including machine learning-based approaches and the use of GPUs.

The Grackle Chemistry and Cooling Library: What Happens Next • Britton Smith

Chemistry and radiative cooling are together a critical component of galaxy simulations. Even when employed with limited sophistication, they account for a significant and often dominant portion of a simulation's computational cost. They also represent a nearly infinitely hungry endeavor in terms of both human and computational resources. The Grackle project has provided an open-source library for chemistry and cooling with a stable API that has been used in more than a dozen simulation codes and hundreds of publications for over a decade. In this talk, I will give a brief overview of the project and the functionality it provides and present ongoing development efforts toward new features and improved performance that will enable Grackle to support the next generation of astrophysical simulations.

High-level tools for galaxy simulations analysis • Kyle Oman

Galaxy formation simulations produce large outputs with information-rich metadata catalogues. Some analysis tasks can be completed with the metadata alone, others require small, sparsely distributed subsamples of the full outputs, while others still require full datasets that may not fit in memory. Such diverse access patterns pose challenges for data storage and software tools for data access. I will give an overview of how we have addressed some of these in the swiftsimio and swiftgalaxy analysis packages, and highlight others that remain.

Peano and load balancing • Dmitry Nikolaenko

Peano is an open-source framework for dynamically adaptive Cartesian meshes derived from spacetrees. It ties mesh traversal to mesh storage through space-filling curves (SFC), enabling scalable simulations on shared- and distributed-memory supercomputers. The framework underpins several scientific codes, including ExaHyPE 2 (hyperbolic PDE solvers) and Swift 2 (particle methods), and targets modern HPC platforms ranging from multicore workstations to GPU-accelerated supercomputer nodes. At the heart of Peano lies a hierarchical spacetree that is constructed level by level during the simulation set-up. Each node of this tree represents a spatial cell; leaves carry the actual computational work. As the mesh grows, the distribution of leaves across MPI ranks and CPU threads becomes uneven, leading to idle processors and wasted wall-clock time. Achieving a balanced partition is therefore critical for strong scaling. Peano's load balancing toolbox provides a suite of strategies — from simple spread-out schemes to recursive bipartition and cascade combinators — each triggering splitting calls at the right moment to redistribute subtrees across compute resources. A key challenge is that splitting decisions must be made online, without knowledge of the final mesh structure, while the cost of moving data grows with the current tree size. In this talk we present an offline load balancing approach that transforms this online problem: the mesh is first constructed without any decomposition and checkpointed; an offline analysis tool then inspects the complete spacetree, derives an optimal multi-level splitting sequence, and encodes it as a hardcoded strategy for a warm restart. 
We describe the Python API that automates this workflow — including leaf distribution, bottom-up topology propagation, and a planned hierarchical optimisation stage that minimises splitting cost whilst preserving load balance — and share lessons learned from applying it to black-hole simulations with the CCZ4 solver in ExaHyPE 2.
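To illustrate why knowing the complete mesh helps, the sketch below splits leaves that are already ordered along a space-filling curve into contiguous chunks of near-equal cost, one chunk per rank. It is a generic single-level partition written for this page, not Peano's Python API or the multi-level splitting sequence described in the abstract; the names `partition_along_curve` and `chunk_loads` and the per-leaf cost list are hypothetical.

```python
def partition_along_curve(leaf_costs, num_ranks):
    """Split leaves (in space-filling-curve order) into contiguous chunks.

    A cut is placed as soon as the running cost reaches the next multiple
    of total / num_ranks. This needs the total cost up front, which is
    only available once the whole mesh is known (the offline setting).
    """
    total = sum(leaf_costs)
    boundaries = []
    running = 0
    next_rank = 1
    for index, cost in enumerate(leaf_costs):
        running += cost
        # Compare running / total with next_rank / num_ranks without dividing.
        while next_rank < num_ranks and running * num_ranks >= total * next_rank:
            boundaries.append(index + 1)
            next_rank += 1
    starts = [0] + boundaries
    ends = boundaries + [len(leaf_costs)]
    return [list(range(start, end)) for start, end in zip(starts, ends)]


def chunk_loads(leaf_costs, chunks):
    """Total cost assigned to each chunk."""
    return [sum(leaf_costs[i] for i in chunk) for chunk in chunks]


# Eight equally expensive leaves on four ranks: two leaves per rank.
chunks = partition_along_curve([1] * 8, 4)
loads = chunk_loads([1] * 8, chunks)
```

Because each chunk is a contiguous run along the curve, every rank receives a spatially compact set of leaves. A single very expensive leaf can leave a neighbouring chunk empty in this simple scheme, which is one reason real tools need more elaborate strategies.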