Accelerating Scientific Applications with OpenMP for GPUs

Description

Two 1.5-hour sessions, including hands-on exercises.

Target audience:
Part 1: attendees with some HPC background but little or no prior experience with OpenMP GPU offload.
Part 2: attendees with prior exposure to OpenMP offload, including participants from Part 1.

OpenMP has become a valuable option for porting scientific applications to GPUs. Its main attractions are portability, relative simplicity, and a quick ramp-up time compared to other GPU programming approaches. In this tutorial, we will show how to implement OpenMP offloading in practice for C/C++ and Fortran applications.
This hands-on tutorial is organized into two parts:

First, we introduce the fundamentals of OpenMP offload for GPUs and discuss how unified shared memory on systems such as the AMD Instinct MI300A can simplify the transition of your codebase from CPU to GPU. Participants will explore these concepts through practical exercises in C/C++ and Fortran, with a particular focus on how OpenMP can support incremental porting of existing CPU HPC applications.
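As a rough illustration of the kind of construct covered in the first part, the sketch below offloads a simple loop with an OpenMP target directive. It is not taken from the tutorial material: the function name saxpy_offload and the choice of kernel are assumptions made for illustration only.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical example (not from the tutorial material):
 * computes y[i] = a * x[i] + y[i] for i in [0, n).
 *
 * The target construct asks the OpenMP runtime to run the loop on the
 * default device; if no device is available, or OpenMP is not enabled
 * at compile time, the loop simply runs on the host.
 *
 * The map clauses describe the data movement explicitly. On systems with
 * unified shared memory, a file-scope
 *     #pragma omp requires unified_shared_memory
 * directive (OpenMP 5.0 and later) makes such map clauses optional.
 */
void saxpy_offload(int n, float a, const float *x, float *y)
{
    assert(n >= 0 && x != NULL && y != NULL);

    #pragma omp target teams distribute parallel for \
        map(to: x[0:n]) map(tofrom: y[0:n])
    for (int i = 0; i < n; ++i) {
        y[i] = a * x[i] + y[i];
    }
}
```

The directive is the only change relative to the serial CPU loop, which is what makes this style suitable for incremental porting. The compiler flags needed to enable offloading are toolchain-specific and are not shown here.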
In the second part, we move on to intermediate and advanced topics, focusing on performance-oriented OpenMP programming. Using application examples, we will discuss common performance pitfalls, how to address them, and strategies for improving GPU utilization in real codes. We will also compare OpenMP target offload with HIP, a lower-level GPU programming model from AMD. With this information, attendees will be able to make a well-informed choice of programming model for their own CPU-to-GPU porting.

Attendees will be able to complete the hands-on portion of the tutorial on an AMD Instinct MI300A system that provides several ROCm versions as well as the latest Fortran compilers from AMD.