
Introduction to GPU programming using CUDA

Participants in this course will learn GPU programming using the CUDA programming model, including synchronisation, memory allocation, and device and host calls. The course also covers the GPU architecture and how parallel thread blocks are used to parallelise a computational task. Because the GPU is an accelerator, a good understanding of memory management between the GPU and the CPU is essential, and this will be discussed in detail. Finally, participants will learn to use the CUDA programming model to accelerate linear algebra routines and iterative solvers on the GPU. Participants will first learn the theory and then implement the CUDA programming model with the mentors’ guidance in the hands-on tutorial part.

Learning outcomes

After this course, participants will be able to:
  • Understand the GPU architecture (and the difference between GPU and CPU)
    • Streaming architecture
    • Thread blocks
  • Implement the CUDA programming model
    • Programming structure
    • Device calls (thread block organisation)
    • Host calls
  • Handle memory management efficiently
    • Host to Device
    • Unified memory
  • Apply the CUDA programming knowledge to accelerate examples from science and engineering
    • Iterative solvers from science and engineering
    • Matrix multiplication, vector addition, etc.
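
To give a flavour of the topics listed above, vector addition with unified memory could be sketched roughly as follows. This is only an illustrative sketch and not the course material: the names `vector_add` and `run_vector_add`, the block size of 256 threads, and the input values are assumptions chosen for the example.

```cuda
#include <cstdio>
#include <cmath>
#include <cuda_runtime.h>

// Device call: each thread adds one pair of elements.
__global__ void vector_add(const float* a, const float* b, float* c, int n)
{
    // Global index built from the thread block organisation.
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {  // guard: the last block may have spare threads
        c[i] = a[i] + b[i];
    }
}

// Host call (hypothetical helper): allocates unified memory, launches the
// kernel, synchronises, and returns the largest absolute error.
// Returns -1.0f if any CUDA call fails.
float run_vector_add(int n)
{
    float *a = nullptr, *b = nullptr, *c = nullptr;
    size_t bytes = static_cast<size_t>(n) * sizeof(float);

    // Unified memory: one pointer is valid on both host and device.
    if (cudaMallocManaged(&a, bytes) != cudaSuccess ||
        cudaMallocManaged(&b, bytes) != cudaSuccess ||
        cudaMallocManaged(&c, bytes) != cudaSuccess) {
        cudaFree(a); cudaFree(b); cudaFree(c);
        return -1.0f;
    }

    for (int i = 0; i < n; ++i) {  // initialise on the host
        a[i] = 1.0f;
        b[i] = 2.0f;
    }

    int threads = 256;                         // assumed block size
    int blocks = (n + threads - 1) / threads;  // enough blocks to cover n
    vector_add<<<blocks, threads>>>(a, b, c, n);

    // Kernel launches are asynchronous: wait before reading c on the host.
    bool ok = cudaGetLastError() == cudaSuccess &&
              cudaDeviceSynchronize() == cudaSuccess;

    float max_err = -1.0f;
    if (ok) {
        max_err = 0.0f;
        for (int i = 0; i < n; ++i) {
            max_err = fmaxf(max_err, fabsf(c[i] - 3.0f));
        }
    }

    cudaFree(a); cudaFree(b); cudaFree(c);
    return max_err;
}

int main()
{
    float err = run_vector_add(1 << 20);
    if (err < 0.0f) {
        printf("CUDA error\n");
        return 1;
    }
    printf("max error = %f\n", err);
    return 0;
}
```

With explicit host-to-device transfers, the `cudaMallocManaged` calls would typically be replaced by `cudaMalloc` plus `cudaMemcpy` in each direction; unified memory is used here only to keep the sketch short.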


Priority will be given to users with good experience in C/C++. No GPU programming knowledge is required. Knowledge of some basic parallel programming concepts is an advantage but not necessary.

GPU Compute Resource

Participants attending the event will be given access to the MeluXina supercomputer during the session. To learn more about MeluXina, please consult the MeluXina overview and the MeluXina – Getting Started Guide.

Last update: April 30, 2024 12:00:25
Created: March 11, 2023 20:16:27