Introduction to Using Python on HPC
In this workshop, we will explore how to improve Python code so that it runs efficiently. Chances are, you are already familiar with Python and NumPy. However, we will start with profiling and efficient NumPy usage, as these are crucial steps before venturing into parallelization. Once your code is fine-tuned with NumPy, we will explore Python's parallel libraries to make use of multiple CPU cores. By the end, you will be well equipped to use Python for high-performance tasks on the HPC infrastructure.
Target Audience Description
The workshop is designed for individuals who want to advance their skills and knowledge in Python-based scientific and data computing. The ideal participants have basic to intermediate Python and NumPy skills, along with some familiarity with parallel programming. The workshop provides a good starting point for using HPC computing power to speed up your Python programs.
Agenda
First day: Using Jupyter Notebook on HPC infrastructure, profiling, and using NumPy effectively
- Setting up a Jupyter notebook on an HPC node
- Timing and profiling Python code
- NumPy basics: replacing Python loops with efficient array computations
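As a taste of the first day's topics, the sketch below times a plain Python loop against an equivalent NumPy expression. It is only an illustration and not taken from the workshop material: the function names, the array size, and the choice of `timeit` as the timing tool are all assumptions.

```python
import timeit

import numpy as np


def sum_of_squares_loop(values):
    """Sum of squares with an explicit Python loop (hypothetical example)."""
    total = 0.0
    for v in values:
        total += v * v
    return total


def sum_of_squares_numpy(values):
    """Same computation expressed as a single NumPy dot product."""
    return float(np.dot(values, values))


# Illustrative input size; real workloads will differ.
data = np.arange(100_000, dtype=np.float64)

# Run each version a few times and report the total elapsed time.
loop_time = timeit.timeit(lambda: sum_of_squares_loop(data), number=3)
numpy_time = timeit.timeit(lambda: sum_of_squares_numpy(data), number=3)

print(f"Python loop: {loop_time:.4f} s")
print(f"NumPy:       {numpy_time:.4f} s")
```

The exact timings depend on the machine, but the loop version is typically much slower because every element passes through the Python interpreter.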
Second day: Improving performance with Python parallel packages
- Use case understanding and Python implementation
- NumPy implementation
- Python’s Multiprocessing
- PyMP
- Cython
- Numba and final remarks
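To give an idea of what the second day builds toward, here is a minimal sketch of spreading independent tasks over several CPU cores with the standard library's `multiprocessing.Pool`. The prime-counting task, the function names, and the worker count are assumptions chosen for illustration; the workshop's actual use case may differ.

```python
from multiprocessing import Pool


def count_primes(limit):
    """Count primes strictly below `limit` using simple trial division."""
    count = 0
    for n in range(2, limit):
        is_prime = True
        d = 2
        while d * d <= n:
            if n % d == 0:
                is_prime = False
                break
            d += 1
        if is_prime:
            count += 1
    return count


def count_primes_parallel(limits, workers=2):
    """Evaluate count_primes for each limit in a pool of worker processes."""
    with Pool(processes=workers) as pool:
        return pool.map(count_primes, limits)


if __name__ == "__main__":
    # Hypothetical workload: four independent tasks, one per limit.
    limits = [5_000, 10_000, 15_000, 20_000]
    print(count_primes_parallel(limits, workers=4))
```

On a cluster, the number of workers would normally match the cores requested from the scheduler rather than a hard-coded value.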
Requirements
- An HPC account with access to the cluster.
- Basic knowledge of SLURM.
- A basic understanding of Python programming.
- Familiarity with Jupyter Notebook (installed and configured).
- A basic understanding of NumPy and linear algebra.
- Familiarity with parallel programming.