Performance engineering of software applications

Brief summary of the course

Best practices for performance engineering with C/C++ and Python. Topics include measuring performance, profiling in C/C++ and Python, extending Python with C/C++, and application development with GPGPU (NVIDIA CUDA).

Course topics

Single-threaded programming (12 hours):

Topic Hours
1. Performance engineering basics. When it is needed. 2
2. Measuring performance, profiling in C/C++ and Python. 2
3. Libraries in C/C++. Hacker's Delight (bit hacks). From C to assembly. 2
4. Extending Python with C/C++. 2
5. Data exchange between Python (pandas, NumPy) and C/C++ libraries. 2
6. Cache-efficient algorithms. Memory issues. 2
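
As an illustration of topic 2 (measuring performance in Python), the following is a minimal sketch using the standard-library timeit module. The functions sum_loop and sum_builtin and the problem size N are hypothetical examples chosen for this sketch, not course material. Taking the minimum over several repeats is one common way to reduce noise from other processes.

```python
import timeit


def sum_loop(n):
    """Sum the integers 0..n-1 with an explicit Python loop."""
    total = 0
    for i in range(n):
        total += i
    return total


def sum_builtin(n):
    """Sum the integers 0..n-1 with the built-in sum()."""
    return sum(range(n))


N = 100_000  # hypothetical problem size

# Check correctness first: timing a wrong implementation is meaningless.
assert sum_loop(N) == sum_builtin(N)

# Run each candidate 20 times per measurement, 5 measurements each,
# and keep the best (least disturbed) measurement.
loop_time = min(timeit.repeat(lambda: sum_loop(N), number=20, repeat=5))
builtin_time = min(timeit.repeat(lambda: sum_builtin(N), number=20, repeat=5))

print(f"explicit loop: {loop_time:.4f} s for 20 calls")
print(f"built-in sum:  {builtin_time:.4f} s for 20 calls")
```

The absolute numbers depend on the machine and the Python version, so only the relative comparison on a single system is meaningful.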

Multicore programming (16 hours):

Topic Hours
1. Parallel applications in Python. The Python global interpreter lock (GIL). 2
2. Multithreading and multiprocessing in Python and C/C++ 4
3. Synchronization and locks 2
4. NVIDIA CUDA programming for C and Python 6
5. OpenCL 2

Distributed programming (4 hours):

Topic Hours
1. Scaling applications with multiple workers 2
2. Distributed networking algorithms 2

Homework assignments:

  1. Homework #1 – Performance engineering of a single-threaded application.
  2. Homework #2 – Development of a multithreaded application.
  3. Homework #3 – Development of a GPGPU-based application.
  4. Homework #4 – Development of a network-scaled application.

Prerequisites

  1. Basic understanding of computer hardware.
  2. Basic understanding of networking.
  3. Basic programming skills in C/C++ and Python.
  4. Basic OS knowledge.
  5. Familiarity with Linux/Unix and basic Bash scripting.