Assessment of different BLAS/LAPACK implementations on AMD EPYC Rome processors

Student Research Project / Master Thesis

Contact: Björn Dick <>


Computer simulations of real-world processes often require a large number of (basic) linear algebra operations. Hence, processor manufacturers typically provide highly optimized libraries that perform those operations efficiently. It is nevertheless possible to use different implementations on a given processor. The idea of this project is to compare and assess different implementations of those libraries on HLRS’ current supercomputer “Hawk” (5632 compute nodes with 2 x 64 AMD EPYC Rome cores each) with respect to runtime and energy efficiency.


  1. Identifying relevant routines and input datasets based on production jobs of HLRS customers.
  2. Compiling, running and profiling compute jobs with different BLAS/LAPACK implementations for the routines and input datasets identified before.
  3. Assessing results and deducing recommendations.
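
As a rough illustration of step 2, the following minimal sketch times a matrix-matrix product (GEMM) and reports the achieved GFLOP/s. All names in it (gemm_flops, naive_gemm, benchmark) are hypothetical and not part of the project. The pure-Python kernel is only a placeholder that keeps the example self-contained; in an actual comparison it would be replaced by a call into the BLAS implementation under test. Energy measurements are not covered here.

```python
import time


def gemm_flops(m, n, k):
    """Floating-point operations of C = A * B with A (m x k) and B (k x n)."""
    return 2 * m * n * k


def naive_gemm(a, b):
    """Placeholder kernel (assumption): pure-Python product standing in for a BLAS call."""
    cols_b = list(zip(*b))  # columns of B, so each entry of C is a dot product
    return [[sum(x * y for x, y in zip(row, col)) for col in cols_b] for row in a]


def benchmark(kernel, a, b, repeats=3):
    """Run kernel(a, b) several times; return (best wall time in s, GFLOP/s)."""
    m, k, n = len(a), len(b), len(b[0])
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        kernel(a, b)
        best = min(best, time.perf_counter() - start)
    return best, gemm_flops(m, n, k) / best / 1e9


if __name__ == "__main__":
    size = 64  # illustrative problem size, not taken from any production job
    a = [[float(i + j) for j in range(size)] for i in range(size)]
    b = [[float(i - j) for j in range(size)] for i in range(size)]
    seconds, gflops = benchmark(naive_gemm, a, b)
    print(f"{size}x{size} GEMM: {seconds:.6f} s, {gflops:.3f} GFLOP/s")
```

Taking the best of several repetitions reduces the influence of warm-up effects and system noise; passing the kernel as a parameter lets the same harness be reused unchanged for each library being compared.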


  • Basic knowledge of linear algebra and basic understanding of the routines implemented in BLAS/LAPACK.
  • Strong command of Linux-based environments, in particular of building codes with GNU Autotools/make and CMake.
  • Ideally, initial experience with Score-P, Cube and possibly Scalasca.
  • Ideally, initial experience in High-Performance Computing environments.