Michel Steuwer, University of Edinburgh

Abstract:

Computers have become increasingly complex with the emergence of heterogeneous hardware combining multicore CPUs and GPUs. These parallel systems offer tremendous computational power at the cost of increased programming effort, resulting in a tension between performance and code portability. Typically, code is either tuned in a low-level imperative language using hardware-specific optimizations to achieve maximum performance, or written in a high-level, possibly functional, language to achieve portability at the expense of performance.

In this talk, I will present our novel approach, which aims to combine high-level programming, code portability, and high performance. Starting from a high-level functional expression, we apply a simple set of provably correct rewrite rules to transform it into a low-level functional representation close to the OpenCL programming model, from which OpenCL code is eventually generated. Our rewrite rules define a space of possible implementations, which we explore automatically to generate hardware-specific OpenCL implementations.
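
As a rough illustration of the idea only, and not the compiler or rule set presented in the talk, the sketch below applies two semantics-preserving rewrite rules to a tiny pipeline of map stages. Everything in it is a hypothetical stand-in: the mini-IR (Map, Split, Join), the interpreter run, and the rule functions fuse_maps and split_join are names invented for this example, and it interprets the program in Python rather than generating OpenCL.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical mini-IR: a program is a list of stages applied left to right.
@dataclass(frozen=True)
class Map:
    f: Callable  # function applied to every element

@dataclass(frozen=True)
class Split:
    n: int  # chunk size

@dataclass(frozen=True)
class Join:
    pass

def run(pipeline, xs):
    """Reference interpreter for the mini-IR."""
    for stage in pipeline:
        if isinstance(stage, Map):
            xs = [stage.f(x) for x in xs]
        elif isinstance(stage, Split):
            xs = [xs[i:i + stage.n] for i in range(0, len(xs), stage.n)]
        elif isinstance(stage, Join):
            xs = [x for chunk in xs for x in chunk]
    return xs

def fuse_maps(pipeline):
    """Rule: Map(f) followed by Map(g)  ->  Map(g after f)."""
    out = []
    for stage in pipeline:
        if out and isinstance(stage, Map) and isinstance(out[-1], Map):
            f, g = out[-1].f, stage.f
            out[-1] = Map(lambda x, f=f, g=g: g(f(x)))
        else:
            out.append(stage)
    return out

def split_join(pipeline, n):
    """Rule: Map(f)  ->  Split(n), Map(map f over each chunk), Join."""
    out = []
    for stage in pipeline:
        if isinstance(stage, Map):
            inner = stage.f
            per_chunk = Map(lambda chunk, inner=inner: [inner(x) for x in chunk])
            out += [Split(n), per_chunk, Join()]
        else:
            out.append(stage)
    return out

# High-level program: add one to every element, then double it.
program = [Map(lambda x: x + 1), Map(lambda x: x * 2)]
data = [1, 2, 3, 4, 5]

# Applying rules in different combinations yields a (very small) space of variants.
fused = fuse_maps(program)
tiled = split_join(fused, 2)
variants = [program, fused, tiled]
results = [run(v, data) for v in variants]
```

All three variants compute the same result on the sample input; they differ only in structure, for example in how many passes are made over the data or whether the work is divided into chunks. In a setting like the one described above, such structural differences are what an automatic search would choose between for a given device.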

I will show experimental results for the high-performance OpenCL code generated by our prototype compiler. Our experiments show that we can automatically derive hardware-specific implementations from simple, high-level functional expressions of an algorithm, with performance on a par with highly tuned code written by experts for multicore CPUs and GPUs.

Bio:

Michel Steuwer is a Postdoctoral Research Associate in the compiler group at the University of Edinburgh. He received his PhD in 2015 from the University of Muenster in Germany. His research interests span all areas of parallel programming, from languages and programming models to their implementation in compilers and libraries, as well as their execution at runtime. His research has focused in particular on structured parallel programming models, heterogeneous and GPU computing, and novel compilation techniques.