OpenMPCon 2015 Talk Series 4

OpenMPCon this month aims to bring a stellar lineup of industry gurus, users, and developers together with the language designers. To that end we have three keynotes along with two full-day tutorials and a day and a half of talks. You can see the first keynote, tutorial, and the first series of talks here. We also posted the second of three keynotes, by Professor William Tang of Princeton University, as well as the second and third series of talks. The third keynote is also here, and we will now describe the fourth series of talks.

Want to know how OpenMP is used in US National Labs, especially at NERSC? NERSC is the primary supercomputing facility for the Office of Science in the US Department of Energy (DOE). Its next production system will be an Intel Xeon Phi Knights Landing (KNL) system, with 60+ cores per node and 4 hardware threads per core. The recommended programming model is hybrid MPI/OpenMP, which also promotes portability across different system architectures.

The talk will analyze OpenMP usage statistics on current NERSC production systems, such as the percentage of codes using OpenMP and the typical number of threads used. It will also cover the guidance NERSC gives its users on using OpenMP efficiently with multiple compilers on various NERSC systems: how to obtain the best process and thread affinity for hybrid MPI/OpenMP, memory locality with NUMA domains, programming tips for adding OpenMP, strategies for improving OpenMP scaling, how to use nested OpenMP, and the tools available for OpenMP. Tuning examples that improve OpenMP performance in real scientific user codes will also be presented.
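To make the NUMA memory-locality point concrete, here is a minimal sketch in C of one widely used technique, parallel "first-touch" initialization. It is not taken from the talk. It assumes the operating system places each memory page near the thread that first writes to it (the usual default on Linux), and the function names alloc_first_touch and sum_array are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: allocate an array and initialise it in parallel.
   Under a first-touch NUMA policy (an assumption about the OS), each page
   ends up in the memory closest to the thread that first writes it. */
double *alloc_first_touch(size_t n, double value)
{
    assert(n > 0);
    double *a = malloc(n * sizeof *a);
    if (a == NULL)
        return NULL;

    /* schedule(static) hands each thread one contiguous chunk of indices. */
    #pragma omp parallel for schedule(static)
    for (long i = 0; i < (long)n; i++)
        a[i] = value;

    return a;
}

/* Sum the array with the same static schedule, so each thread mostly
   reads the pages it touched first during initialisation. */
double sum_array(const double *a, size_t n)
{
    double sum = 0.0;

    #pragma omp parallel for schedule(static) reduction(+:sum)
    for (long i = 0; i < (long)n; i++)
        sum += a[i];

    return sum;
}
```

The benefit depends on threads staying where they are, which is where affinity comes in: the standard OpenMP environment variables OMP_PROC_BIND and OMP_PLACES are one way to pin threads. Compiled without OpenMP support, the pragmas are ignored and the code simply runs sequentially.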

Manuel Arenaz will demonstrate A Success Case using Parallware. The manual parallelization of existing code is usually a tedious and error-prone task, especially in the case of large projects. Parallware is the first commercial OpenMP-enabling source-to-source compiler that automatically adds OpenMP capabilities to scientific programs. The compiler automatically discovers the parallelism available in sequential codes written in the C programming language. It produces human-readable code annotated with OpenMP directives, instead of a binary executable file. This work analyzes the parallelization of the program EP from the NAS Parallel Benchmarks (NPB) suite. The performance results show that, starting from the original sequential version and applying some simple code refactorizations, Parallware is able to generate efficient OpenMP parallel code automatically.
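For readers who have not seen OpenMP-annotated source, the following hand-written sketch in C shows the general shape of an embarrassingly parallel loop with a reduction, loosely in the spirit of EP. It is not Parallware output and not the NPB code. The function names are hypothetical, and the random number generator is a stand-in (a 64-bit mixing function in the style of SplitMix64) chosen so that every iteration is independent of the others.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in RNG: hash the index into a double in [0, 1).
   Because the value depends only on the index, iterations share no state. */
static double unit_random(uint64_t x)
{
    x += 0x9E3779B97F4A7C15ULL;
    x = (x ^ (x >> 30)) * 0xBF58476D1CE4E5B9ULL;
    x = (x ^ (x >> 27)) * 0x94D049BB133111EBULL;
    x ^= x >> 31;
    return (double)(x >> 11) / 9007199254740992.0; /* divide by 2^53 */
}

/* Count how many of n random points in [-1, 1) x [-1, 1) fall inside the
   unit circle. The single directive is the only change from the sequential
   version; the reduction clause merges the per-thread counters. */
long count_accepted_pairs(long n)
{
    assert(n >= 0);
    long accepted = 0;

    #pragma omp parallel for reduction(+:accepted)
    for (long i = 0; i < n; i++) {
        double x = 2.0 * unit_random(2 * (uint64_t)i) - 1.0;
        double y = 2.0 * unit_random(2 * (uint64_t)i + 1) - 1.0;
        if (x * x + y * y <= 1.0)
            accepted++;
    }

    return accepted;
}
```

Because the counter is an integer and each iteration derives its random values from its own index, the result is the same for any number of threads. The annotated source also remains ordinary C that a compiler without OpenMP support can still build and run sequentially.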

Please consider attending by signing up here. In the meantime, we are looking for students and volunteers to help with the conference. Please connect with OpenMPCon if you wish to help.
