OpenMPCon 2015 Advanced tutorials

OpenMPCon this month aims to bring a stellar lineup of the latest industry gurus, users and developers together with the language designers. As such, we have three keynotes along with two full-day tutorials and a day and a half of talks. You can see the first keynote, tutorial and the first of three talks here. We also posted the second of three keynotes by Professor William Tang of Princeton University as well as the second series, third series, and fourth series of talks. The third keynote is also here, as are the evening sessions on Grill the Committee and Plan the next OpenMPCon.

I will now describe one of the final gems of attending the OpenMP Developers conference, alongside all the other great talks that reveal the nuts and bolts of OpenMP. The tutorial material offers the latest way to fast-track you to becoming a guru at using OpenMP in your work, taught by committee members and educators who are plugged into the design of the specification. We offer a full educational range, starting on Monday with the thoroughly popular and well-tested beginner/intermediate hands-on tutorial by Tim Mattson, which gives full coverage of all of OpenMP. The tutorial is based on active learning and will mix short lectures with short exercises.

This tutorial is based on a long series of tutorials presented at Supercomputing conferences and on a course he teaches with Kurt Keutzer of UC Berkeley.

On Wednesday, along with a regular series of talks and keynotes, one of the tracks will showcase OpenMP senior educator Ruud van der Pas teaching why OpenMP REALLY scales. In his characteristically entertaining and anecdote-filled manner, Ruud will take on a difficult topic: how to make OpenMP scale. Unfortunately, it is a very widespread myth that OpenMP Does Not Scale, and it is a myth we intend to dispel in this talk.

Tasking models are now found in many standards and specifications, as they are used to deal with irregular workloads that cannot be captured in a parallel loop. Yet some are heavyweight and some are lightweight. Michael Klemm and Christian Terboven, the OpenMPCon and IWOMP Program Chairs respectively, will show what OpenMP offers for tasks and share insider information on how to take best advantage of them.
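To give a flavor of the kind of irregular workload that tasks handle, here is a minimal sketch (not taken from the tutorial material) that sums the values in a binary tree, which a plain parallel loop cannot express. The Node type and the function names tree_sum and parallel_tree_sum are made up for illustration; only the task, taskwait, parallel and single directives come from OpenMP itself.

```c
#include <stddef.h>

/* Hypothetical tree node used only for this illustration. */
typedef struct Node {
    long value;
    struct Node *left, *right;
} Node;

/* Recursively sum a binary tree. The left subtree is handed to a new
 * task while the current thread continues with the right subtree. */
static long tree_sum(const Node *n) {
    if (n == NULL) return 0;
    long left = 0, right = 0;
    /* 'left' must be shared so the child task's result is visible here. */
    #pragma omp task shared(left)
    left = tree_sum(n->left);
    right = tree_sum(n->right);
    /* Wait for the child task before combining the partial sums. */
    #pragma omp taskwait
    return n->value + left + right;
}

/* Create the thread team once; a single thread starts the recursion
 * and the other threads pick up the generated tasks. */
long parallel_tree_sum(const Node *root) {
    long total = 0;
    #pragma omp parallel
    #pragma omp single
    total = tree_sum(root);
    return total;
}
```

In real code one would usually stop creating tasks below some subtree size, since a task per node is likely too fine-grained to pay off. Compiled without OpenMP support, the directives are ignored and the code simply runs sequentially.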

If you think OpenMP is merely about threading, then you might be interested in the latest features of OpenMP 4.x that exploit the SIMD capabilities of modern processors. Since processors devote more die space to SIMD with every new generation, so-called "vectorization" becomes more important. Whereas threading is already covered well, vectorization is still an underdog. In this tutorial we provide an introduction to the vectorization extensions of OpenMP 4.0 and the upcoming version. Simplified examples extracted from recent Intel Parallel Computing Center projects will be used as demonstrations. Attendees will get a set of different examples to become accustomed to the different vectorization techniques of the latest OpenMP standards.
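As a rough idea of what the OpenMP 4.0 vectorization extensions look like in practice, the sketch below marks two simple loops with the simd directive. These are not the tutorial's examples; the function names saxpy and dot are chosen here purely for illustration.

```c
#include <stddef.h>

/* y = a*x + y. The simd directive asks the compiler to vectorize the
 * loop, asserting that iterations can safely execute in SIMD lanes. */
void saxpy(size_t n, float a, const float *x, float *y) {
    #pragma omp simd
    for (size_t i = 0; i < n; ++i)
        y[i] = a * x[i] + y[i];
}

/* Dot product. The reduction clause gives each SIMD lane its own
 * partial sum, which is combined into 'sum' after the loop. */
float dot(size_t n, const float *x, const float *y) {
    float sum = 0.0f;
    #pragma omp simd reduction(+:sum)
    for (size_t i = 0; i < n; ++i)
        sum += x[i] * y[i];
    return sum;
}
```

Note that the directive is a promise from the programmer: if x and y overlap in saxpy, the vectorized loop may give different results from the sequential one.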

Want more? OpenMP is the dominant programming model for shared-memory parallelism in C, C++ and Fortran due to its easy-to-use directive-based style, portability and broad support by compiler vendors. Compute-intensive application regions are increasingly being accelerated using specialized heterogeneous devices and a programming model with similar characteristics is needed here. This tutorial on OpenMP 4.x Accelerator Model will focus on the OpenMP 4.0 accelerator model that provides such a programming model.

For this half-day tutorial we assume attendees have a basic understanding of OpenMP concepts. We quickly review the OpenMP programming topics that are most relevant to the accelerator model. We focus on how the OpenMP execution and memory models were extended to support heterogeneous devices. We cover the new device constructs and API routines that were added in OpenMP 4.0, and we work through some example code. Finally, we preview some of the features coming in OpenMP 4.1.
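For readers who have not seen the device constructs yet, here is a minimal sketch of an offloaded vector addition using the OpenMP 4.0 target construct and map clauses. It is an illustration written for this post, not the tutorial's example code, and the function name vec_add is an assumption.

```c
/* Offload c = a + b to the default device, if one is available.
 * map(to: ...) copies the inputs to the device before the region;
 * map(from: ...) copies the result back to the host afterwards. */
void vec_add(int n, const float *a, const float *b, float *c) {
    #pragma omp target teams distribute parallel for \
        map(to: a[0:n], b[0:n]) map(from: c[0:n])
    for (int i = 0; i < n; ++i)
        c[i] = a[i] + b[i];
}
```

The array sections such as a[0:n] tell the runtime how much data each pointer refers to, which is part of how the memory model was extended for devices with separate memory. When no accelerator is present, implementations typically run the region on the host, so the same source works in both settings.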

Please consider attending by signing up here. In the meantime, we are looking for students and volunteers to help with the conference. Please connect with OpenMPCon if you wish to help.

Grill the OpenMP CEO/ARB/Language Committee members and Plan the next OpenMPCon

OpenMPCon this month aims to bring a stellar lineup of the latest industry gurus, users and developers together with the language designers. As such, we have three keynotes along with two full-day tutorials and a day and a half of talks. You can see the first keynote, tutorial and the first of three talks here. We also posted the second of three keynotes by Professor William Tang of Princeton University as well as the second series, third series, and fourth series of talks. The third keynote is also here, and we will now describe the evening sessions on Grill the Committee and Plan the next OpenMPCon.

So what would you like to know about how the OpenMP Specification happens (especially OpenMP 4.1, which is scheduled to be released in a month at SC15), or about the membership/organizational changes to the OpenMP ARB? Or maybe you would just like to grill the current or past CEOs?

On Tuesday evening, right after the Tuesday talks and before the dinner, the Grill the Committee session will offer that chance: the panel is made up of members of the OpenMP ARB and Language Committee, and the audience asks the questions. Members currently expected to be there are:

CEOs past and present:

  • Michael Wong
  • Larry Meadows
  • Tim Mattson

ARB and Language members present:

  • Eric Stotzer
  • James Beyer
  • Xinmin Tian
  • Alice Koniges
  • Oscar Hernandez
  • Mark Bull
  • Michael Klemm
  • Yun He

and many others. You ask the questions, and give feedback about OpenMP.

On Wednesday evening, at the end of all the talks, if you still want to stick around (which means you are really interested) or are waiting for IWOMP to start the next day, we will hold a Planning the next OpenMPCon panel. The next OpenMPCon is tentatively planned for Kyoto, Japan, sometime in October 2016. We would like to hear from you what we did well and what needs improving, but best of all we would welcome your volunteer help in organizing the next OpenMPCon.

OpenMPCon 2015 Talk Series 4

OpenMPCon this month aims to bring a stellar lineup of the latest industry gurus, users and developers together with the language designers. As such, we have three keynotes along with two full-day tutorials and a day and a half of talks. You can see the first keynote, tutorial and the first of three talks here. We also posted the second of three keynotes by Professor William Tang of Princeton University as well as the second series and third series of talks. The third keynote is also here, and we will now describe the fourth series of talks.

Want to know how OpenMP is used in US National Labs, especially at NERSC? NERSC is the primary supercomputing facility for the Office of Science in the US Department of Energy (DOE). NERSC's next production system will be an Intel Xeon Phi Knights Landing (KNL) system, with 60+ cores per node and 4 hardware threads per core. The recommended programming model is hybrid MPI/OpenMP, which also promotes portability across different system architectures.

OpenMP usage statistics on current NERSC production systems, such as the percentage of codes using OpenMP and the typical number of threads used, will be analyzed. The talk will describe how NERSC advises its users to use OpenMP efficiently with multiple compilers on various NERSC systems, including how to obtain the best process and thread affinity for hybrid MPI/OpenMP, memory locality with NUMA domains, programming tips for adding OpenMP, strategies for improving OpenMP scaling, how to use nested OpenMP, and the tools available for OpenMP. Tuning examples that improve OpenMP performance in real scientific user codes will also be presented.

Manuel Arenaz will demonstrate A Success Case using Parallware. The manual parallelization of existing code is usually a tedious and error-prone task, especially in the case of large projects. Parallware is the first commercial OpenMP-enabling source-to-source compiler that automatically adds OpenMP capabilities to scientific programs. The compiler automatically discovers the parallelism available in sequential codes written in the C programming language. It produces human-readable code annotated with OpenMP directives, instead of a binary executable file. The talk analyzes the parallelization of the program EP from the NAS Parallel Benchmarks (NPB) suite. Performance results show that, starting from the original sequential version and applying some simple code refactorings, Parallware is able to generate efficient OpenMP parallel code automatically.

OpenMPCon 2015 talks Series 3

OpenMPCon this month aims to bring a stellar lineup of the latest industry gurus, users and developers together with the language designers. As such, we have three keynotes along with two full-day tutorials and a day and a half of talks. You can see the first keynote, tutorial and the first of three talks here. We also posted the second of three keynotes by Professor William Tang of Princeton University as well as the second series of talks. The third keynote is also here, and we will now describe the third series of talks.

Want to know what it takes to port OpenACC 2.0 to OpenMP 4.0? Oscar Hernandez of Oak Ridge NL has done it and can show you the way as he presents code comparisons to show how each API is used to parallelize representative code fragments. Furthermore, he will give guidelines for developers wishing to convert codes from OpenACC 2.0 to OpenMP 4.0.

Alice Koniges of Lawrence Berkeley NL will describe what it takes to Enable Application portability across HPC platforms using Open Standards with an aim towards User-oriented goals for OpenMP. Portability plus performance are key requirements for large-scale scientific simulations on the path to exascale. Users of the high-end computing facilities such as the National Energy Research Scientific Computing Center (NERSC) and the Oak Ridge Leadership Computing Facility (OLCF) are demanding portable standards to enable their codes to run on differing high performance computing (HPC) architectures with relatively little user intervention between differing versions that have been optimized for performance.

The emerging OpenMP standards are poised to offer such portability. In this presentation, she will discuss several important goals and requirements of portable standards in the context of OpenMP.

Want to know about Effective OpenMP SIMD Vectorization for Intel Xeon and Xeon Phi Architectures? There is no better guru than Intel’s Xinmin Tian, who will show how to efficiently exploit SIMD vector units to achieve high performance for application code running on Intel® Xeon and Xeon Phi™. In this talk, he will present the Intel® compiler framework that supports the OpenMP 4.0/4.1 SIMD extensions, and also present a set of key vectorization techniques, such as function vectorization, masking support, uniformity and linearity propagation, alignment optimization, gather/scatter optimization, and remainder and peeling loop vectorization, that are implemented inside the Intel® C/C++ and Fortran product compilers for Intel® Xeon processors and Xeon Phi™ coprocessors.
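To make "function vectorization" a little more concrete, here is a small sketch using the OpenMP 4.0 declare simd directive. It does not come from the talk and says nothing about how the Intel compilers implement it; the function names scaled_square and apply are invented for this illustration.

```c
/* Ask the compiler to also generate a SIMD version of this function.
 * uniform(scale): 'scale' has the same value in every SIMD lane.
 * notinbranch: the function is never called from inside a conditional
 * within a SIMD loop, so no masked variant is required. */
#pragma omp declare simd uniform(scale) notinbranch
static float scaled_square(float x, float scale) {
    return scale * x * x;
}

/* With a SIMD variant available, the call no longer prevents the
 * enclosing loop from being vectorized. */
void apply(int n, const float *in, float *out, float scale) {
    #pragma omp simd
    for (int i = 0; i < n; ++i)
        out[i] = scaled_square(in[i], scale);
}
```

Clauses such as uniform and linear are how the programmer passes on the uniformity and linearity information mentioned above when the compiler cannot see it across a function boundary.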

OpenMPCon’s third keynote on Embedded Computing with OpenMP

OpenMPCon this month aims to bring a stellar lineup of the latest industry gurus, users and developers together with the language designers. As such, we have three keynotes along with two full-day tutorials and a day and a half of talks. You can see the first keynote, tutorial and the first of three talks here. We also posted the second of three keynotes by Professor William Tang of Princeton University as well as the second series of three talks.

In this post, I would like to announce the third keynote, by Texas Instruments’ Eric Stotzer, on Towards Programming Embedded Systems with OpenMP.

Software for embedded systems is more complex than in the past, as more functions are implemented on the same device. This talk will provide an overview of the characteristics of embedded systems and discuss features that could be added to OpenMP to enable it to better serve as a programming model for these systems. Embedded systems are typically constrained by, among other things, real-time deadlines, power limitations and limited memory resources. Today OpenMP is not able to express these types of constraints. Embedded systems applications can be broadly classified as event-driven or compute and data intensive. OpenMP is well suited to expressing the parallel execution that is demanded by compute and data intensive applications. However, extensions are needed for event-driven applications, such as automotive embedded systems, where the behavior is characterized by a defined sequence of responses to a given incoming event from the environment. While the actions performed may not be compute or data intensive, they tend to have strict real-time constraints. The use of multicore technology has increased the design space and performance of Multiprocessor Systems-on-Chip (MPSoCs) that are targeted at embedded applications. A natural extension is to adapt the device construct added in OpenMP 4.0 to support the mapping of different software tasks, or components, to various processor cores.

Eric Stotzer (Ph.D. Computer Science) is a member of the Software Development Organization’s Compiler team. He has been at TI for 25 years, working on software development tools, compilers, architectures, and parallel programming models.

Why are we charging for SG14 Games Dev/Low Latency meeting at CPPCON 2015? (and how you can get in for free)

We are less than a month away from CPPCon 2015 near Seattle, the premier event for C++ boot camp. I have two talks scheduled: one on Memory Model and Atomics in C++11/14/17 and a second one on the Birth of SG14. I also have a Grill the Committee Panel.

But I am most excited about a new add-on event. We presented evidence of interest to the Standard committee at the Lenexa meeting and were approved to form SG14, a subgroup for games development, low latency, real time, simulation and, I might add, banking/finance. That name is a bit too long to be usable, so we have shortened it to the first two labels. As such, we will be chairing a full-day meeting of SG14 in a room at the Meydenbauer Center, concurrent with CPPCon, on Wednesday, Sept 23rd. This enables game developers who cannot attend C++ Standard meetings to attend, with Committee members present to evaluate their proposals. A second meeting is already set up for March 14-18, 2016 at GDC 2016, hosted by Sony (thank you, Sony).

If you have been registering, you will have noticed on the CPPCon registration page that we are charging $25 to attend SG14. You may wonder why there is a cost to attend an SG14 meeting. The reason is really simple, and I will show you a way to get in for free simply for reading this blog!

The SG14 meeting is a real C++ Std working meeting where we will be triaging and evaluating proposals, and giving feedback to authors all day (or until all proposals are done). The room we are given is limited in seating capacity (about 50, I am told). So we need to give preference to paper authors, C++ committee members and truly interested attendees. The conference organizers told me, and I agree, that offering a free event with limited seating would be asking for it to be filled (because what is the harm in signing up if you can just show up and leave), leaving no space for the people who really need to be there.

So, if you are a paper author or a C++ Committee member who intends to be there most of the day to help evaluate proposals, just send me an email reply on the C++ Standard reflector and I will add you to the protected list of people who can get in for free. Many people are already on that list, but those 50 seats will run out fast.

Who are the paper authors? They have been busy discussing some of the following topics on the SG14 reflector, and for each there is a good chance a paper will be ready to be discussed. Their authors also have a bye into SG14, and I know who they are.

  • flat_map
  • fixed point
  • uninitialized algorithms
  • string stuff
  • rolling queues
  • intrusive containers
  • EH costs
  • Compare virtual function and see if a class has implementation or not
  • thread safe STL

If you fit none of those (not a C++ std committee member, not on SG14, not a paper author) and are still interested in attending, you should join the SG14 discussion first, then email me on SG14, here or at my gmail address, and ask for a free ticket with some justification as to why you might stick through it all. I will be happy to grant it, as the aim of this is truly not to make any money. It is simply one of the few gatekeeping methods we have to keep a limited-capacity room from being flooded, assuming we have the luxury of having that problem :)

Finally, CPPCon 2015 will have a number of talks which seem to be games related. I triaged the list with the help of Sean Middleditch and Nicolas Guillemot. I can’t say for sure, as I have not contacted each author yet, but in rough order of likelihood they are the following (thanks to Jon Kalb for sending me the correct CPPCon links):

  • Definitely games related:

    • C++ for cross-platform VR development:

http://cppcon2015.sched.org/event/7212a9da0198fcfd8de5c05be21b667c

    • Testing Battle.net (before deploying to millions of players):

http://cppcon2015.sched.org/event/ac2534ecb08510c5810e7df34cdddb94

    • The current memory and C++ debugging tools used at Electronic Arts:

http://cppcon2015.sched.org/event/a9bccd0c3f6beb05752b36a4197a1deb

    • The Birth of SG14:

http://cppcon2015.sched.org/event/0404d7fede126851710420c16218cdb9

  • Probably interesting to games developers:

    • Live lock-free or deadlock (practical Lock-free programming)

http://cppcon2015.sched.org/event/595740ce3bab0220cc3c22fa92777830

    • Reflection techniques in C++:

http://cppcon2015.sched.org/event/1d5b459ba8433d8e5effad7a862d599a

    • Cross-Platform Mobile App Development with Visual C++ 2015

http://cppcon2015.sched.org/event/7104a3140c2ba28cdd0a68e323f78eb2

    • How to make your data structures wait-free for reads:

http://cppcon2015.sched.org/event/34d0ca4052e1acad959c725584329dd7

    • C++11/14/17 Atomics the Deep dive: the gory details, before the story consumes you!

http://cppcon2015.sched.org/event/6f91922313cebd5a25369c05a56d4359

    • C++ Atomics: The Sad Story of memory_order_consume: A Happy Ending at Last?

http://cppcon2015.sched.org/event/6d97f88ae259e8103f23830ae350dc30

    • C++ in the Audio Industry

http://cppcon2015.sched.org/event/1cded491a6eeea3a5e5f1541af80a2a7

    • 3D Face Tracking and Reconstruction using Modern C++

http://cppcon2015.sched.org/event/d5f2c8bdd2fbdee420fa24f166f8bdec

    • Implementation of a component-based entity system in modern C++14

http://cppcon2015.sched.org/event/eb915d37a737d8ace0fbb9e4b5892f6d

  • Probably less interesting to games developers:

    • C++ Multi-dimensional Arrays for Computational Physics and Applied Mathematics

http://cppcon2015.sched.org/event/3ec0f48e8500cb20789d2935facca8c5

    • CopperSpice: A Pure C++ GUI Library

http://cppcon2015.sched.org/event/e27044d13660bf65b8a799dac1eff177

I hope to see you at the conference, and I hope you will attend SG14 despite this $25 charge, because now you know how to get in for free!