Primeur weekly 2014-11-10

Special

Explicit Vector Programming with OpenMP 4.0 SIMD Extensions ...

Focus

BIG project develops roadmaps to respond to Big Data opportunities in Europe ...

Embassy Cloud is bringing bio-informatic data analysis to computing and data services infrastructure ...

ANSYS to introduce engineering simulation applications in the Cloud ...

The Cloud

ConPaaS team to release ConPaaS 1.4.2 ...

IBM unveils industry's first intelligent Cloud security portfolio for global businesses ...

EuroFlash

ISC 2015 programme to offer greater diversity ...

Saving lots of computing capacity with a new algorithm ...

New research lights the way to super-fast computers ...

University of Edinburgh - EPCC to issue Call for Participation for the Exascale Applications and Software Conference ...

Two photons strongly coupled by glass fiber ...

HP and Wind River partner to create carrier grade HP Helion OpenStack solutions for NFV ...

Bull to issue Q3 financial results ...

USFlash

Chematria launches search for new Ebola treatments using artificial intelligence ...

HP delivers Flash Storage performance and efficiency without complexity of rip-and-replace ...

Industry collaboration drives 100G foundation for supercomputer infrastructure ...

TACC to expand with $20 million new building in 2016 ...

NVIDIA announces financial results for third quarter fiscal 2015 ...

Stream Integration Announces Availability of the Stream Grid Workbench ...

Oracle announces latest release of Oracle Database Appliance software ...

Eurocom's mobile supercomputer now supports the most powerful graphics, NVIDIA GTX 980M in SLI ...

Explicit Vector Programming with OpenMP 4.0 SIMD Extensions

10 Nov 2014 Santa Clara, Livermore - Modern CPUs and GPUs integrate SIMD execution units (also called VPUs, Vector Processing Units) on die to achieve higher performance and power efficiency, but using this SIMD hardware effectively remains a challenge. Wide vector registers and SIMD instructions - single instructions operating on multiple data elements packed in wide registers, as in AltiVec [2], SSE, AVX [10] and MIC [9] - pose a compilation challenge that is greatly eased through programmer hints. While many applications are implemented using OpenMP [13, 17], a widely accepted industry standard for exploiting thread-level parallelism, to leverage the full potential of today's multi-core architectures, no industry standard has offered any means to express SIMD parallelism. Instead, each compiler vendor has provided its own vendor-specific hints for exploiting vector parallelism, or programmers have relied on the compiler's automatic vectorization capability, which is known to be limited because many program factors are unknown at compile time.

To alleviate the situation for programmers, the OpenMP language committee added SIMD constructs to OpenMP to support vector-level parallelism. These new constructs give programmers a standardized way to express SIMD parallelism, so they no longer need non-portable, vendor-specific vectorization intrinsics or directives. In addition, the SIMD constructs give the compiler additional knowledge about the code structure and allow for better vectorization that blends well with parallelization. To the best of our knowledge, the OpenMP 4.0 specification is the first industry standard that includes explicit vector programming constructs for programmers.
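As a rough illustration of what such a construct looks like in C, the sketch below applies the OpenMP 4.0 `simd` directive to two simple loops. The function names `saxpy_simd` and `dot_simd` are hypothetical and do not come from the paper; the code only assumes a compiler that accepts OpenMP 4.0 pragmas (a compiler without OpenMP support ignores them and runs the loops as ordinary scalar code).

```c
#include <stddef.h>

/* Hypothetical example: y[i] = a * x[i] + y[i].
 * The simd directive tells the compiler that the iterations may be
 * executed together in SIMD lanes; the programmer is asserting that
 * x and y do not overlap in a way that creates a dependence. */
void saxpy_simd(int n, float a, const float *x, float *y)
{
    #pragma omp simd
    for (int i = 0; i < n; i++) {
        y[i] = a * x[i] + y[i];
    }
}

/* Hypothetical example: dot product of x and y.
 * The reduction clause gives each SIMD lane a private partial sum
 * and combines the partial sums when the loop finishes. */
float dot_simd(int n, const float *x, const float *y)
{
    float sum = 0.0f;

    #pragma omp simd reduction(+:sum)
    for (int i = 0; i < n; i++) {
        sum += x[i] * y[i];
    }
    return sum;
}
```

With GCC or Clang, building with `-fopenmp` (or `-fopenmp-simd` to enable only the SIMD directives) is one way to have these pragmas honored; whether a given loop is actually vectorized still depends on the compiler and target.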

This paper describes the C/C++ and Fortran SIMD extensions for explicit vector programming available in the OpenMP 4.0 specification. We explain the semantics of SIMD constructs and clauses with simple examples. Sections 3 and 4 provide a set of explicit vector programming guidelines and programming examples to help programmers write efficient SIMD programs that achieve higher performance. Section 5 presents a case study of achieving a ~2000x performance speedup using OpenMP 4.0 PARALLEL and SIMD constructs on Intel Xeon Phi coprocessors. Section 6 summarizes this paper.
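To give a feel for how thread-level and vector-level parallelism can be combined, the sketch below uses the OpenMP 4.0 combined `parallel for simd` construct together with `declare simd`. It is not the code from the paper's case study; the names `square_diff` and `sum_sq_diff` are made up for illustration, and no performance figure should be inferred from it.

```c
/* Hypothetical helper. declare simd asks the compiler to also generate
 * a vector version of this function that can be called from a SIMD loop. */
#pragma omp declare simd
static inline float square_diff(float a, float b)
{
    float d = a - b;
    return d * d;
}

/* Hypothetical example: sum of squared differences between a and b.
 * The loop iterations are divided among threads (parallel for), each
 * thread's share is executed with SIMD instructions (simd), and the
 * partial sums are combined through the reduction clause. */
float sum_sq_diff(int n, const float *a, const float *b)
{
    float total = 0.0f;

    #pragma omp parallel for simd reduction(+:total)
    for (int i = 0; i < n; i++) {
        total += square_diff(a[i], b[i]);
    }
    return total;
}
```

Because floating-point addition is not associative, a parallel reduction like this one may produce results that differ slightly from a serial loop for general inputs.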

Keywords: OpenMP, Explicit Vectorization, SIMD programming model, Multicore

The complete article can be downloaded at http://primeurmagazine.com/repository/PrimeurMagazine-AE-PR-12-14-32.pdf

Authors:

Xinmin Tian (1), Bronis R. de Supinski (2)

(1) Intel Corporation, Santa Clara, California, USA

(2) Lawrence Livermore National Laboratory, Livermore, California, USA

OpenMP Architecture Review Board (ARB)

Email: xinmin.tian@intel.com, bronis@llnl.gov
Source: OpenMP
