FlexVec: Auto-Vectorization for Irregular Loops (PLDI 2016 - Research Papers)

Who

Sara Baghsorkhi, Nalini Vasudevan, Youfeng Wu

Track

PLDI 2016 Research Papers

Time Zone

The program is currently displayed in (GMT-07:00) Tijuana, Baja California.

Use conference time zone: (GMT-07:00) Tijuana, Baja CaliforniaSelect other time zone

The GMT offsets shown reflect the offsets at the moment of the conference.

Time Band

By setting a time band, the program will dim events that are outside this time window. This is useful for (virtual) conferences with a continuous program (with repeated sessions).
The time band will also limit the events that are included in the personal iCalendar subscription service.

Display full programSpecify a time band

Save

When

Fri 17 Jun 2016 11:00 - 11:30 at Grand Ballroom San Rafael - Parallelism II Chair(s): Albert Cohen

Abstract

Traditional vectorization techniques build a dependence graph with distance and direction information to determine whether a loop is vectorizable. Since vectorization reorders the execution of instructions across iterations, in general instructions involved in a strongly connected component (SCC) are deemed not vectorizable unless the SCC can be eliminated using techniques such as scalar expansion or privatization. Therefore, traditional vectorization techniques are limited in their ability to efficiently handle loops with dynamic cross-iteration dependencies or complex control flow interweaved within the dependence cycles. When potential dependencies do not occur very often, the end-result is under utilization of the SIMD hardware. In this paper, we propose FlexVec architecture that combines new vector instructions with novel code generation techniques to dynamically adjusts vector length for loop statements affected by cross-iteration dependencies that happen at runtime. We have designed and implemented FlexVec’s new ISA as extensions to the recently released AVX-512 ISA. We have evaluated the performance improvements enabled by FlexVec vectorization for 11 C/C++ SPEC 2006 benchmarks and 7 real applications with AVX-512 vectorization as baseline. We show that FlexVec vectorization technique produces a Geomean speedup of 9% for SPEC 2006 and a Geomean speedup of 11% for real applications.

Sara Baghsorkhi

Intel Labs

Nalini Vasudevan

Google

Youfeng Wu

Intel Corporation

FlexVec: Auto-Vectorization for Irregular Loops - Sara Baghsorkhi

slides [pdf]

slides [pptx]