
Thursday, November 3
 

8:00am

Registration & Breakfast
Thursday November 3, 2016 8:00am - 9:00am
0 - Lobby Area

9:00am

Welcome
Speakers
avatar for Tanya Lattner

Tanya Lattner

President, LLVM Foundation


Thursday November 3, 2016 9:00am - 9:15am
1 - General Session (Rm LL20ABC)

9:15am

ORC -- LLVM's Next Generation of JIT API
ORC is a modular re-implementation of MCJIT that allows for more flexible configuration, better memory management, more fine-grained testing, and easier addition of new features. Its feature set includes all of MCJIT’s current functionality, plus built-in support for lazy and remote compilation. This talk describes ORC’s current features and design concepts, and provides demonstrations of how it can be used.

Speakers
LH

Lang Hames

Apple Inc.


Thursday November 3, 2016 9:15am - 10:15am
1 - General Session (Rm LL20ABC)

10:15am

BREAK
Thursday November 3, 2016 10:15am - 10:30am
0 - Lobby Area

10:30am

Hackers Lab

Topics Include: Clang Tools, Clang (CodeGen, Sema, ..), Static Analyzer, Exception Handling, ORC, MCJIT, Windows Support


Thursday November 3, 2016 10:30am - 11:15am
4 - Hackers Lab (Rm LL21EF)

10:30am

Causes of Performance Instability due to Code Placement in X86
Have you ever experienced significant performance swings in your application after seemingly insignificant changes? A random NOP shifting code addresses causing a 20% speedup or regression? This talk will explore some of the common and not-so-common architectural reasons why code placement/alignment can affect performance on older and newer x86 processors. We will share ideas on how compilers can avoid or fix some of these issues; other very low-level issues have no good compiler solutions, but are still important to be able to recognize and identify.

Speakers
avatar for Zia Ansari

Zia Ansari

Principal Engineer, Intel



Thursday November 3, 2016 10:30am - 11:15am
2 - Technical Talk (Rm LL21AB)

10:30am

Intrinsics, Metadata, and Attributes: The story continues!
This talk is a sequel to my talk at the 2014 LLVM Developers' Meeting, in which I discussed @llvm.assume; scoped-noalias metadata; and parameter attributes that specify pointer alignment, dereferenceability, and more. The past two years have seen changes to the metadata representation itself (e.g. distinct vs. uniqued metadata), as well as new metadata that specify pointer alignment, dereferenceability, control loop optimizations, and more. Several new attributes and intrinsics allow for more-detailed control over pointer-aliasing and control-flow properties, and new intrinsics to support masked and scatter/gather memory accesses have been added. Support for older features, such as fast-math flags and the returned attribute, have been significantly extended. I'll explain the semantics of many of these new features, their intended uses, and a few ways they shouldn't be used. Finally, I'll discuss how Clang exposes and leverages these new features to encourage the generation of higher-performance code.

Speakers
HF

Hal Finkel

Argonne National Laboratory


Thursday November 3, 2016 10:30am - 11:15am
1 - General Session (Rm LL20ABC)

11:15am

Hackers Lab
Topics Include: Clang Tools, Clang (CodeGen, Sema, ..), Static Analyzer, Exception Handling, ORC, MCJIT, Windows Support, X86 Backend

Thursday November 3, 2016 11:15am - 12:00pm
4 - Hackers Lab (Rm LL21EF)

11:15am

LLVM Coroutines: Bringing resumable functions to LLVM
Though invented back in 1957, coroutines are gaining popularity in this century. More and more languages adopt them to deal with lazily produced sequences and to simplify asynchronous programming. However, until recently, coroutines in high-level languages were distinctly not a zero-overhead abstraction. We are rectifying that by adding coroutine support to LLVM that finally allows high-level languages to have efficient coroutines.

In this talk, we will look at coroutine examples in C++ and LLVM IR, at optimization passes that deal with coroutines, and at the LLVM coroutine representation that C++ and other frontends can use to describe coroutines to LLVM.

LLVM coroutines are functions that can suspend their execution and return control back to their callers. Suspended coroutines can be resumed to continue execution when desired. 

Though coroutine support in LLVM is motivated primarily by the desire to support C++ Coroutines, the LLVM coroutine representation is language neutral and can be used to support coroutines in other languages as well. 

Clang + LLVM coroutines allow you to take this code:

generator<int> range(int from, int to) {
  for (int i = from; i < to; ++i)
    co_yield i;
}

int main() {
  int sum = 0;
  for (auto v : range(1, 100))
    sum += v;
  return sum;
}

And translate it down to this: 

define i32 @main() #5 {
entry:
  ret i32 4950
}

You can't get any better than that!

Speakers
avatar for Gor Nishanov

Gor Nishanov

Principal Software Engineer, Microsoft
Gor Nishanov is a Principal Software Design Engineer on the Microsoft C++ team. He works on design and standardization of C++ Coroutines, and on asynchronous programming models. Prior to joining C++ team, Gor was working on distributed systems in Windows Clustering team.


Thursday November 3, 2016 11:15am - 12:00pm
1 - General Session (Rm LL20ABC)

11:15am

Scalable Vectorization for LLVM
SVE is a new vector ISA extension for AArch64 targeted at HPC applications; one major distinguishing feature is that vector registers do not have a fixed size from a compiler perspective. This talk will cover the changes made to LLVM IR to support vectorizing loops in a vector length agnostic manner, as well as improvements in vectorization enabled by the predication and gather/scatter features of the extension. See https://community.arm.com/groups/processors/blog/2016/08/22/technology-update-the-scalable-vector-extension-sve-for-the-armv8-a-architecture for more details on the architecture.

Speakers
avatar for Amara Emerson

Amara Emerson

Senior Engineer, ARM
Senior Engineer in the HPC compilers and tools group at ARM. Talk to me about HPC, compiler optimizations, research collaborations, auto-vectorization and ARM SVE.
avatar for Graham Hunter

Graham Hunter

Senior Engineer, ARM
Compiler engineer in ARM's HPC and server tools division. OpenMP language committee member. Talk to me about HPC, compiler optimizations (particularly auto-vectorization), OpenMP, and ARM's SVE.


Thursday November 3, 2016 11:15am - 12:00pm
2 - Technical Talk (Rm LL21AB)

12:00pm

Let’s move to GitHub!
Multiple members of the community have recently been seriously considering a possible move of our repository to git (and GitHub). With hundreds of emails exchanged on the topic, it is worth gathering everyone in the same room to discuss this. We will consider the two main variants of the move proposal detailed here: http://llvm.org/docs/Proposals/GitHubMove.html
To help drive the discussion, you're invited to fill out this survey: https://goo.gl/forms/ZYs0Wv9g0w0ikCRQ2

Speakers
avatar for Mehdi Amini

Mehdi Amini

Compiler Engineer, Apple Inc.



Thursday November 3, 2016 12:00pm - 12:45pm
3 - BoF (Rm LL21CD)

12:00pm

Devirtualization in LLVM
Devirtualization, i.e. changing indirect virtual calls into direct calls, is an important C++ optimization.
This talk will cover past work on devirtualization, including optimizations made by the frontend and by LLVM using the !invariant.group metadata and the @llvm.assume intrinsic, as well as different LTO tricks. The speaker will also cover interesting problems he faced, future work, and ideas on how to make devirtualization better.

Speakers
avatar for Piotr Padlewski

Piotr Padlewski

University of Warsaw


Thursday November 3, 2016 12:00pm - 12:45pm
1 - General Session (Rm LL20ABC)

12:00pm

Extending LoopVectorizer towards supporting OpenMP4.5 SIMD and outer loop auto-vectorization
Currently, the LoopVectorizer in LLVM is specialized for auto-vectorizing innermost loops. The SIMD and DECLARE SIMD constructs introduced in OpenMP 4.0 and enhanced in OpenMP 4.5 are gaining popularity among performance-hungry programmers, due to the ability to specify a vectorization region much larger in scope than traditional inner-loop auto-vectorization would handle, and also because several advanced vectorizing compilers deliver impressive performance for such constructs. Hence, there is growing interest in the LLVM developer community in improving the LoopVectorizer to adequately support OpenMP functionality such as outer loop vectorization and whole function vectorization. In this technical talk, we discuss our approach to achieving that goal through a series of incremental steps, and to further extending the LoopVectorizer for outer loop auto-vectorization.

Speakers
HS

Hideki Saito

Principal Engineer, Intel
15 years focused on shared memory parallelism followed by 10 years focused on vectorization for IA32/Intel64 SIMD extensions. SPEC HPG rep from Intel (2001-2007), involved in the development of SPEC OMP2001, HPC2002, and MPI2007 benchmarks.


Thursday November 3, 2016 12:00pm - 12:45pm
2 - Technical Talk (Rm LL21AB)

12:45pm

LUNCH
Thursday November 3, 2016 12:45pm - 2:15pm
0 - Lobby Area

2:15pm

LLVM Foundation
Meet the new board of directors of the LLVM Foundation. Learn about our programs and mission and how to get involved.

Speakers
HF

Hal Finkel

Argonne National Laboratory
avatar for Arnaud de Grandmaison

Arnaud de Grandmaison

Principal Engineer, ARM
avatar for Tanya Lattner

Tanya Lattner

President, LLVM Foundation


Thursday November 3, 2016 2:15pm - 3:00pm
3 - BoF (Rm LL21CD)

2:15pm

Loop Passes: Adding new features while reducing technical debt
This year LLVM's loop passes have been greatly improved. Along with enabling new algorithms, such as new advanced loop unrolling heuristics, some long-standing problems have been addressed, resulting in significant compile-time improvements and, in general, a cleaner pass pipeline. We'll talk about our journey through the various loop passes, share thoughts on how to avoid some of the problems we encountered in the future, and share the methodology we used to find them.

Speakers

Thursday November 3, 2016 2:15pm - 3:00pm
1 - General Session (Rm LL20ABC)

2:15pm

rev.ng: a QEMU- and LLVM-based static binary analysis framework
rev.ng is an open-source static binary analysis framework based on QEMU
and LLVM. Its core component, revamb, is a static binary translator
whose aim is to translate a Linux program compiled for any of the 17
ISAs supported by QEMU and produce an equivalent binary for a, possibly
different, architecture supported by the LLVM compiler framework. 

revamb aims to translate and re-optimize legacy/closed source programs
but can also be employed for a number of security-related purposes,
such as retrofitting binary hardening techniques (e.g., CFI) or
instrumenting existing binaries with good performance figures (e.g., for
black box fuzzing purposes).

More in general, rev.ng can be used to perform binary analysis on a wide
range of architectures in the comfortable LLVM environment. As an
example, rev.ng can be used to recover high-level information such as
an accurate CFG and function boundaries from a binary program.

In its current state, revamb is able to successfully translate the 105
coreutils binaries compiled for ARM, x86-64 and MIPS, and to pass over 80%
of coreutils's test suite on all of them. The programs are linked
statically, so they include handwritten assembly, and their text sections
are on the order of hundreds of kilobytes.

Speakers
AD

Alessandro Di Federico

PhD student, Politecnico di Milano
I'm interested in several topics concerning the computer security field. My main focus is currently static binary analysis for reverse engineering purposes, but I've also been working in the system security and exploitation fields. I also have a strong interest in privacy, end-to-end encrypted communication systems and in the challenges posed by authentication of public-keys. | I love GNU/Linux and Free Software in general.


Thursday November 3, 2016 2:15pm - 3:00pm
2 - Technical Talk (Rm LL21AB)

3:00pm

Raising Next Generation of LLVM Developers
The LLVM project is more than a decade old. It thrives because of its great
development community. Students are a constant source of fresh blood, in the
form of both manpower and ideas.

There are many challenges in helping newcomers become more productive
and better integrated. Getting started, time to first bugfix, time to first accepted patch, and time to commit rights depend on the student's excellence, but on guidance and mentoring, too. Keeping students around after they complete their tasks is often one of the major challenges of a mentor's career.

I'd like to use this BoF to share experiences from working with students, and to discuss current issues, challenges, and opportunities in working with and raising the next generation of LLVM developers.

Speakers

Thursday November 3, 2016 3:00pm - 3:45pm
3 - BoF (Rm LL21CD)

3:00pm

A New Architecture for Building Software
Clang was written in part to deliver fast compile times for C & C++ code. However, the traditional way C compilers integrate with build systems places many limitations on how efficiently that can be done. This talk introduces llbuild -- a new framework for building build systems -- which was designed to help solve this problem, and envisions a new architecture for compiling software which would allow us to significantly improve compilation times for large software projects. 

Speakers
avatar for Daniel Dunbar

Daniel Dunbar

Software Engineer, Apple Inc
I work on build systems and the Swift Package Manager at Apple. I love testing infrastructure. Previously of Clang, KLEE, and Blender3D.


Thursday November 3, 2016 3:00pm - 3:45pm
2 - Technical Talk (Rm LL21AB)

3:00pm

GVN-Hoist: Hoisting Computations from Branches
Code-hoisting identifies identical computations across the program and hoists 
them to a common dominator so as to save code size. The main goal of 
code-hoisting is to reduce code size, with the added benefits of exposing 
more instruction-level parallelism and reducing register pressure; although 
removing redundancies is not its goal, it effectively exposes them and 
enables other passes, such as LICM, to remove more of them. 

We present a code hoisting pass that we implemented in LLVM, based on the 
Global Value Numbering infrastructure available in LLVM. The experimental 
results show an average of 2.5% savings in code size, although the code size 
increases in many cases because it enables more inlining. This is an 
optimistic algorithm in the sense that we consider all identical computations 
in a function as potential candidates to be hoisted. We make an extra effort 
to hoist candidates by partitioning the potential candidates in a way to 
enable partial hoisting in case common hoisting points for all the candidates 
cannot be found. We also formalize cases when register pressure will reduce as 
a result of hoisting.

Speakers
AK

Aditya Kumar

Senior Compiler Engineer, Samsung Austin R&D Center
I've been working on LLVM since 2012. I've contributed to GVNHoist, Hexagon specific optimizations, clang static analyzer, libcxx
avatar for Sebastian Pop

Sebastian Pop

Samsung Austin R&D Center
Loop optimizations, testing, benchmarks, performance tracking.


Thursday November 3, 2016 3:00pm - 3:45pm
1 - General Session (Rm LL20ABC)

3:45pm

BREAK
Thursday November 3, 2016 3:45pm - 4:15pm
0 - Lobby Area

4:15pm

Enhancing LLVM’s Floating-Point Exception and Rounding Mode Support
In this BoF session, we will discuss ways in which LLVM can be extended to support non-default floating-point behavior. Topics will include respecting FP rounding modes in optimization passes, preserving FP exception status, avoiding false FP exceptions and enabling run-time handling of FP exceptions. 

Speakers
avatar for Andy Kaylor

Andy Kaylor

Sr. Software Engineer, Intel
I've been a tools developer at Intel for 17 years and have been working with LLVM since 2012, contributing to areas such as MCJIT, LLDB and Windows exception handling. I'm about to dive into LLVM's representation and handling of floating point operations.
avatar for David Kreitzer

David Kreitzer

Principal Engineer, Intel
I have spent the majority of my adult life getting the Intel compiler to generate superb code for x86 processors. I developed many major pieces of functionality in the Intel compiler's back end including its register allocator. My current focus is to draw on that experience to help improve LLVM's generated code for x86.



Thursday November 3, 2016 4:15pm - 5:00pm
3 - BoF (Rm LL21CD)

4:15pm

Hackers Lab

Topics Include: Polly, Backends, Code Generation, SystemZ, PowerPC, ARM, AArch64, Vectorization, LLVM TableGen, MC Layer, Register Allocation, Instruction Selection, OpenMP

Thursday November 3, 2016 4:15pm - 5:00pm
4 - Hackers Lab (Rm LL21EF)

4:15pm

Lightning Talks

1) MemorySSA in Five Minutes - George Burgess IV
Abstract: MemorySSA is a utility that recently landed in LLVM. This talk will give a high-level introduction to what MemorySSA is and how we expect to use it.

2) Polly as an analysis pass in LLVM - Utpal Bora
In this talk, we will introduce a new interface to use polyhedral dependence analysis of Polly in LLVM transformation passes such as Loop Vectorizer. As part of GSoC 2016, we implemented an interface to Polly, and provided new APIs that can be used as an Analysis pass within LLVM's transformation passes. We will describe our implementation and demonstrate some loop transformations using the new interface (PolyhedralInfo). Details on GSoC- http://utpalbora.com/gsoc/2016.html

3) RISC-V: Towards a reference LLVM backend - Alex Bradbury
This talk will present work towards establishing RISC-V as a reference quality backend in LLVM. By maintaining a regularly updated patchset that implements a production quality backend alongside a companion tutorial, we can make LLVM development accessible to a much wider audience. We will explore the work that has been done to reach these goals, problems faced, and how you can contribute. 

4) Error -- Structured Error Handling in LLVM - Lang Hames
LLVM’s new Error scheme enables rich error handling and recovery by supporting user-defined error types and strict requirements on error checking. This talk provides an overview of how the scheme works, and how it can be used in your code.

5) Reducing the Computational Complexity of RegionInfo - Nandini Singhal
The LLVM RegionInfo pass provides a convenient abstraction to reason about independent single-entry-single-exit regions of the control flow graph. RegionInfo has proven useful in the context of Polly and the AMD GPU backend, but the quadratic complexity of RegionInfo construction due to the use of DominanceFrontier makes the use of control flow regions costly and consequently prevents the use of this convenient abstraction. In this work, we present a new approach for RegionInfo construction that replaces the use of DominanceFrontier with a clever combination of LoopInfo, DominanceInfo, and PostDominanceInfo. As these three analyses are (or will soon be) preserved by LLVM and consequently come at zero cost, while the quadratic cost of DominanceFrontier construction is avoided, the overall cost of using RegionInfo is greatly reduced, which makes it practical in many more cases. Several other problems in the context of RegionInfo still need to be addressed. These include how to remove the RegionPass framework, which makes little sense in the new pass manager; how to connect Regions and Loops better; and how to move from a build-everything-upfront analysis to an on-demand analysis that builds only the needed regions, step by step. We hope to discuss some of these topics with the relevant people of the LLVM community as part of the poster session.

6) Toward Fixed-point Optimization in LLVM - Nathan Wilson
As one might imagine, LLVM’s optimization pipeline is not universally optimal. Running the available optimization passes in a different order, and changing the number of times each runs, can improve performance on some programs. Anecdotal evidence has suggested that simply running the current optimization pipeline multiple times often yields better-performing programs. Given compile-time constraints, simply duplicating the current pipeline for all inputs would be unacceptable for many users. How many times is enough? What if we could be smarter about it to get the performance benefits without the compile-time cost? To answer this question, we implemented a fixed-point optimization scheme in LLVM and evaluated its use when compiling LLVM’s test suite. Under this scheme, the function-pass pipeline would continue to execute while the input IR differed from the final IR. This experiment revealed that we will be able to capture the performance benefits of repeating the pipeline at reduced cost, and that four times is enough.

7) FileCheck Follies - Paul Robinson
FileCheck is *the* critical tool for LLVM testing.  See how to make it NOT do what you want! Watch it silently pass a bogus test! All examples REAL and "ripped from the headlines" of the commits list! 


Speakers
avatar for Alex Bradbury

Alex Bradbury

Co-founder and Director, lowRISC CIC
LH

Lang Hames

Apple Inc.
PR

Paul Robinson

Staff Compiler Engineer, Sony Computer Entertainment America


Thursday November 3, 2016 4:15pm - 5:00pm
1 - General Session (Rm LL20ABC)

5:00pm

Representing composite SIMD operations in LLVM-IR
Loop Vectorizer currently translates scalar operations into a new sequence of SIMD operations, where every operation can be represented very naturally in LLVM-IR using its native instructions and vector operands. This sequence passes through common and target-specific optimizations before being lowered to target code. 
However, we aim to work with composite operations or idioms during vectorization. This requirement stems from the rich vector ISAs supported by targets – SIMD instruction sets include CISC-like operations such as clamping or saturating arithmetic, multiply-and-accumulate pairs, or sum-of-absolute-differences. Two additional categories of such non-primitive vector operations are non-consecutive forms of memory accesses and masked vector operations. Current LLVM-IR can support such idioms only via patterns of instructions or intrinsics, making cost estimation and/or subsequent optimization steps problematic. 
As a part of our drive to enhance vectorization in LLVM, we are revisiting the ability of LLVM-IR to represent composite SIMD operations. 
In this session we’d like to discuss the different categories of composite SIMD operations, alternative approaches to represent them and propose a new generic solution. 

Speakers

Thursday November 3, 2016 5:00pm - 5:45pm
3 - BoF (Rm LL21CD)

5:00pm

Hackers Lab
Topics Include: Polly, Backends, Code Generation, SystemZ, PowerPC, ARM, AArch64, Vectorization, LLVM TableGen, MC Layer, Register Allocation, Instruction Selection, OpenMP

Thursday November 3, 2016 5:00pm - 5:45pm
4 - Hackers Lab (Rm LL21EF)

5:00pm

Lightning Talks

1) Stack-use-after-scope detector in AddressSanitizer - Vitaly Buka

Stack-use-after-scope is a check in AddressSanitizer that detects accesses to a variable from outside the scope in which it was declared. This talk covers the issues we had to resolve to make the feature usable, the results of applying the check to Google code, and examples of bugs it detected.

2) How is a compiler frontend different from what an IDE needs? - Ilya Biryukov
We’ve been writing our own C++ frontends at JetBrains for a few years now. Given that most people use clang these days, it may come as a surprise that we don’t. In this short talk we’ll cover the reasons that drove our decision to roll our own implementation and try to highlight how it’s different from what’s being done in clang. 

3) Enabling Polyhedral Optimizations in Julia - Matthias Reisinger
Julia is a relatively young programming language with a focus on technical computing. While being dynamic it is designed to achieve high performance comparable to that of statically compiled languages. The execution of Julia programs is driven by a just-in-time compiler that relies on LLVM to produce efficient code at run-time. This talk highlights the recent integration of Polly into this environment which has enabled the use of polyhedral optimization in Julia programs.

4) Extending Clang AST Matchers to Preprocessor Constructs - Jeff Trull
Clang's libTooling provides powerful mechanisms for identifying and modifying source code via the AST. However, parts of the source code are hidden or obfuscated from these tools due to the action of the preprocessor. This is particularly true of legacy code, where applying refactoring tools is highly desirable. The speaker will demonstrate how to write an AST Matcher that identifies sections of code associated with preprocessor conditional directives, and will make suggestions on how to improve tooling in this area.

5) SMACK Software Verification Toolchain - Zvonimir Rakamaric
Tool prototyping is an essential step in developing novel software verification algorithms and techniques. However, implementing a verifier prototype that can handle real-world programs is a large endeavor. In this talk, we present the SMACK software verification toolchain. SMACK provides a modular and extensible software verification ecosystem that decouples the front-end source language details from back-end verification algorithms. It achieves that by translating from the LLVM compiler intermediate representation into the Boogie intermediate verification language. SMACK offers the following benefits: (1) it can be used as an automated off-the-shelf software verifier in an applied software verification project, (2) it enables researchers to rapidly develop and release new verification algorithms, and (3) it allows for adding support for new languages in its front-end. We have used SMACK to verify numerous C/C++ programs, including industry examples, showing it is mature and competitive. Likewise, SMACK is already being used in several existing verification projects.

6) Finding code clones in the AST with clang - Raphael Isemann
This talk will introduce clang’s new clone detection framework that uses hash-code comparison to search for groups of AST nodes that are similar in a certain configurable sense.


Speakers
IB

Ilya Biryukov

JetBrains
avatar for Vitaly Buka

Vitaly Buka

Software Engineer, Google Inc
avatar for Raphael  Isemann

Raphael Isemann

Mail: teemperor@gmail.com
avatar for Jeff Trull

Jeff Trull

Trull Consulting
Electronic CAD algorithms | Modern C++ design patterns


Thursday November 3, 2016 5:00pm - 5:45pm
1 - General Session (Rm LL20ABC)

6:00pm

RECEPTION
Food and drinks will be served. For registered reception attendees only please.

Thursday November 3, 2016 6:00pm - 9:00pm
0 - Lobby Area
 
Friday, November 4
 

9:00am

ThinLTO: Scalable and Incremental LTO
ThinLTO was first introduced at EuroLLVM 2015 as "A Fine-Grained Demand-Driven Infrastructure". The presentation was based on an early prototype built as a proof of concept. Taking this original concept, we redesigned it from scratch in LLVM by extending the bitcode format, redesigning the high-level workflow to remove the "demand-driven" iterative part, and adding new capabilities such as incremental build support. We added support to two linkers: Gold on Linux and ld64 on Darwin. 

In this presentation, we will go through the final design and how it is implemented in LLVM.

Speakers
avatar for Mehdi Amini

Mehdi Amini

Compiler Engineer, Apple Inc.
avatar for Teresa Johnson

Teresa Johnson

Software Engineer, Google


Friday November 4, 2016 9:00am - 10:00am
1 - General Session (Rm LL20ABC)

10:00am

Where are my variables? Debug info for optimized code
While the quality of debug info at -O0 has reached a satisfactory level, debugging code that was optimized by LLVM still poses a challenge, primarily because variable locations may get dropped at any time in the compilation. 

We will start by presenting statistics aimed at identifying the worst offenders among the compilation stages and highlight known problems including debug value location tracking in the backend, the register allocator, optimizing transformations, and shortcomings of LLVM IR, before opening the floor to a discussion on strategies for improving the quality of debug info for optimized code.

Speakers
avatar for Adrian Prantl

Adrian Prantl

Apple
Ask me about debug information in LLVM, Clang and Swift!


Friday November 4, 2016 10:00am - 10:45am
3 - BoF (Rm LL21CD)

10:00am

Killing poison and undef -- long live poison!

The current concept of poison in LLVM is known to be broken, leaving LLVM in a state where certain miscompilation bugs are hard or even impossible to fix. Moreover, the concepts of poison and undef values in LLVM are hard to reason about and are often misunderstood by developers.

However, we need concepts similar to poison and undef to enable certain important optimizations.

In this talk, we will present the motivation behind poison and undef and why they are broken. We'll also present a proposal to kill undef and extend poison, while retaining their performance benefits. 

This talk is meant to increase awareness of the issues and motivations behind poison/undef, and to discuss how to fix them.

Joint work with: Sanjoy Das, Gil Hur, Yoonseung Kim, Juneyoung Lee, David Majnemer, John Regehr, and Youngju Song.



Speakers
JL

Juneyoung Lee

Seoul National University
NL

Nuno Lopes

Researcher, Microsoft Research


Friday November 4, 2016 10:00am - 10:45am
1 - General Session (Rm LL20ABC)

10:00am

Leveraging Intermediate Forms for Analysis
In this presentation we will discuss and demonstrate an approach to building various Formal Methods (FM) tools on top of LLVM. FM has seen a significant increase in usage in software over the past decade, being applied in critical system design, security, and prototyping. We will discuss the benefits and drawbacks of LLVM IR for FM and the need for an Abstract Representation (AR) that allows for analysis via engineering approximations. In particular, we will talk about our approach and the tools that map to our chosen AR, developed at NASA, and about extending our initial set of analyses into more logical and hierarchical relationships. Lastly, we will present what we feel are the difficulties, future challenges, and successes of integrating FM tools with the LLVM community. 

Speakers
AS

Ayal Spitz

Intrepid


Friday November 4, 2016 10:00am - 10:45am
2 - Technical Talk (Rm LL21AB)

10:45am

BREAK
Friday November 4, 2016 10:45am - 11:15am
0 - Lobby Area

11:15am

Hackers Lab
Topics Include: Mid-level Optimizations, Pass Manager, OpenCL

Friday November 4, 2016 11:15am - 12:00pm
4 - Hackers Lab (Rm LL21EF)

11:15am

Compiler-assisted Performance Analysis
Optimization diagnostics have been part of LLVM for years. While important, these diagnostics had a narrow focus: providing user feedback on the success or failure of auto-vectorization. This work explores the possibility of extending this foundation in order to build up a full-fledged performance analysis tool set using the compiler. The talk will first lay out the elements of this tool set. Then we will evaluate and refine it through an exploration of real-world use cases.

Speakers
AN

Adam Nemet

Apple Inc.


Friday November 4, 2016 11:15am - 12:00pm
2 - Technical Talk (Rm LL21AB)

11:15am

Global Instruction Selection Status
Last year we presented a proposal to bring up a new instruction selection framework, GlobalISel, in LLVM. This talk will show the progress made on the design and implementation of that proposal, as well as point out the areas that still need to be developed.

As a backend developer, you will learn what it takes to start using GlobalISel for your target, and as an LLVM developer, you will see which aspects of GlobalISel require your contributions.

Speakers
AB

Ahmed Bougacha

Apple Inc.
avatar for Quentin Colombet

Quentin Colombet

Apple Inc.
TN

Tim Northover

Apple Inc.


Friday November 4, 2016 11:15am - 12:00pm
1 - General Session (Rm LL20ABC)

12:00pm

Shipping Software as LLVM IR
Many members of the LLVM community from both industry and academia
are working towards addressing an important problem:
shipping software as LLVM IR for more flexible analysis and transformation.
Examples of these efforts include technologies such as `-fembed-bitcode`,
ThinLTO, and WLLVM.

We propose a BoF for these parties and all interested to
meet and discuss the benefits and technical challenges involved,
learn about each other's goals and use-cases, and to identify
collaboration opportunities across these overlapping projects.

Our interest:
We at UIUC are developing a system called "ALLVM" in which
all components are represented as LLVM IR first and
foremost. Our goal is to explore the potential benefits of
the approach for improving performance, strengthening security,
and simplifying failure diagnosis for production code.
A second goal is to make ALLVM available widely as a platform for 
research. As part of this ongoing project we are
developing and automating the construction of complete
LLVM-based representations of real-world software, as well
as building an ecosystem of supporting tools.

Speakers
VA

Vikram Adve

University of Illinois, Urbana-Champaign
avatar for Will Dietz

Will Dietz

Grad Student, UIUC


Friday November 4, 2016 12:00pm - 12:45pm
3 - BoF (Rm LL21CD)

12:00pm

Hackers Lab
Topics Include: Mid-level Optimizations, Pass Manager, OpenCL

Friday November 4, 2016 12:00pm - 12:45pm
4 - Hackers Lab (Rm LL21EF)

12:00pm

Dealing with Register Hierarchies
Many architectures allow addressing parts of a register independently, be it
the infamous high/low 8-bit registers of X86, the 32/64-bit register views of
X86-64 and AArch64, or GPUs with wide loads and stores combined with
computation on sub-register lanes.

LLVM recently gained support to track liveness at sub-register granularity. In
combination with improved heuristics for register classes of varying sizes, the
average register count decreased by 20% for GPU shader programs.

This talk gives an introduction to typical situations benefiting from
sub-register liveness modeling. It shows how a target architecture developer can
model them and explains the register allocation techniques employed by LLVM.
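As a simplified illustration of the idea behind sub-register liveness (a toy model, not LLVM's actual `LaneBitmask` machinery), liveness can be tracked per lane of a register, so a use of only one lane does not keep the other lanes artificially live:

```python
# Toy model of sub-register liveness via per-lane bitmasks.
# Illustrative sketch only; LLVM's actual data structures are richer.

LANE0, LANE1 = 0b01, 0b10   # e.g. two 32-bit lanes of a 64-bit register

def live_lanes(instructions):
    """Walk instructions bottom-up, computing which lanes of a single
    register are live before each instruction. Each instruction is
    (kind, lanemask): a 'def' kills its lanes, a 'use' makes them live."""
    live = 0
    before = []
    for kind, mask in reversed(instructions):
        if kind == 'def':
            live &= ~mask        # defining a lane kills its previous value
        else:                    # 'use'
            live |= mask         # using a lane makes it live upwards
        before.append(live)
    before.reverse()
    return before

# A full-register def followed by a use of only lane 0:
prog = [('def', LANE0 | LANE1), ('use', LANE0)]
# Before the def nothing is live; before the use only lane 0 is live.
print(live_lanes(prog))  # [0, 1]
```

With a single whole-register liveness bit, the def above would appear to keep both lanes alive; tracking lanes separately is what frees lane 1 for another value.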

Speakers
avatar for Matthias Braun

Matthias Braun

Apple Inc.
I am an LLVM developer working on the code generation part of the compiler, specifically register allocation and scheduling.


Friday November 4, 2016 12:00pm - 12:45pm
1 - General Session (Rm LL20ABC)

12:45pm

LUNCH
Friday November 4, 2016 12:45pm - 2:15pm
0 - Lobby Area

2:15pm

PIR - A parallel LLVM-IR
Extending an existing compiler intermediate language with parallel constructs is 
a challenging task. Maintainability dictates a minimal extension that will not 
disturb too many of the existing analyses and transformations. At the same time, 
the parallel constructs need to be powerful enough to express different, 
orthogonal execution scenarios. For C/C++, OpenMP is one of the most 
prominent parallelization frameworks and, on its own, allows for multiple 
parallelization schemes. Additionally, other parallel languages such as 
OpenCL, CUDA or Cilk++ would profit from translation to lower-level parallel 
constructs. Finally, automatic parallelizers and new (partially) parallel 
languages such as Julia are best served by general parallel constructs that 
allow parallel (or, better, concurrent) execution to be expressed in an 
independent and intuitive way throughout the compilation. 

In this BoF we want to continue the discussion about PIR, a 
parallel extension of the LLVM-IR. The discussion began in the context of the 
LLVM-HPC working group on IR extensions for parallelization. We will introduce 
the design and concepts behind PIR and briefly report on the lessons learned 
during the ongoing development of our prototype. In the course of this 
introduction we will talk about the goals, common problems, and use cases 
that motivated our current design. Afterwards we will initiate an open-ended 
discussion with the audience, for which we allocate the majority of the time (≈20 minutes). 

Speakers
avatar for Johannes Doerfert

Johannes Doerfert

Researcher/PhD Student, Saarland University
SM

Simon Moll

Researcher/PhD Student, Saarland University


Friday November 4, 2016 2:15pm - 3:00pm
3 - BoF (Rm LL21CD)

2:15pm

CodeView, the Microsoft Debug Info Format, in LLVM
The number one piece of feedback we've heard from Windows users of Clang is that they want to be able to debug their programs in Visual Studio. More than that, though, there is a world of Windows tools, such as profilers, post-mortem crash analyzers, self-debugging tools (dbghelp), and symbol servers, that makes it really worth implementing CodeView support in LLVM. Since the last dev meeting, we've been hard at work studying the format and slowly adding support for it to LLVM. This talk will give an overview of the format, and then go back and focus on the aspects that most impacted our design decisions in Clang and LLVM. As others in the community have discovered while working on LTO, LLDB, modules, and llvm-dsymutil, type information can often end up being the dominating factor in the performance of the toolchain. CodeView has some interesting design choices for solving that problem that I will share. I will close by talking about where we want to go in the future, and how we will eventually use LLD to package our CodeView into a PDB file.

Speakers
RK

Reid Kleckner

Software Engineer, Google
I work on Clang, the C++ compiler. I specifically work on C++ ABI compatibility with MSVC, and other Windows-related issues in Clang.


Friday November 4, 2016 2:15pm - 3:00pm
2 - Technical Talk (Rm LL21AB)

2:15pm

Summary-based inter-unit analysis for Clang Static Analyzer
The ability to perform interprocedural analysis is one of the most powerful features of Clang Static Analyzer. This talk is devoted to the ongoing improvement of this feature. We will discuss our implementation of summary-based interprocedural analysis as well as cross-translation-unit analysis. These features allow faster analysis while finding a greater number of potential bugs. We are going to describe our implementation details and approaches and discuss their pros and cons.
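The general idea behind summary-based analysis can be sketched roughly as follows (a deliberately tiny model; the analyzer's real summaries are far richer than this, and the function and field names here are invented for illustration): each function is analyzed once to produce a reusable summary, which call sites then consult instead of re-analyzing the callee's body:

```python
# Toy summary-based analysis: each function gets a one-time summary
# ("may this function return None?"), which every call site reuses.

def summarize(fn_body):
    """A 'summary' here is just: can any return statement yield None?"""
    return any(ret is None for ret in fn_body['returns'])

def check_program(functions, calls):
    """Warn at call sites that dereference a possibly-None result."""
    summaries = {name: summarize(body) for name, body in functions.items()}
    warnings = []
    for caller, callee, derefs_result in calls:
        if derefs_result and summaries[callee]:
            warnings.append(f"{caller}: result of {callee}() may be None")
    return warnings

functions = {
    'lookup': {'returns': [None, 'node']},   # may return None
    'make':   {'returns': ['node']},         # never returns None
}
calls = [('render', 'lookup', True), ('render', 'make', True)]
print(check_program(functions, calls))
# ['render: result of lookup() may be None']
```

The payoff is that `lookup` is summarized once no matter how many call sites exist, which is what makes whole-program and cross-translation-unit analysis tractable.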

Speakers
avatar for Aleksei Sidorin

Aleksei Sidorin

Leading Software Engineer, Samsung R&D Institute Russia


Friday November 4, 2016 2:15pm - 3:00pm
1 - General Session (Rm LL20ABC)

3:00pm

Performance improvements in libcxx
We want to discuss the current state of libcxx performance with respect to libstdc++: which regressions we know about, and how we plan to fix them. A few of the regressions are: 

PR21192 - Reading from stdin is 1-2 orders of magnitude slower than using libstdc++ 
PR19708 - std::find is significantly slower than libstdc++. 
PR20837 - libc++'s std::sort is O(N^2) in the worst case (instead of O(N*ln(N))). 
PR26886 - libc++'s std::stable_sort also has a worst-case complexity issue. 
PR15456 - A faster implementation of std::function is possible 
PR16747 and PR21275 - Our unordered_multimap insert is much slower than libstdc++'s 
- Removing Heap usage 
- Missing features (and performance issues) in <regex> 
- Non-conforming allocators in the test suite 
- Identifying missing attributes in functions 
- Debug mode for the library

Speakers
avatar for Marshall Clow

Marshall Clow

Principal Engineer, Qualcomm
Marshall is a long-time LLVM and Boost participant. He is a principal engineer at Qualcomm, Inc. in San Diego, and the code owner for libc++, the LLVM standard library implementation. He is the author of the Boost.Algorithm library and maintains several other Boost libraries.
AK

Aditya Kumar

Senior Compiler Engineer, Samsung Austin R&D Center
I've been working on LLVM since 2012. I've contributed to GVNHoist, Hexagon specific optimizations, clang static analyzer, libcxx
avatar for Sebastian Pop

Sebastian Pop

Samsung Austin R&D Center
Loop optimizations, testing, benchmarks, performance tracking.


Friday November 4, 2016 3:00pm - 3:45pm
3 - BoF (Rm LL21CD)

3:00pm

Developing and Shipping Clang with CMake
In LLVM 3.8 the autoconf build system was deprecated and it was removed in favor of the newer CMake system starting in 3.9. This talk provides a brief introduction to the CMake programming language to ensure everyone basic familiarity. It will include a post-mortem on the LLVM autoconf->CMake transition, and discuss some of the useful features of the LLVM CMake build system which can improve developer productivity. We will explore a case study on packaging and shipping an LLVM toolchain with CMake including an in-depth explanation of many of the new features of the LLVM CMake build system. Lastly it will provide a status report of the current state of the build system as well as presenting some of the future improvements on the horizon.

Speakers
CB

Chris Bieneman

LLVM Engineer, Apple Inc


Friday November 4, 2016 3:00pm - 3:45pm
2 - Technical Talk (Rm LL21AB)

3:00pm

Reducing Code Size Using Outlining
Maintaining a low code size overhead is important in computing domains where memory is a scarce resource. Outlining is an optimization which identifies similar regions of code and replaces them with calls to a function. This talk introduces a novel method of compressing code using an interprocedural outliner on LLVM MIR.
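A toy illustration of the outlining idea (the MIR outliner presented in the talk is considerably more sophisticated; this sketch only shows the basic replace-repeats-with-calls transformation): repeated instruction sequences are replaced with calls to a freshly created function whose body is emitted once:

```python
# Toy outliner: replace every occurrence of a repeated instruction
# sequence with a CALL to a new "function", emitted once at the end.

def outline(program, seq):
    """Replace each occurrence of `seq` in `program` with a call and
    append the outlined body once. `program` is a list of instruction
    strings; a real outliner would also have to *find* `seq`."""
    out, i, found = [], 0, False
    while i < len(program):
        if program[i:i + len(seq)] == seq:
            out.append('CALL outlined_0')   # one instruction per occurrence
            i += len(seq)
            found = True
        else:
            out.append(program[i])
            i += 1
    if found:
        out += ['outlined_0:'] + seq + ['RET']  # body emitted exactly once
    return out

prog = ['mov a', 'add b', 'mul c', 'sub d', 'mov a', 'add b', 'mul c']
small = outline(prog, ['mov a', 'add b', 'mul c'])
print(small)
# Each occurrence shrinks to one CALL; the savings grow with the
# length of the sequence and the number of occurrences.
```

The size trade-off is visible even in the toy: each occurrence costs one call instruction, plus a one-time cost for the outlined body, so long sequences with many occurrences win.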

Speakers
avatar for Jessica Paquette

Jessica Paquette

Apple Inc.
I like graph theory, making code small, dogs, and lifting weights.


Friday November 4, 2016 3:00pm - 3:45pm
1 - General Session (Rm LL20ABC)

3:45pm

POSTER SESSION

CFL-based context sensitive alias analysis in LLVM - Jia Chen
This poster presents CFL-Steens-AA and CFL-Anders-AA.

Enabling Polyhedral Optimizations in Julia - Matthias Reisinger
Julia is a relatively young programming language with a focus on technical computing. While dynamic, it is designed to achieve high performance comparable to that of statically compiled languages. The execution of Julia programs is driven by a just-in-time compiler that relies on LLVM to produce efficient code at run-time. This poster highlights the recent integration of Polly into this environment, which has enabled the use of polyhedral optimization in Julia programs.

Towards a generic accelerator offloading approach: implementing OpenMP 4.5 offloading constructs in Clang and LLVM - Gheorghe-Teodor Bercea
The OpenMP 4.5 programming model enables users to run on multiple types of accelerators from a single application source code. Our goal is to integrate a high-performance implementation of OpenMP’s programming model for accelerators in the Clang/LLVM project. This poster is a snapshot of our ongoing efforts towards fully supporting the generation of code for OpenMP device offloading constructs. We have submitted several Clang patches that address some of the major issues that, in our view, prevent the adoption of a generic accelerator offloading strategy. At compiler level, we introduce a new OpenMP-enabled driver implementation which generalizes the current Clang-CUDA approach. The new driver can handle the compilation of several host and device architecture types and can be extended to other offloading programming models such as OpenACC. We developed libomptarget, a runtime library that supports execution of OpenMP 4.5 constructs on NVIDIA architectures and is extensible to other ELF-enabled devices. In this poster we describe two features of libomptarget: the mapping of data to devices and compilation of code sections for different architectures into a single binary. The aforementioned changes have been integrated locally with the Clang/LLVM repositories resulting in a fully functional OpenMP 4.5 compliant prototype. We demonstrate the robustness of our extensions and show preliminary performance results on the LULESH proxy application.

Polly as an analysis pass in LLVM - Utpal Bora
In this talk, we will introduce a new interface to use polyhedral dependence analysis of Polly in LLVM transformation passes such as Loop Vectorizer. As part of GSoC 2016, we implemented an interface to Polly, and provided new APIs that can be used as an Analysis pass within LLVM's transformation passes. We will describe our implementation and demonstrate some loop transformations using the new interface (PolyhedralInfo). Details on GSoC- http://utpalbora.com/gsoc/2016.html

Reducing the Computational Complexity of RegionInfo - Nandini Singhal
The LLVM RegionInfo pass provides a convenient abstraction to reason about independent single-entry-single-exit regions of the control flow graph. RegionInfo has proven useful in the context of Polly and the AMD GPU backend, but the quadratic complexity of RegionInfo construction due to the use of DominanceFrontier makes the use of control flow regions costly and consequently prevents the use of this convenient abstraction. In this work, we present a new approach for RegionInfo construction that replaces the use of DominanceFrontier with a clever combination of LoopInfo, DominanceInfo, and PostDominanceInfo. As these three analyses are (or will soon be) preserved by LLVM and consequently come at zero cost, and the quadratic cost of DominanceFrontier construction is avoided, the overall cost of using RegionInfo is largely reduced, which makes it practical in many more cases. Several other problems in the context of RegionInfo still need to be addressed. These include how to remove the RegionPass framework, which makes little sense in the new pass manager, how to connect Regions and Loops better, and how we can move from a built-upfront-everything analysis to an on-demand analysis which builds, step by step, only the needed regions. We hope to discuss some of these topics with the relevant people of the LLVM community as part of the poster session.

Binary Decompilation to LLVM IR - Sandeep Dasgupta
This work is about developing a binary to LLVM IR translator that generates higher-quality IR than that produced by existing tools. Such IR includes variable information, type information, and individual stack frames per procedure, which in turn facilitates many sophisticated analyses and optimizations. We are using the open source tool McSema for this purpose, and our goal is to extend the tool to 1) extract variable and type information, 2) improve the quality of the recovered IR by mitigating some of its limitations, and 3) re-construct the stack for each procedure. Currently, we have extended the McSema-recovered IR to re-construct the stack for each procedure, which in turn will help in doing variable recovery and promotion.

Dynamic Autovectorization - Joshua Cranmer
We present our ongoing work on augmenting LLVM with a dynamic autovectorizer. This tool uses dynamic information to circumvent the shortfalls of imprecise static analysis when performing loop vectorization, as well as leveraging dynamic transformations of code and memory to make autovectorization and other optimization passes more effective. The key transformations we illustrate in this poster are the extraction of hot paths in innermost loops (with a current speedup of 5% on SPEC against vanilla LLVM) and the conversion of memory from array-of-structs to a struct-of-array representation.
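The array-of-structs to struct-of-arrays conversion mentioned above can be shown schematically (this sketch only illustrates the layout change, not the dynamic memory transformation the poster describes); the contiguous per-field arrays are what make wide vector loads possible:

```python
# Schematic AoS -> SoA conversion: fields scattered across records are
# regrouped into contiguous per-field arrays, the layout SIMD loads want.

def aos_to_soa(records):
    """records: list of dicts with identical keys (array of structs).
    Returns a dict of lists (struct of arrays)."""
    if not records:
        return {}
    return {field: [r[field] for r in records] for field in records[0]}

particles = [
    {'x': 1.0, 'y': 2.0},
    {'x': 3.0, 'y': 4.0},
    {'x': 5.0, 'y': 6.0},
]
soa = aos_to_soa(particles)
print(soa['x'])  # [1.0, 3.0, 5.0] -- contiguous, hence vectorizable
```

In the AoS layout the `x` values are interleaved with `y` in memory and a vector load would gather them with strided accesses; in the SoA layout each field is a dense array.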

RV: A Unified Region Vectorizer for LLVM Simon Moll
The Region Vectorizer (RV) is a general-purpose vectorization framework for LLVM. RV provides a unified interface to vectorize code regions, such as inner and outer loops, up to whole functions. Being a vectorization framework, RV is not another vectorization pass but rather enables users to vectorize IR directly from their own code. Currently, vectorization in LLVM is performed by stand-alone optimization passes. Users who want to vectorize IR have to roll their own vectorization code or hope for the existing vectorization passes to operate as the user intends them to. Polly, for example, features a simple built-in vectorizer but also communicates with LLVM's loop vectorizer through metadata. All these vectorizers pursue the same goal: vectorizing some code region. However, their quality varies wildly and their code bases are redundant. In contrast, with RV users vectorize IR directly from their own code through a simple unified API. The current prototype is a complete re-design of the earlier Whole-Function Vectorizer by Ralf Karrenberg. Unlike the Whole-Function Vectorizer or any vectorizer in LLVM, RV operates on regions, which are a more general concept. In terms of RV, a valid region is any connected subgraph of the CFG, including loop nests. Regions make RV applicable to inner and outer loop vectorization. At the same time, RV retains the capability of its predecessor to vectorize functions into SIMD signatures. However, Whole-Function Vectorization is now only one of many possible use cases for RV. The current prototype of RV implements all stages of a full vectorization pipeline. However, users can compose these stages as they see fit, inserting and extracting IR and analysis information at any point. 

Robustness Enhancement of Baggy Bounds Accurate Checking in SAFECode - Zhengyang Liu
Baggy Bounds Accurate Checking (BBAC) is a compile-time transform and runtime hardening solution that detects out-of-bounds pointer arithmetic errors. The original version of BBAC implemented in SAFECode is not robust and efficient enough for real-world use. Our work has improved the robustness and performance of SAFECode's BBAC implementation by fixing bugs in the compile-time transform passes and runtime checking functions, as well as inlining several runtime checking functions. The latest implementation of BBAC achieves reasonable robustness and performance on various real-world applications.
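The core trick of the baggy bounds approach can be sketched as follows (a simplified model of the published technique, not SAFECode's actual runtime): allocations are padded to power-of-two slots so that an object's bounds can be recovered from a compact size table with cheap bit arithmetic:

```python
# Simplified model of baggy bounds checking: every allocation is padded
# to a power-of-two slot, and a table keyed by slot index stores
# log2(slot size). A pointer's slot base is then recoverable by masking.

SLOT = 16                      # granularity of the bounds table
table = {}                     # slot index -> log2(padded allocation size)

def baggy_alloc(addr, size):
    """Record an allocation at `addr` (assumed suitably aligned),
    padded to the next power of two."""
    slot_size = 1
    while slot_size < max(size, SLOT):
        slot_size *= 2         # pad to the next power of two
    for s in range(addr // SLOT, (addr + slot_size) // SLOT):
        table[s] = slot_size.bit_length() - 1
    return addr

def in_bounds(ptr, derived):
    """Check that a derived pointer stays inside ptr's padded allocation."""
    log_size = table[ptr // SLOT]
    base = ptr & ~((1 << log_size) - 1)   # masking recovers the slot base
    return base <= derived < base + (1 << log_size)

p = baggy_alloc(0x1000, 20)    # 20 bytes, padded to a 32-byte slot
print(in_bounds(p, p + 31))    # True  (still inside the padding)
print(in_bounds(p, p + 32))    # False (past the slot)
```

The check is "baggy" because pointers into the padding are accepted; in exchange, the bounds lookup is a single table read plus a mask instead of a search through allocation metadata.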

Extending Clang C++ Modules System Stability - Bianca-Cristina Cristescu
The current state of the Module System, although fairly stable, has a few bugs for C++ support. Currently, the method for ensuring no regressions is a buildbot for libc++, which builds llvm in modules self-hosted mode. Its main purpose is to find bugs in clang’s implementation and ensure no regression for the ongoing development. We propose a flow for finding bugs, submitting them alongside with a minimal reproducer to the Clang community, and subsequently proposing a fix for the issue which can be reviewed by the community. The poster will emphasise, besides the common complexity of minimising an issue, a comparison between the labour required with and without the methodology proposed.


Speakers

Friday November 4, 2016 3:45pm - 4:45pm
0 - Lobby Area

4:45pm

Polly Loop Optimization and Accelerator Compilation
Loop Optimization is important for high-performance computing but even
more for fast image processing, machine learning, and accelerator
programming. Over the last year the Polly loop optimization framework
has significantly evolved, with new support for data-layout
transformations, optimization of dense linear algebra kernels, and fully
automatic accelerator mapping support. Many of these transformations
have been contributed by developers all over the world, including three
summer of code students. This BoF serves as a place for core developers
to gather, to discuss the current status of Polly, and to shape the 2016
development agenda of Polly. Hot topics are likely new automatic GPGPU
code generation facilities, recent improvements on correctness and
compile time, the new outer loop vectorization, and the recent addition
of @polly support in Julia. The Polly code base also relies heavily on
scalar evolution, value range analysis, and can serve as a basis
for performance, memory footprint, and data transfer modeling. We invite
Polly developers and all other interested developers.


Speakers
TG

Tobias Grosser

ETH Zurich


Friday November 4, 2016 4:45pm - 5:30pm
3 - BoF (Rm LL21CD)

4:45pm

Toy programming demo of a repository for statically compiled programs
This talk will present a proof of concept of an approach which improves compile and link times by replacing the conventional use of object files with an incrementally updated repository, without requiring changes to existing build infrastructure. It aims to present the idea at a high level using a live demo of some trivial tools and to initiate discussion of a real implementation within the LLVM framework. 

Speakers
avatar for Paul Bowen-Huggett

Paul Bowen-Huggett

SN Systems Ltd.


Friday November 4, 2016 4:45pm - 5:30pm
1 - General Session (Rm LL20ABC)

4:45pm

Using LLVM to guarantee program integrity
There are many embedded systems on which we rely heavily in our day-to-day lives, and it is crucial to ensure that these systems are as robust as possible. To this end, it is important to have strong guarantees about the integrity of running code. Achieving this naturally requires close integration between hardware features and compiler toolchain support for these features. 

To achieve this, an NXP architecture uses hardware signing to protect the integrity of a program's control flow against modification. Each instruction's interpretation depends on the preceding instruction in the execution flow (and hence the sequence of all preceding instructions). Basic blocks require a “correction value” to bring the system into a consistent state when arriving from different predecessors. Compiler support is needed so that compiled code can receive the benefits of this feature. 

Over the past year we have implemented the infrastructure for this feature which can be enabled on a per-function level in LLVM, for functions written in both C and/or assembly. In this talk we will present this system, and show how it enforces control flow integrity. 

We will explain how we have extended our target’s backend with a pass that produces metadata describing a system’s control flow. This allows branches and calls to be resolved with appropriate correction values. A particular challenge is dealing with function pointers and hence indirect transfers of control. We will also describe the implementation of user attributes to support such functionality in Clang. 

The encoding of each instruction, and the correction values, cannot be finally determined until the final program is linked. Using the metadata generated by LLVM, we can recreate the control flow graph for the entire program. From this, each instruction can be signed, and the correction values for each basic block inserted into the binary. 

We will finish with a demonstration of this system in action.
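A toy model of the correction-value idea described above (entirely schematic; the actual NXP signing scheme is hardware-defined and not public in this form): the signing state evolves as a running function of executed instructions, and each predecessor of a merge point applies a correction so that all paths arrive in the same agreed state:

```python
# Schematic model of control-flow signing with correction values.
# The state evolves as an XOR over instruction words (a stand-in for
# the hardware signing step); a per-predecessor correction value
# brings every incoming path to one agreed state at a merge point.

def run_block(state, instrs):
    for word in instrs:
        state ^= word          # toy signing update per instruction
    return state

then_blk, else_blk = [0x11, 0x22], [0x33]
target_state = 0xAA            # state the merge block expects on entry

# The correction value for each predecessor edge is whatever XORs its
# final state into the agreed target state -- computable at link time
# once the whole control flow graph is known.
start = 0
corr_then = run_block(start, then_blk) ^ target_state
corr_else = run_block(start, else_blk) ^ target_state

for blk, corr in [(then_blk, corr_then), (else_blk, corr_else)]:
    state = run_block(start, blk) ^ corr
    assert state == target_state
print("both paths converge to", hex(target_state))
```

This also illustrates why the correction values can only be fixed at link time, as the abstract notes: they depend on the final instruction sequences of every predecessor block.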

Speakers
avatar for Simon Cook

Simon Cook

Compiler Engineer, Embecosm


Friday November 4, 2016 4:45pm - 5:30pm
2 - Technical Talk (Rm LL21AB)

5:30pm

Closing
Speakers
avatar for Tanya Lattner

Tanya Lattner

President, LLVM Foundation


Friday November 4, 2016 5:30pm - 5:45pm
1 - General Session (Rm LL20ABC)