Wednesday, August 20, 2014

Devirtualization in C++, part 6 - asking user for help

In the previous posts of this series I described most of the new GCC devirtualization machinery. To cut a long story short, GCC implements two kinds of devirtualization: full and speculative.

Full devirtualization replaces a given polymorphic call by a direct call (which can get inlined later). Speculative devirtualization is a weaker form that turns a polymorphic call into a conditional testing whether the target is the predicted one, taking the direct call path if so and performing the indirect call otherwise.

While performing speculative devirtualization the compiler can play somewhat unsafe and make assumptions that do not hold in 100% of cases. The main assumption is that no new derived types are introduced by code not visible to the compiler. With link-time optimization this is usually true, but not necessarily so: new derived types can be introduced by libraries or plugins. While the compiler can generally attempt to track which instances cannot be introduced by external code, this analysis is difficult, and it is an open question whether it can be implemented with reasonable precision within a statically optimizing compiler.

The following example shows speculative devirtualization:
struct A {virtual void foo() {}};
void test (struct A *a)
{
  a->foo();
}
Here GCC 4.9 produces:
test(A*):
        movq    (%rdi), %rax
        movq    (%rax), %rax
        cmpq    $A::foo(), %rax
        jne     .L5
        ret
.L5:
        jmp     *%rax
Instead of:
test(A*):
        .cfi_startproc
        movq    (%rdi), %rax
        movq    (%rax), %rax
        jmp     *%rax
If the target of the polymorphic call happens to be A::foo(), the speculatively devirtualized code is significantly faster, saving the call overhead (and enabling further optimizations).
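
At the source level, the transformation can be roughly approximated by hand. The following is only a sketch under stated assumptions, not what GCC does internally: GCC compares the function address loaded from the virtual table (the cmpq in the listing above), while this sketch tests the dynamic type with typeid, so a derived type that does not override foo() would take the slow path here. The type PluginB and the functions test_virtual and test_speculative are hypothetical names introduced for illustration, and foo() returns an int so the chosen path is observable.

```cpp
#include <typeinfo>

struct A {
    virtual ~A() = default;
    virtual int foo() { return 1; }
};

// Hypothetical derived type, standing in for one introduced by a
// library or plugin that the compiler could not see.
struct PluginB : A {
    int foo() override { return 2; }
};

// Ordinary polymorphic call: always an indirect call through the vtable.
int test_virtual(A *a) {
    return a->foo();
}

// Hand-written approximation of speculative devirtualization.
// Assumption: the predicted target is A::foo().
int test_speculative(A *a) {
    if (typeid(*a) == typeid(A))
        return a->A::foo();  // qualified call: direct, can be inlined
    return a->foo();         // prediction failed: fall back to the indirect call
}
```

Both functions return the same result for every object. Only the cost differs: an A object takes the direct path, while a PluginB object fails the test and goes through the virtual table as before.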