GSoC 2023 Week 13 and 14 - Small Experiments

This blog post is related to my GSoC 2023 Project.

These two weeks, I worked on some random experiments, of which 3 had useful results. These were mostly done on the branches perf/instcombine4 and perf/instcombine5. While the commits on both branches are functionally the same, the perf/instcombine5 branch (naturally) contains the better / more updated code which is far more likely to be made into patches.

On the perf/instcombine4 branch, I made the following commits:

On the perf/instcombine5 branch, I made a few changes to the last commit on feedback from my mentor, that being the creation of a new class which would lazily compute the KnownBits information instead of piping it through the various functions so that no new overloads would be needed. This class ended up taking on the name CachedBitsValue, which is still a work in progress. It is a fairly simple implementation, which stores a Value * and the KnownBits information associated with it. This information is computed as-required by calling getKnownBits, and the computed value is cached within the class.

Unfortunately, to allow implicit creation from a Value *, any parameter that accepted either a Value * or a const Value * needs to be updated to use a const CachedBitsConstValue & or a const CachedBitsNonConstValue &, to allow implicit creation of a temporary. What this means is that both the KnownBits member and the boolean member that signifies its presence need to be made mutable so that they can be modified from const objects. This is quite an ugly solution, but it comes from the fact that C++ doesn’t allow you to implicitly create an object that binds to a non-const l-value reference.

With this in place however, the code for passing through KnownBits information becomes quite clean. This is shown by the first commit below, where both of these commits are on the perf/instcombine5 branch:

I also tried reducing the number of computeKnownBits calls in simplifyICmpInst. I noticed that visitICmpInst was calling foldICmpUsingKnownBits, which computes known bits information. Before that however, simplifyICmpInst calls simplifyICmpWithZero which ends up calling computeKnownBits a few times, a redundant call which can be handled by reusing the information computed before in foldICmpUsingKnownBits. This is incidentally where the change to simplifyICmpWithZero comes from in the above commit using CachedBitsValue, however it unfortunately ended up not giving a measurable improvement at all. It also led to three test regressions, which led to me abandoning the patch.