GSoC 2023 Week 6 - RegAlloc Experiments

23 Jun 2023

This blog post is related to my GSoC 2023 Project.

This week, I spent most of my time just analyzing code. I was digging mainly through profiles of RegAllocGreedy, and I was trying to find areas to improve. While I wasn’t able to do much in general, I was able to get one useful patch out of it: https://reviews.llvm.org/D152781.

While going through the profiles, I noticed that a fair bit of time was being spent in the function BlockFrequency::operator+=. I initially tried to see if I could optimize it, but there was not much to be done there. Then I noticed that it wasn’t being inlined, so I asked around and got to know that LLVM in general is not built using LTO. As this function is called quite a lot, such a simple thing not being inlined led to an unnecessary overhead. By moving this method into the header so that it could be inlined everywhere, a 0.1% improvement was gained.

I also tried a few other experiments, which led to no useful gains:

Trying to speed up BitVector::reference::operator= - I noticed that this function was using a branch where it wasn’t necessary, however its probably not used often enough to matter.
Replacing array_pod_sort with std::sort - array_pod_sort is used in the LLVM codebase as it is not a template and instead uses qsort as its sorting algorithm. This gives a good binary size reduction as it reduces bloat caused by template instantiations. However, std::sort is a fair bit faster, and it gave a meaningful improvement in this case but not enough to warrant a patch.
Changing a couple of container sizes - This is my favourite kind of experiment, cause it is super easy and usually gives results. In this case, making the containers’ small sizes just a bit bigger gives a slightly measurable result though it is actually not really distinguishable from noise.
Adding LLVM_LIKELY and LLVM_UNLIKELY - I didn’t expect this to have any real results, and it didn’t. I noticed that a fair number of jumps were being taken when the condition was false on these conditions, which could have led to an issue with jumping to instructions which are out of cache. However, there is no such issue in reality because the jumps are likely too small to matter, so there was no measurable difference.

I also looked for a long time into RegAllocFast, however it is a very small file and very well optimized, so I couldn’t find anything to optimize in there. Similarly, FastISel also didn’t seem to have too much to optimize in it either, though I did certainly try :).

dc03

GSoC 2023 Week 6 - RegAlloc Experiments

Related Posts

GSoC 2023 Week 19 and 20 - That's All, Folks! 29 Sep 2023

GSoC 2023 Week 17 and 18 - Same Old, Same Old 10 Sep 2023

GSoC 2023 Week 15 and 16 29 Aug 2023

GSoC 2023 Week 19 and 20 - That's All, Folks!
29 Sep 2023

GSoC 2023 Week 17 and 18 - Same Old, Same Old
10 Sep 2023

GSoC 2023 Week 15 and 16
29 Aug 2023