GSoC 2023 Week 4 - Experiments? Yes. Results? No.

This blog post is related to my GSoC 2023 Project.

This week, I had mainly spent my time trying out various little experiments to see if any of them made a difference on performance. Even though I got no real results out of it, I was trying to get more accomodated with the codebase and I learned a lot from it!

I also spent time on issue #33910, and I was able to get the code to run 50% faster just with one word and one comma added!

Here are the various experiments that I tried:

To be honest, these experiments were really quite helpful to me. While quite demoralizing, they helped me better understand how the LLVM data structures work and also how to use KCacheGrind more properly in terms of analyzing the emitted machine code and the source code.

These experiments also helped me run an experiment which gave unexpectedly good results, this I will cover in the post for next week.

Regarding issue #33910, the reason for the isKeyOf operation being so slow was not because of the ordering of the comparisons within it but instead because of the number of times it was called. I should have expected that some &&s and ==s would not have had a meaningful difference on performance, and instead it would’ve been a much better idea to reduce the number of times that function is called.

This was something I realized when I actually dove into the issue and measured the various equality operators and how often they were inequal. This gave a very interesting result: the comparisons were pretty much dominated by differences between the OffsetInBits metadata, and it was about 60x more likely to be inequal on average:

Times each attribute was inequal

Note that this is not even the full data generated, I stopped it when it had roughly as many rows as there are OffsetInBits inequalities, because at that point I had 1.2GiB of data and not enough RAM to load it all at once.

When I brought this up with my mentor, they told me that this is probably the cause of the quadratic behaviour, as the getHashValue function for DIDerivedType did not take into account the OffsetInBits parameter. The reason for so many inequalities was because the file provided by the issue (at least for the small case) has pretty much 20,000 lines of metadata with exactly two unique names: Va and Vb, with only the OffsetInBits differing between them. This hit the corner case of not including that value in the hash function, which caused an explosion in the collision chains and thus caused the compile time explosion.

As expected, changing the hash function to take OffsetInBits into account led to a huge reduction in compile time, from roughly 8 seconds to 4.3 seconds. This pretty much erased isKeyOf from the flamegraph, and the only bottleneck now is ReplaceableMetadataImpl::replaceAllUsesWith, which requires uniquing all metadata all over again each time it is called. This is extremely slow, and it makes sense to only unique things after the whole file is parsed.

As per my GSoC proposal, I was actually supposed to work on this during the midterm eval, so I will be pushing fixing this issue back while I focus on non-edge cases. I think the solution is simple enough that postponing it will be fine.