GSoC 2023 Week 17 and 18 - Same Old, Same Old

This blog post is related to my GSoC 2023 Project.

Over these two weeks, I worked on finding some more miscellaneous optimization patches and also fixed #88580.

Regarding #88580, the fix was quite simple. The problem was that SROA was creating too many partitions for a single alloca. This meant that one alloca could end up generating tens of thousands of loads or stores, which in turn caused a compile-time explosion in passes like DeadStoreElimination and MemCpyOpt.

SROA takes an alloca instruction as its input and tries to split it into multiple (hopefully smaller) allocas. The terminology is that an alloca is split into “slices”, and these slices further consist of “partitions”. From what I understand, a partition is basically an indivisible unit of the alloca, for example a single (i.e. scalar) element.
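To make the idea a bit more concrete, here is a rough source-level analogy. SROA actually operates on LLVM IR rather than on C++ source, so this is only a sketch of the effect, and the names `Pair`, `sumBefore`, and `sumAfter` are made up for illustration.

```cpp
// Conceptual illustration only: SROA rewrites LLVM IR, not C++ source.
struct Pair {
    int first;
    int second;
};

// "Before": without optimization, a local aggregate like `p` would
// typically live in a single stack slot (one alloca covering both fields).
int sumBefore(int a, int b) {
    Pair p;
    p.first = a;
    p.second = b;
    return p.first + p.second;
}

// "After": the aggregate is replaced by independent scalars, which later
// passes can keep in registers instead of loading and storing to memory.
int sumAfter(int a, int b) {
    int pFirst = a;
    int pSecond = b;
    return pFirst + pSecond;
}
```

Both functions compute the same result; the second form is just much easier for the rest of the optimizer to work with.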

So, to fix this, I just added a CLI option that limits the number of slices a single alloca is allowed to generate. (Counting partitions would be a more accurate measurement, but it was harder to calculate and not really worth the effort.) This change landed here.
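The shape of such a limit can be sketched as a simple bail-out check. This is not the actual LLVM code: `Slice`, `kMaxSlicesPerAlloca`, and `shouldSplitAlloca` are hypothetical names, and the default value below is an arbitrary placeholder rather than the one used by the real option.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for one slice of an alloca, covering bytes [begin, end).
struct Slice {
    std::size_t begin;
    std::size_t end;
};

// Placeholder default; in a real pass this would come from a command-line option.
constexpr std::size_t kMaxSlicesPerAlloca = 1024;

// Returns true if the alloca has few enough slices to be worth splitting.
// Above the limit, the pass would leave the alloca untouched rather than
// generate a huge number of loads and stores for later passes to process.
bool shouldSplitAlloca(const std::vector<Slice>& slices,
                       std::size_t maxSlices = kMaxSlicesPerAlloca) {
    assert(maxSlices > 0 && "a limit of zero would disable splitting entirely");
    return slices.size() <= maxSlices;
}
```

Counting slices is cheap because the list already exists by the time the check runs, which is the trade-off described above: a slightly less accurate measure in exchange for a trivial check.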

After this, I have gone back to trying to find more performance improvements. I have found some more, though progress has been really slow:

The InferAlignment patches should land soon; only the review for the last patch is left now. In the meantime, I will continue trying to find perf patches like these to hopefully get a cumulative speedup of 0.6% on -O3, which should (again, hopefully) get my project to a 3% overall speedup. I am also working on my final report at this point, and it has been very fun to go all-out designer mode on what is basically a Word document :)