GSoC 2023 Week 9 - A Saturating Arithmetic And ICmp Fold

This blog post is related to my GSoC 2023 Project.

This week, I worked on a couple of optimizations in InstCombine: folds of the form u[add|sub]_sat(X, C) pred C2, where both C and C2 are constants. This fold takes on the general formula:

   u[add|sub]_sat(X, C) Pred C2
=> (OpWillWrap ? SaturatingValue : (X Binop C)) Pred C2
=>  OpWillWrap ? (SaturatingValue Pred C2) : ((X Binop C) Pred C2)

When the value of SaturatingValue Pred C2 is true, we get the following fold:

   OpWillWrap ?  true : ((X Binop C) Pred C2)
=> OpWillWrap || ((X Binop C) Pred C2

When the value is false, we get the following fold:

    OpWillWrap ?  false : ((X Binop C) Pred C2)
=> !OpWillWrap ?  ((X Binop C) Pred C2 : false
=> !OpWillWrap && ((X Binop C) Pred C2

It is usually possible to combine these two checks into just one comparison. For the two intrinsics currently implemented, the OpWillWrap checks look like the following:

While the usub_sat check is quite obvious, I found the uadd_sat one is not, so here is a little derivation:

   (X + C >= UINT_MAX)
=> (X     >= UINT_MAX - C)
=> (X     >= (UINT_MAX + ~C + 1)) [as -C == (~C + 1)]
=> (X     >= (~C + 0))            [as (UINT_MAX + 1) == 0]
=> (X     >= ~C)

This is a really neat way to check the condition without causing overflow through the addition.

It is possible to combine the value ranges for the two operands above, to yield just one icmp for the overall expression. This can be done by noting the following:

Thus, I created the following two patches for the fold:

The reason that these patches took so long to make is because it took me quite a while to understand the semantics of the fold itself. What helped me a lot was working out the simplifications on paper, then transferring it into code. I got really stuck on the || vs. && simplification, and then generalizing it to have the same general formula for both also took some time. Overall, the patch for usub_sat took 5 days, whereas the one for uadd_sat took 3 days. However, I learnt a fair bit about writing proofs on Alive2 from this, and also learnt that its probably a good idea to mathematically verify that a transform actually works before transferring it into code and just chucking it on phabricator for review :).