Andrey Vul wrote: > There is also a branch-free way, but it requires limits.h
> (q + r + ((q + r) & INT_MAX))>>1
I don't see that requiring limits.h is a problem. However, this solution can overflow which is undefined behaviour (i.e. anything can happen) and implementation defined behaviour if q and r are negative, so there is no guarantee it will work. -- Flash Gordon
>There is also a branch-free way, but it requires limits.h
>(q + r + ((q + r) & INT_MAX))>>1
Plus it suffers if the addition overflows.
I like the conditional variant because it takes me back to the good old days, where -1 was true in C64 BASIC. :)
Now, to go OT:
Some timings, FWIW, with gcc 4.4.2 -O2, AMD Athlon64, 100 million iterations (calculate the total of the maximum of two pseudo-random numbers), runtime affected by the loading whims of my PC:
code runtime ------------------------------------------ ------- #define MYMAX1(a,b) ((a)<(b)?(b):(a)) 3.311 if (a < b) return b; else return a; 3.340 return ((a>b)*a)+((a<=b)*b); 3.328 return (a + b + ((a + b) & INT_MAX))>>1; 3.339
The first two generated identical assembly, unsurprisingly. They make use of the "conditional move" instruction, so the assembly for computing the max is basically:
CMP a,b CMOV a,b
None of the assembly produced in any of the four tests had any explicit branches--it was all compares and conditional instructions.
I think the only lesson to be drawn from the timings is, "Sometimes optimizations don't do what you think they'll do." When it comes to little things like this, I'm a big fan of just coding it straight and then optimizing if its required.
Perhaps on a RISCier machine (one without Intel's DOIT instruction ;) ), the differences would be... existent.
Andrey Vul <andrey....@gmail.com> writes: > Here is a ternary-free way of implementing max(q,r) > int max(int q,int r) {return ((q>r)*q)+((q<=r)*r) ;}
Your C expression is more difficult for me to understand than (q > r) ? q : r, and I doubt the generated code is faster either. The ?: version also works with pointers (to elements of the same array), which cannot be multiplied.
This kind of Boolean*integer multiplication made some sense in old BASIC programs, but I don't see a purpose for it in C, where we have ?: available. If you were implementing your own C compiler and hadn't implemented ?: in the parser yet, I think it would be rather simple to implement max(q,r) as a built-in function. (A standard-conforming compiler would have to call it e.g. __max but a compiler without ?: wouldn't conform anyway.)
On Sat, 7 Nov 2009 14:46:15 -0800 (PST), Andrey Vul
<andrey....@gmail.com> wrote: > On Nov 7, 5:31 pm, Eric Sosman <esos...@ieee-dot-org.invalid> wrote: > > Andrey Vul wrote: > > > There is also a branch-free way, but it requires limits.h
> > > (q + r + ((q - r) & INT_MAX))>>1
> > Obviously broken on overflow, but also broken on some > > perfectly ordinary cases. q==1, r==-1, for example. > Sorry, mixed + and -.
Only on sign&magnitude machines, which no longer exist in any practical sense. Still broken on everything else. See below.
> Is there a non-overflowing way to do abs() as a one-liner?
> Branch-free is from math class: > max(a,b) = $ \{ a + b + | a - b|} \over {2} $
With abs this works. But there is no standard computational (conceptually branchless) operation that directly does abs.
What you can do is a + (a<b)*(b-a), and on *most* 2sC machines, although not required by the standard, in the absence of overflow this can be done by: a + -((a-b)>>INTbits) * (b-a) or slightly 'simpler': a + ( (a-b)>>INTbits & (b-a) ) where usually INTbits is sizeof(INT)*CHAR_BIT-1 (unless padding bits are used, rarely if ever).
For abs start with a*( 1-(a<0)*2 ) and proceed similarly.
That said, I concur with the sentiment elsethread to just write the ternary operator and let the compiler generate the code; if it screws up such a basic thing, you need a better compiler anyway.