Timeline for answer to Near constant time rotate that does not violate the standards by Mark B
Current License: CC BY-SA 3.0
Post Revisions
11 events
| when toggle format | what | by | license | comment | |
|---|---|---|---|---|---|
| Aug 17, 2015 at 20:47 | comment | added | Peter Cordes | This generates nice asm on clang 3.5 and later, but not on gcc (even 5.2) See my answer on this question, or a shorter version on stackoverflow.com/questions/776508/…. It's always branchless, but it's a sequence of several instructions. Godbolt: godbolt.org/g/wJBthc | |
| Jul 20, 2015 at 15:42 | comment | added | supercat |
...*best-case* time of the shift routine. It's possible the multi-word (e.g. long long) shift routines used by some compilers would need tweaking, but an implementation of x>>y compliant with the above requirement would be cheaper than an implementation of x>>(y-1)>>1 which had to comply with C standard behavior when y was precisely equal to the bit size of x.
|
|
| Jul 20, 2015 at 15:38 | comment | added | supercat |
@ThomasMatthews: What would make things easier to read would be to amend the C standard to specify that x>>y, when y is precisely equal to the bit size of x, must yield either x>>(y-1)>>1 or x, chosen in arbitrary fashion. The vast majority of non-hyper-modern C platforms would naturally comply with such a standard; the only ones I know of where compliance would add cost are either those which use a computed jump into a routine that does repeated one-byte-shifts (for platforms without an n-bit shift instruction); the additional cost there would be slight compared with even the...
|
|
| Jul 18, 2015 at 23:08 | vote | accept | jww | ||
| Oct 7, 2015 at 20:06 | |||||
| Jul 13, 2015 at 16:27 | comment | added | jww |
@MikeMB - I've got other code that stashes it away in a static const.
|
|
| Jul 13, 2015 at 16:25 | comment | added | MikeMB |
Just for readability I'd store sizeof(T)*8 in a local variable.
|
|
| Jul 13, 2015 at 16:25 | comment | added | jww | "... in conjunction with branch predictors..." - I think that assumes the branch predictors work as {expected|desired}. That may not be the case on low end processors and boards. Plus, the guys advancing these attacks are very clever, so I want to avoid giving them a toehold. | |
| Jul 13, 2015 at 16:18 | comment | added | dyp |
I'm not 100% sure if it's well-defined, but I think you can simplify that to -y % (sizeof(T)*8). It seems to be better optimized by clang++ and g++
|
|
| Jul 13, 2015 at 16:05 | comment | added | Lightness Races in Orbit | Yeah but portability! | |
| Jul 13, 2015 at 16:03 | comment | added | Thomas Matthews | This is where I believe a simple inline assembly instruction would be easier to read. :-) | |
| Jul 13, 2015 at 16:02 | history | answered | Mark B | CC BY-SA 3.0 |