1. Home
2. Questions
3. AI Assist
4. Tags
6. Challenges
7. Chat
8. Articles
9. Users
11. Companies
12. Collectives
13. Communities for your favorite technologies. Explore all Collectives
Stack Internal

Stack Overflow for Teams is now called Stack Internal. Bring the best of human thought and AI automation together at your work.
Try for free Learn more
Stack Internal
Bring the best of human thought and AI automation together at your work. Learn more

Timeline for answer to Near constant time rotate that does not violate the standards by Mark B

Current License: CC BY-SA 3.0

Post Revisions

11 events

when toggle format	what		by	license	comment
Aug 17, 2015 at 20:47	comment	added	Peter Cordes		This generates nice asm on clang 3.5 and later, but not on gcc (even 5.2) See my answer on this question, or a shorter version on stackoverflow.com/questions/776508/…. It's always branchless, but it's a sequence of several instructions. Godbolt: godbolt.org/g/wJBthc
Jul 20, 2015 at 15:42	comment	added	supercat		...best-case time of the shift routine. It's possible the multi-word (e.g. `long long`) shift routines used by some compilers would need tweaking, but an implementation of `x>>y` compliant with the above requirement would be cheaper than an implementation of `x>>(y-1)>>1` which had to comply with C standard behavior when `y` was precisely equal to the bit size of `x`.
Jul 20, 2015 at 15:38	comment	added	supercat		@ThomasMatthews: What would make things easier to read would be to amend the C standard to specify that `x>>y`, when `y` is precisely equal to the bit size of `x`, must yield either `x>>(y-1)>>1` or `x`, chosen in arbitrary fashion. The vast majority of non-hyper-modern C platforms would naturally comply with such a standard; the only ones I know of where compliance would add cost are either those which use a computed jump into a routine that does repeated one-byte-shifts (for platforms without an n-bit shift instruction); the additional cost there would be slight compared with even the...
Jul 18, 2015 at 23:08	vote	accept	jww
Oct 7, 2015 at 20:06
Jul 13, 2015 at 16:27	comment	added	jww		@MikeMB - I've got other code that stashes it away in a `static const`.
Jul 13, 2015 at 16:25	comment	added	MikeMB		Just for readability I'd store `sizeof(T)*8` in a local variable.
Jul 13, 2015 at 16:25	comment	added	jww		"... in conjunction with branch predictors..." - I think that assumes the branch predictors work as {expected\|desired}. That may not be the case on low end processors and boards. Plus, the guys advancing these attacks are very clever, so I want to avoid giving them a toehold.
Jul 13, 2015 at 16:18	comment	added	dyp		I'm not 100% sure if it's well-defined, but I think you can simplify that to `-y % (sizeof(T)*8)`. It seems to be better optimized by clang++ and g++
Jul 13, 2015 at 16:05	comment	added	Lightness Races in Orbit		Yeah but portability!
Jul 13, 2015 at 16:03	comment	added	Thomas Matthews		This is where I believe a simple inline assembly instruction would be easier to read. :-)
Jul 13, 2015 at 16:02	history	answered	Mark B	CC BY-SA 3.0